Kaseya

IT management and cybersecurity software

Site Reliability Engineer

CA$115–130k

Markhams, California, United States

Market Sentiment

HIGH DEMAND

Neural analysis suggests this role is optimal for Mid candidates.

The Brief

“Site Reliability Engineer at Kaseya. Skills: Site Reliability Engineering (SRE), AWS, Infrastructure as Code (IaC), SLOs, SLIs, Error budgets, Incident Response, Observability, Automation. Set, monitor, and enforce SLOs, SLIs, and error budgets. Lead incident response, troubleshooting, and blameless postmortems”

Industry & Context.

IT management and cybersecurity software

Problems you'll solve

Troubleshooting; Engineering solutions that scale

Eligibility Requirements

Active on call rotation

What They're Looking For.

Must Have

4 to 5 years of AWS production experience

IaC ownership with Terraform or CloudFormation, including state management

AWS ECS production experience (or Kubernetes background willing to ramp)

Active on call rotation with incidents led and postmortems written

Working fluency with SLOs, SLIs, and error budgets in production

Nice to Have

Kubernetes production experience

Broader observability tooling (Datadog, Dynatrace, CloudWatch, Elasticsearch/Kibana)

Chaos engineering

AWS Lambda or serverless workloads

Ansible, Chef, or Puppet

DevSecOps work (vulnerability scanning, secrets management, SOC2 or ISO 27001)

Production database support (RDS, PostgreSQL, MySQL)

Open source contributions or public technical portfolio

What You'll Do.

Lead incident response and blameless postmortems

Build and maintain automated deployment, configuration management, and infrastructure provisioning

Manage cloud and hybrid infrastructure with Terraform or CloudFormation

Improve observability across systems through proactive monitoring

Partner with development teams to bake reliability into the SDLC

Cut operational toil through automation, systems that recover themselves, and engineering solutions that scale

Support containerized and serverless workloads and observability practices

How You'll Work.

Team & Collaboration

Partner with development teams to bake reliability into the SDLC

Full Job Description

About Kaseya

Kaseya is the leading provider of AI-powered IT management and cybersecurity software, serving Managed Service Providers (MSPs) and internal IT organizations worldwide. Our comprehensive platform helps organizations efficiently manage, secure, and automate their IT environments, driving operational efficiency and long-term business success.

Backed by Insight Partners, a leading global software investor, Kaseya has experienced sustained double-digit growth and continues to expand its global footprint. Today, Kaseya supports customers in more than 20 countries and manages over 15 million endpoints worldwide.

Founded in 2000, Kaseya has built a culture centered around innovation, accountability, and results. We are a high-growth, high-performance organization that values individuals who are driven, adaptable, and committed to delivering exceptional outcomes for our customers and teammates alike. At Kaseya, success comes from embracing challenges, moving with urgency, and continuously raising the bar.

Kaseya is hiring a Site Reliability Engineer to keep our production systems healthy as we scale. You'll own the reliability of services that thousands of MSPs depend on every day. That means defining the SLOs we hold ourselves to, leading incidents when they happen, and building the automation that keeps things stable as we ship. The work is hands on, the on call rotation is real, and the environment runs heavily on AWS. If you treat reliability as a product instead of a chore, you'll fit in well here.

What You'll Do

Set, monitor, and enforce SLOs, SLIs, and error budgets that keep our systems reliable

Lead incident response, troubleshooting, and blameless postmortems that produce real fixes

Build and maintain automated deployment, configuration management, and infrastructure provisioning using Infrastructure as Code

Manage cloud and hybrid infrastructure with Terraform or CloudFormation, balancing cost, scalability, and resilience

Improve observability across

SKILL SIGNAL 90 detected · ranked by frequency

SLOs ×5

SLIs ×5

Chaos engineering ×5

Ansible ×5

Chef ×5

Puppet ×5

RDS ×5

PostgreSQL ×5

MySQL ×5

AWS ×4

Observability ×4

AWS Lambda ×4

Serverless workloads ×4

Vulnerability scanning ×4

Secrets management ×4

Site Reliability Engineering (SRE) ×3

Infrastructure as Code (IaC) ×3

Error budgets ×3

Incident Response ×3

AWS production experience ×3

IaC ownership with Terraform or CloudFormation ×3

State management ×3

AWS ECS production experience ×3

Kubernetes background ×3

On call rotation ×3

Incident leadership ×3

Postmortem writing ×3

Error budgets in production ×3

Kubernetes production experience ×3

Observability tooling ×3

DevSecOps work ×3

SOC2 compliance ×3

BEHAVIOURAL

Accountability; Adaptable; Driven; Committed to delivering exceptional outcomes; Embracing challenges; Moving with urgency; Continuously raising the bar

Role Details

Experience: 4–5 yrs

Level: Mid

Work Mode: Onsite

Category: infrastructure

Salary Band: 100k-150k

AI-Extracted Insights

Domain Areas

it-management; cybersecurity-software; managed-service-providers-msps; internal-it-organizations
