Veeam Software

Data and AI Trust

Staff Site Reliability Engineer

India Remote Friendly

The Brief

“Staff Site Reliability Engineer at Veeam Software. Skills: Site Reliability Engineering, Distributed Systems, Public Cloud (Azure preferred), Observability, Infrastructure Automation, Container Orchestration (Kubernetes). Serve as a hands-on technical leader within the SRE team, guiding senior engineers.”

What You'll Achieve.

Ensure the systems we operate are built to be reliable, scalable, and observable from the ground up; scale SRE principles globally; ensure production readiness.

Industry & Context.

Data and AI Trust

What They're Looking For.

Must Have

  • 8+ years of experience in a Software Engineering or SRE role
  • Technical leadership experience
  • Demonstrated experience mentoring and guiding senior engineers
  • Deep expertise in building distributed systems on public cloud (Azure preferred)
  • Programming skills (e.g., JS, Go, TypeScript, Java, or C#)
  • Hands-on experience with observability tooling (e.g., Prometheus, Grafana, OpenTelemetry)
  • Mastery of infrastructure automation tools (Terraform, Pulumi) and container orchestration (Kubernetes)

Nice to Have

  • Experience leading SRE initiatives across multiple product teams
  • Background in chaos engineering, incident learning, or performance and load testing
  • Familiarity with global compliance standards (ISO, SOC 2, GDPR, FedRAMP, CMMC)

What You'll Do.

  • Serve as a hands-on technical leader within the SRE team, guiding senior engineers and influencing product development teams
  • Ensure the systems we operate are built to be reliable, scalable, and observable from the ground up
  • Drive strategic initiatives and mentor others in the practice of SRE
  • Help define architectural best practices across our platform
  • Align teams, enforce high standards, and scale SRE principles globally
  • Drive adherence across engineering teams
  • Partner with development and product teams to proactively design for failure, build resilient architecture, and operationalize reliability from the start
  • Drive company-wide adoption of observability best practices and tooling
  • Ensure metrics, logs, and traces provide deep, actionable insights across systems
  • Lead complex incident responses, postmortems, and systemic reliability improvements
  • Promote and enforce a blameless culture of learning and continuous improvement
  • Lead initiatives in infrastructure as code, deployment automation, and resilience testing
  • Influence the development and adoption of chaos engineering practices and release validation frameworks
  • Partner with platform and security teams to ensure production readiness

How You'll Work.

Team & Collaboration

Collaborate with Staff peers across teams to align strategy and champion shared reliability standards and goals; Partner with development and product teams; Work closely with your peer Staff Engineers to plan, align, and deliver against reliability goals; Represent the SRE team in technical leadership forums and product planning discussions

Communication Scope

Ability to communicate clearly across geographies and disciplines

Full Job Description

Veeam is the Data and AI Trust Company, specializing in helping organizations ensure their data and AI are fully understood, secured, and resilient to enable the acceleration of safe AI at scale. As the market leader in both data resilience and data security posture management, Veeam is built for the convergence of identity, data, security, and AI risk. Headquartered in Seattle with offices in more than 30 countries, Veeam protects over 550,000 customers worldwide, who trust Veeam to keep their businesses running. Join us as we go fearlessly forward together, growing, learning, and making a real impact for some of the world’s biggest brands.

About the Role

We are looking for a Staff Site Reliability Engineer. In this role, you will serve as a hands-on technical leader within the SRE team, guiding senior engineers, influencing product development teams, and ensuring the systems we operate are built to be reliable, scalable, and observable from the ground up. You will drive strategic initiatives, mentor others in the practice of SRE, and help define architectural best practices across our platform. This role is pivotal in aligning teams, enforcing high standards, and scaling SRE principles globally within Veeam.

What You’ll Do

Reliability Engineering:

  • Drive adherence across engineering teams
  • Collaborate with Staff peers across teams to align strategy and champion shared reliability standards and goals
  • Partner with development and product teams to proactively design for failure, build resilient architecture, and operationalize reliability from the start

Observability & Operational Excellence:

  • Drive company-wide adoption of observability best practices and tooling
  • Ensure metrics, logs, and traces provide deep, actionable insights across systems
  • Lead complex incident responses, postmortems, and systemic reliability improvements
  • Promote and enforce a blameless culture of learning and continuous improvement

Engineering at Scale:

  • Lead initiatives in infrastructure as code, deployment auto

How to Apply on Greenhouse

  • Create a Greenhouse profile before applying — it saves time across multiple applications.
  • Upload your resume as a PDF; the parser handles it better than Word.
  • Answer all knockout questions carefully — wrong answers auto-reject before a human sees you.
  • Enable email notifications to track application status in real time.
