Dropbox

Technology

Staff Site Reliability Engineer, Production Engineering

CA$205–277k Canada

Market Sentiment

HIGH DEMAND

Neural analysis suggests this role is optimal for Staff candidates.

The Brief

“Staff Site Reliability Engineer, Production Engineering at Dropbox. Skills: Reliability strategy, Production engineering, Observability, Incident response. Define and evolve company-wide technical reliability strategy. Support the changing engineering environment created by AI-assisted and agentic software development.”

What You'll Achieve.

Advance Dropbox’s stability, observability, incident response, and operational excellence; Prepare Dropbox for increases in pull request volume, service complexity, incident patterns, and demand for debugging and monitoring tools; Raise the bar for reliability; Guide long-term platform investments; Ensure Dropbox continues to deliver dependable experiences

Industry & Context.

Technology

Problems you'll solve

Diagnose complex technical problems; Debug production systems; Troubleshooting

Eligibility Requirements

On-call rotation

What They're Looking For.

Must Have

BS degree in Computer Science or related technical field; 12+ years of experience in software engineering, site reliability engineering, infrastructure engineering, or related technical roles; Proven ability to define and deliver multi-year, multi-team reliability, infrastructure, or platform strategies; Deep experience with distributed systems, production operations, observability, incident response, SLOs/SLAs, debugging, and reliability risk management; Demonstrated ability to diagnose complex technical problems, debug production systems, automate operational workflows, and design resilient software components; Experience influencing engineering roadmaps across multiple teams; Making technical decisions that optimize for the broader engineering organization

Nice to Have

Experience adapting reliability strategies, developer tooling, or operational processes for AI-assisted software development workflows; Experience building or scaling observability, debugging, incident management, or developer productivity platforms for large engineering organizations; Experience leading reliability improvements in environments with high deployment velocity, complex service dependencies, and large-scale production systems; Track record of mentoring senior engineers, setting technical standards, and spreading reliability best practices; Familiarity with AI-enabled tooling, agentic development workflows, or operational risks introduced by rapid automation in the software development lifecycle

What You'll Do.

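Define and evolve company-wide technical reliability strategy to support the changing engineering environment created by AI-assisted and agentic software development

Set multi-year reliability goals, standards, and roadmaps across observability, debugging, incident management, service health, and operational readiness

Lead cross-team initiatives that reduce reliability risk as software delivery velocity, pull request volume, service complexity, and incident volume increase

Partner with engineering leaders and platform teams to improve monitoring, alerting, debugging, SLOs, SLAs, and incident response systems at company scale

Identify emerging reliability risks introduced by AI-enabled development workflows, and design scalable systems, processes, and guardrails to mitigate them

Provide technical leadership and mentorship to engineers across teams, raising engineering quality, reliability judgment, and operational excellence

Drive clear communication and alignment with senior stakeholders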

How You'll Work.

Team & Collaboration

Partner across Engineering, Product, and leadership teams; Partner with engineering leaders and platform teams; Alignment with senior stakeholders

Communication Scope

Clear communication; Alignment with senior stakeholders

Process & Methodology

Roadmaps

Full Job Description

Role Description

As a Site Reliability Engineer focused on company-wide reliability strategy, you will play a crucial role in advancing Dropbox’s stability, observability, incident response, and operational excellence as AI technologies reshape how software is built and operated. You will help define the reliability strategy for a new chapter of agentic development and AI-enabled software delivery, including preparing Dropbox for increases in pull request volume, service complexity, incident patterns, and demand for debugging and monitoring tools. You will partner across Engineering, Product, and leadership teams to raise the bar for reliability, guide long-term platform investments, and ensure Dropbox continues to deliver dependable experiences for millions of users.

Our Engineering Career Framework is viewable by anyone outside the company and describes what’s expected for our engineers at each of our career levels. Check out our blog post on this topic and more here.

Responsibilities

Define and evolve Dropbox’s company-wide technical reliability strategy to support the changing engineering environment created by AI-assisted and agentic software development.

Set multi-year reliability goals, standards, and roadmaps across observability, debugging, incident management, service health, and operational readiness.

Lead cross-team initiatives that reduce reliability risk as software delivery velocity, pull request volume, service complexity, and incident volume increase.

Partner with engineering leaders and platform teams to improve monitoring, alerting, debugging, SLOs, SLAs, and incident response systems at company scale.

Identify emerging reliability risks introduced by AI-enabled development workflows and design scalable systems, processes, and guardrails to mitigate them.

Provide technical leadership and mentorship to engineers across teams, raising engineering quality, reliability judgment, and operational excellence.

Drive clear communication and alignment with

SKILL SIGNAL 25 detected · ranked by frequency

Technical leadership ×4

Operational excellence ×4

Reliability strategy ×3

Observability ×3

Incident response ×3

Coding ×3

Debugging production systems ×3

Automate operational workflows ×3

Reliability judgment ×3

Production engineering ×2

Reliability risk management ×2

Software development

Site reliability engineering

Infrastructure engineering

Distributed systems

Production operations

SLOs

SLAs

Debugging

Resilient software components

AI-assisted software development

Agentic development

AI-enabled development workflows

Platform strategies

Technical standards

BEHAVIOURAL

Leadership; Mentorship

Role Details

Experience 8–15 yrs

Level Staff

Category cloud-platform-(sub-team)

Salary Band 200k+

AI-Extracted Insights

Domain Areas

distributed-systems; production-operations; observability; incident-response; slos; slas; reliability-risk-management; ai-assisted-software-development
