Dropbox

Technology

Staff Site Reliability Engineer, Production Engineering

CA$199–302k Canada
Market Sentiment
HIGH DEMAND

Neural analysis suggests this role is optimal for Staff candidates.

The Brief

“Staff Site Reliability Engineer, Production Engineering at Dropbox. Skills: Reliability strategy, Production engineering, Observability, Incident response. Define and evolve technical reliability strategy.”

Industry & Context.

Technology

Problems you'll solve

Diagnose technical problems; Debug production systems; Troubleshooting

Eligibility Requirements

On-call rotation

What They're Looking For.

Must Have

BS degree in Computer Science or related field, 12+ years of experience in software engineering, Deep experience with distributed systems, Deep experience with production operations, Deep experience with observability, Deep experience with incident response, Deep experience with SLOs/SLAs, Deep experience with debugging, Deep experience with reliability risk management, Diagnose complex technical problems, Debug production systems, Automate operational workflows, Design resilient software components, Experience influencing engineering roadmaps, Make technical decisions

Nice to Have

Experience adapting reliability strategies for AI, Experience adapting developer tooling for AI, Experience adapting operational processes for AI, Experience building observability platforms, Experience scaling observability platforms, Experience building debugging platforms, Experience scaling debugging platforms, Experience building incident management platforms, Experience scaling incident management platforms, Experience building developer productivity platforms, Experience scaling developer productivity platforms, Experience leading reliability improvements, Track record of mentoring senior engineers, Track record of setting technical standards, Track record of spreading reliability best practices, Familiarity with AI-enabled tooling, Familiarity with agentic development workflows, Familiarity with operational risks from automation

What You'll Do.

Define technical reliability strategy

Evolve technical reliability strategy

Set reliability goals

Set reliability standards

Set reliability roadmaps

Lead cross-team initiatives

Reduce reliability risk

Partner to improve monitoring

Partner to improve alerting

Partner to improve debugging

Partner to improve SLOs

Partner to improve SLAs

Partner to improve incident response systems

Identify reliability risks

Design scalable systems

Design scalable processes

Design scalable guardrails

Provide technical leadership

Raise engineering quality

Raise reliability judgment

Raise operational excellence

Manage reliability priorities

Manage reliability tradeoffs

Manage reliability risks

Manage execution progress

How You'll Work.

Team & Collaboration

Partner across Engineering; Partner across Product; Partner across leadership; Partner with engineering leaders; Partner with platform teams

Communication Scope

Communication; Alignment; Stakeholder management

Process & Methodology

Roadmaps

Full Job Description

Role Description

As a Site Reliability Engineer focused on company-wide reliability strategy, you will play a crucial role in advancing Dropbox’s stability, observability, incident response, and operational excellence as AI technologies reshape how software is built and operated. You will help define the reliability strategy for a new chapter of agentic development and AI-enabled software delivery, including preparing Dropbox for increases in pull request volume, service complexity, incident patterns, and demand for debugging and monitoring tools. You will partner across Engineering, Product, and leadership teams to raise the bar for reliability, guide long-term platform investments, and ensure Dropbox continues to deliver dependable experiences for millions of users.

Our Engineering Career Framework is viewable by anyone outside the company and describes what’s expected for our engineers at each of our career levels. Check out our blog post on this topic and more here.

Responsibilities

Define and evolve Dropbox’s company-wide technical reliability strategy to support the changing engineering environment created by AI-assisted and agentic software development.

Set multi-year reliability goals, standards, and roadmaps across observability, debugging, incident management, service health, and operational readiness.

Lead cross-team initiatives that reduce reliability risk as software delivery velocity, pull request volume, service complexity, and incident volume increase.

Partner with engineering leaders and platform teams to improve monitoring, alerting, debugging, SLOs, SLAs, and incident response systems at company scale.

Identify emerging reliability risks introduced by AI-enabled development workflows and design scalable systems, processes, and guardrails to mitigate them.

Provide technical leadership and mentorship to engineers across teams, raising engineering quality, reliability judgment, and operational excellence.

Drive clear communication and alignment with
