MongoDB

Staff Technical Program Manager, Site Reliability Engineering

Dublin, Ireland · Remote Friendly

Market Sentiment

HIGH DEMAND

Neural analysis suggests this role is optimal for Senior candidates.

The Brief

“Staff Technical Program Manager, Site Reliability Engineering at MongoDB. Skills: Technical Program Management, Site Reliability Engineering, Cloud Platform, Cross-functional Coordination. Drive program planning & execution. Define program scope, milestones, success criteria”

What You'll Achieve.

Smoother launches; clearer roadmaps; stronger reliability metrics; an SRE organization better equipped to deliver predictability at scale

Industry & Context.

Problems you'll solve

Own and solve hard, ambiguous problems end to end

What They're Looking For.

Must Have

8+ years in technical program management, engineering management, or a comparable technical role partnering with software engineering teams

Proven track record leading large-scale, cross-team platform initiatives through ambiguity and change

Strong knowledge of production change management, software development lifecycle, and reliability metrics (SLOs, SLIs)

Skilled at shaping roadmaps and managing dependencies

Able to query and interpret metrics, logs, or other data sources to inform decisions and communicate risk

Excellent communicator (clear, concise, and calm) across engineers, cross-functional partners, and executives

Low-ego, highly collaborative, and motivated by ownership of hard problems end to end

Nice to Have

Hands-on or close-partner experience with Kubernetes, cloud networking, or observability stacks (metrics, logs, tracing, alerting)

Prior experience working with or alongside SRE teams

Background in large-scale cloud infrastructure or platform engineering

Familiarity with MongoDB Atlas or other modern cloud database platforms

What You'll Do.

Drive program planning & execution

Manage dependencies across platform teams

Strengthen production reliability

Lead change management

Lead launch readiness programs

Partner with SREs and product teams

Define and operationalize SLOs/SLIs

Drive prioritization and continuous improvement

Lead cross-functional coordination

Align SRE with Security

Coordinate cross-team incident response

Ensure clear follow-through

Build trust as driver of complex efforts

Build scalable systems & processes

Design lightweight frameworks

Design communication patterns

Help SRE deliver reliably at scale

Empower teams to execute independently

How You'll Work.

Team & Collaboration

Partner with SRE leaders and engineers; Coordinate cross-functional efforts; Align SRE with Security, Compliance, Cloud platform, and other engineering teams; Coordinate cross-team incident response; Build trust as the go-to driver of complex, multi-team efforts; Build together with SRE, engineering, Security, and Compliance; Co-create solutions that work across the organization

Communication Scope

Excellent communicator—clear, concise, and calm—across engineers, cross-functional partners, and executives

Process & Methodology

Program planning, program execution, milestone definition, success criteria definition, dependency management, roadmap shaping, change management, launch readiness programs, prioritization, continuous improvement, cross-functional coordination, incident response coordination, process design, framework design

Full Job Description

As a TPM for SRE, you will partner with SRE leaders and engineers to scale the platform that underpins all of MongoDB’s cloud products. You will drive program execution, strengthen production reliability practices, and coordinate cross-functional efforts across US and EMEA teams. Success in this role means smoother launches, clearer roadmaps, stronger reliability metrics and an SRE organization that's better-equipped to deliver predictability at scale. This role can be based out of our Dublin or Cork office or remotely in Ireland.

What You'll Do

Drive Program Planning & Execution: Define program scope, milestones, and success criteria with SRE engineers and leaders. Manage dependencies across platform teams, keep work clearly tracked in Jira, and deliver on time.

Strengthen Production Reliability: Lead change management and launch readiness programs. Partner with SREs and product teams to define and operationalize SLOs/SLIs, and use incident data, metrics, and capacity signals to drive prioritization and continuous improvement.

Lead Cross-Functional Coordination: Align SRE with Security, Compliance, Cloud platform, and other engineering teams. Coordinate cross-team incident response, ensure clear follow-through, and build trust as the go-to driver of complex, multi-team efforts.

Build Scalable Systems & Processes: Design lightweight frameworks and communication patterns that help SRE deliver reliably at scale. Work yourself out of the "hero" role by leaving teams better-equipped to execute independently.

Requirements

8+ years in technical program management, engineering management, or a comparable technical role partnering with software engineering teams

Proven track record leading large-scale, cross-team platform initiatives through ambiguity and change

Strong knowledge of production change management, software development lifecycle, and reliability metrics (SLOs, SLIs)

Skilled at shaping roadmaps and managing dependencies

Able to query and interpret metrics, logs,

SKILL SIGNAL 35 detected · ranked by frequency

Technical Program Management ×5

Kubernetes ×4

Cloud networking ×4

Cloud infrastructure ×4

Platform engineering ×4

Engineering management ×3

Platform initiatives ×3

Production change management ×3

Software development lifecycle ×3

Reliability metrics ×3

SLOs ×3

SLIs ×3

Metrics analysis ×3

Log analysis ×3

Data analysis ×3

Risk assessment ×3

Observability ×3

Database platforms ×3

Site Reliability Engineering ×2

Cloud Platform ×2

Cross-functional Coordination ×2

Observability stacks

MongoDB Atlas

Cloud database

Production reliability practices

Change management

Launch readiness

SLOs/SLIs

Incident response

Capacity planning

Roadmap shaping

Dependency management

Role Details

Experience 8–10 yrs

Level Senior

Work Mode Hybrid

Category pto-site-reliability-engineering

AI-Extracted Insights

Domain Areas

site-reliability-engineering, cloud-products, production-reliability, scalable-systems, cloud-infrastructure, platform-engineering, modern-cloud-databases
