Trigger.dev
AI Agents and Workflows
Senior Site Reliability Engineer
“Senior Site Reliability Engineer at Trigger.dev. Skills: Site Reliability Engineering, Distributed Systems, Cloud-native. Own observability. Extend OpenTelemetry instrumentation”
What You'll Achieve.
Keep platform fast; Keep platform observable; Keep platform hard to break; Scale platform; Handle hundreds of millions of executions per month; Handle next order of magnitude; Run untrusted user code; Run untrusted customer code; High throughput; Meaningful repo scale; Meaningful team scale
Industry & Context.
Chasing bottlenecks; Hardening services; Making platform legible; Performance and scaling debugging
On call
What They're Looking For.
Must Have
OpenTelemetry, Prometheus, Distributed systems, Kubernetes, Postgres, Redis, Go, Linux, AWS, On call
Nice to Have
Container orchestration, MicroVMs, Firecracker, gVisor, Node.js, TypeScript, Remix, React, SDKs, Developer tools company, Commercial open source company, Venture-backed startup founder
What You'll Do.
Extend OpenTelemetry instrumentation
Design distributed systems primitives
Operate distributed systems primitives
Architect auto-scaling infrastructure
Tune auto-scaling infrastructure
Harden security posture
Own Terraform and IaC
Work on runtime internals
Design on-call practice
Make engineering faster
Make engineering safer
Contribute to architectural decisions
Contribute to technical roadmap
How You'll Work.
Team & Collaboration
Work across open source codebase; Work across Cloud product; Help customers; Review PRs; Create issues; Write docs; Create content; Align on culture fit
Full Job Description
ABOUT TRIGGER.DEV
Trigger.dev is a developer platform for building and running AI agents and workflows. We provide everything needed to create production-grade agents: an SDK, deploying, scaling, monitoring, and debugging them without needing to manage any infrastructure. Our Cloud product is a managed service where we deploy our users' code and auto-scale from zero to millions of executions. Today, we serve thousands of teams building AI apps and agents, handling hundreds of millions of executions per month.

ABOUT THE POSITION
We're hiring a Senior Site Reliability Engineer to keep Trigger.dev fast, observable and hard to break as we scale. You'll work across our open source codebase and the Cloud product that runs it in production. We're handling hundreds of millions of executions a month on infrastructure we run ourselves, and the next order of magnitude needs someone who thinks in distributed systems and treats observability and security as part of the product, not bolted on later. Day to day you'll be chasing bottlenecks, hardening services like the sandbox runtime that executes untrusted user code, and making the platform legible to the engineers running it at 3am.

WHAT YOU'LL BE DOING
You'll do a variety of things including:
- Owning observability across the platform. Extending our OpenTelemetry instrumentation, sanding down noisy signals, and making metrics, logs and traces something engineers actually reach for during incidents.
- Designing and operating the distributed systems primitives we lean on (queues, schedulers, checkpoints, idempotency, backpressure) under real production load.
- Architecting and tuning the auto-scaling infrastructure that runs untrusted customer code at high throughput.
- Hunting bottlenecks across the stack, from Postgres query plans and Redis hot keys down to kernel, cgroup and network behaviour.
- Hardening the security posture of our multi-tenant runtime: sandbox isolation, secrets handlin
How to Apply on Ashby
- Ashby is a fast modern ATS — most applications take under 3 minutes.
- The resume parser is strong; verify parsed experience dates and job titles.
- Custom screening questions are often scored algorithmically — answer completely.
- Location field affects geo-based screening; use your actual metro area.