Signal AI
AI technology
Site Reliability Engineer
Neural analysis suggests this role is
optimal for Mid+ candidates.
“Site Reliability Engineer at Signal AI. Skills: AWS, Terraform, Python, Go. Run infrastructure. Evolve infrastructure”
What You'll Achieve.
Reduce cost; Build observability; Scale work; Absorb infrastructure; Make things better; Achieve measurable result
Industry & Context.
Solve operational problems; Think in distributed systems; Take problems end-to-end; Pragmatic about AI tooling; Tell when to reach for LLM; Clear reason; Know where to look
On-call shift
What They're Looking For.
Must Have
AWS, Terraform, Python, Go, distributed systems, failure modes, observability, blast radius, AI tooling
Nice to Have
Networking depth, TCP/IP fundamentals, DNS, VPC design, Operational security instincts, Linux internals comfort, Communication across technical levels
What You'll Do.
Evolve infrastructure
Absorb infrastructure
Integrate acquisition infrastructure
Consolidate batch jobs
Onboard observability stack
Complete on-call shift
Work with Claude Enterprise
How You'll Work.
Team & Collaboration
Join Infrastructure team; Collaborate with product; Work with infrastructure teammates; Adapt naturally
Communication Scope
Communicate openly; Push back; Communicate concepts clearly; Adapt naturally
Process & Methodology
Own workstream end-to-end; Drive multi-quarter workstream
Full Job Description
We're on a mission to change the way businesses make decisions with our cutting-edge AI technology. To achieve that, we’re looking for passionate people to join our open and inclusive workplace. Our inclusive environment welcomes skills and experiences from diverse backgrounds, and defines who we are.

We're hiring an SRE to help us run and evolve the infrastructure behind Signal AI's decision intelligence platform. You'd be joining a small, collaborative Infrastructure team at a moment when the work is genuinely changing shape. Over the last year we've hardened the platform, reduced cost, and built serious observability into our highest-volume systems. The next year is about scaling that work, absorbing infrastructure from a recent acquisition, and being thoughtful about how AI shows up in operational work: not as a gimmick, but as a tool we trust ourselves to use well.

We're looking for someone who wants to shape the direction of the team; someone who brings curiosity and care to the work, and who wants to leave things meaningfully better than they found them.

What we've shipped recently
- Cut ~$50k/year off our Elasticsearch bill by migrating compute to more efficient chips. (Apr 2026)
- Built the foundation for our MCP server platform: leveraging and contributing to open-source tooling to give the whole company extensible, production-grade AI integrations. (2025–2026)
- Rebuilt production from scratch in a full DR gameday. End-to-end restore validated across our multi-account AWS setup. (Jan 2026)

What we're working on next
- AI-augmented operations: Claude Enterprise is deployed across Signal. We want this team to help define what good looks like for SRE: incident triage, runbook generation, capacity planning, cost analysis. This is a strategic investment, not a side project, and we'd love someone genuinely curious about what these tools can and can't do.
- Security in the age of AI: The threat landscape has shifted. Supply chain security is more at t
How to Apply on Ashby
- Ashby is a fast modern ATS — most applications take under 3 minutes.
- The resume parser is strong; verify parsed experience dates and job titles.
- Custom screening questions are often scored algorithmically — answer completely.
- Location field affects geo-based screening; use your actual metro area.