Yahoo
SoftwareEngineer,SRETooling&ReliabilityPlatforms
“Software Engineer, SRE Tooling & Reliability Platforms at Yahoo. Skills: Software Development, Site Reliability Engineering (SRE), AI/ML, Cloud Infrastructure. Build, deploy, and maintain high-availability services that integrate our core Incident Management tools with Yahoo’s internal ecosystem. Develop new AI services and agents that ingest massive amounts of telemetry to provide on-call engineers with real-time summaries, historical context, and automated root-cause hypotheses”
What You'll Achieve.
Increase engineering velocity; Ensure the resilience of our global platforms; Provide service teams with instant context and automated triaging capabilities during high-pressure incidents; Turning noise into actionable insights
Industry & Context.
Problem Solver: You enjoy digging into how complex systems work and finding ways to make them better through automation.
May occasionally be asked to attend in-person events or team sessions with notice.
What They're Looking For.
Must Have
Proficiency in at least one programming language (Python or Go preferred), Working knowledge of AI-assisted development tools (e. g. , GitHub Copilot, Amazon CodeWhisperer, or Cursor), Understanding of Linux/Unix environments and basic networking concepts, Familiarity with Git and version control
Nice to Have
Experience with Cloud providers (AWS, GCP), Familiarity with Docker, Kubernetes, or serverless architectures, Demonstrated experience using Prompt Engineering to assist in debugging complex system failures or generating technical documentation, Interest in Machine Learning applications for DevOps and site reliability
What You'll Do.
and maintain high-availability services that integrate our core Incident Management tools with Yahoo’s internal ecosystem
Develop new AI services and agents that ingest massive amounts of telemetry to provide on-call engineers with real-time summaries
and automated root-cause hypotheses
Use Infrastructure as Code (Terraform/CloudFormation) to manage serverless infrastructure
ensuring reliability tools are as resilient as the services they monitor
Identify and implement AI-driven efficiencies in your day-to-day development
repetitive tasks with automated or AI-assisted workflows
Leverage AI pair-programming tools to accelerate code reviews and ensure high unit-test coverage for all new reliability services
Verify and validate AI-generated code and infrastructure outputs to ensure they meet Yahoo’s security and resilience standards
How You'll Work.
Team & Collaboration
Collaborative: You enjoy working in a team environment and are excited to learn from senior engineers and architects while proactively suggesting process improvements.
Applying for this Software Engineer, SRE Tooling & Reliability Platforms role?
Most applicants get filtered before a human reads their resume. See if yours makes the cut.
How to Apply on Workday
- Workday has a multi-step form — save your progress after every section.
- "Apply With LinkedIn" can fail or lose data; manual entry is more reliable.
- Watch for the "Submit for Review" final step — hitting "Save" alone does not submit.
- Job requisition numbers are useful when following up with HR by email.
ANONYMOUS · UNFILTERED
What do employees actually say about Yahoo?
Real rants from real employees. Read before you apply.