Amazon Data Services, Inc.
Technology
Support Engineer, AWS Incident Response
“Support Engineer, AWS Incident Response at Amazon Data Services, Inc. Skills: Incident response, Distributed systems, Cloud infrastructure, Automation. Lead incident response calls. Triage complex failures.”
What You'll Achieve.
Strengthen AWS; Faster detection; Accurate detection; Reduce time-to-detection; Reduce time-to-mitigation; Improve team efficiency
Industry & Context.
Troubleshoot complex technical problems; Root cause analysis; Data-driven decision making
On-call rotation, Weekends, Holidays
What They're Looking For.
Must Have
2+ years technical support, Incident response experience, Linux understanding, Networking fundamentals, Distributed systems understanding, Operational monitoring experience, Troubleshoot complex problems, Scripting or programming experience, Clear technical communication
Nice to Have
Incident management tooling familiarity, AWS services experience, Cloud infrastructure experience, Generative AI experience, Automation experience, Author post-incident analyses, Build operational dashboards, Build runbooks, Build automation, Coordinate global teams
What You'll Do.
Lead incident response calls
Triage complex failures
Coordinate resolver teams
Drive incidents to mitigation
Ensure accurate documentation
Run operational health reviews
Obsess over detection accuracy
Obsess over detection speed
Detect patterns across events
Drive proactive mechanisms
Deep-dive operational data
Identify systemic issues
Measure response effectiveness
Prioritize improvements
Use metrics to tell the story
Identify gaps in processes
Identify gaps in documentation
Identify gaps in tooling
Build mechanisms to reduce time-to-detection
Build mechanisms to reduce time-to-mitigation
Improve mechanisms to reduce time-to-detection
Improve mechanisms to reduce time-to-mitigation
Use data to prioritize effort
Leverage generative AI
Accelerate incident response
Identify AI opportunities
Augment human judgment
Surface insights from data
Ensure incidents drive improvements
Work with service teams
Ensure learnings drive corrective actions
Ensure follow-through happens
Close the loop between what broke and what was fixed
How You'll Work.
Team & Collaboration
Across AWS service teams
Communication Scope
Clear communication; Technical communication
Process & Methodology
Incident management