Amazon Data Services, Inc.

Technology

Support Engineer, AWS Incident Response

$90–158k · Seattle, Washington, United States · Full time

The Brief

“Support Engineer, AWS Incident Response at Amazon Data Services, Inc. Skills: Incident response, Distributed systems, Cloud infrastructure, Automation. Lead incident response calls. Triage complex failures.”

What You'll Achieve.

Strengthen AWS; Faster detection; Accurate detection; Reduce time-to-detection; Reduce time-to-mitigation; Improve team efficiency

Industry & Context.

Technology

Problems You'll Solve

Troubleshoot complex technical problems; Root cause analysis; Data-driven decision making

Eligibility Requirements

On-call rotation, Weekends, Holidays

What They're Looking For.

Must Have

2+ years technical support, Incident response experience, Linux understanding, Networking fundamentals, Distributed systems understanding, Operational monitoring experience, Troubleshoot complex problems, Scripting or programming experience, Clear technical communication

Nice to Have

Incident management tooling familiarity, AWS services experience, Cloud infrastructure experience, Generative AI experience, Automation experience, Author post-incident analyses, Build operational dashboards, Build runbooks, Build automation, Coordinate global teams

What You'll Do.

Lead incident response calls

Triage complex failures

Coordinate resolver teams

Drive incidents to mitigation

Ensure accurate documentation

Run operational health reviews

Obsess over detection accuracy

Obsess over detection speed

Detect patterns across events

Drive proactive mechanisms

Deep-dive operational data

Identify systemic issues

Measure response effectiveness

Prioritize improvements

Use metrics to tell the story

Identify gaps in processes

Identify gaps in documentation

Identify gaps in tooling

Build mechanisms to reduce time-to-detection

Build mechanisms to reduce time-to-mitigation

Improve mechanisms to reduce time-to-detection

Improve mechanisms to reduce time-to-mitigation

Use data to prioritize effort

Leverage generative AI

Accelerate incident response

Identify AI opportunities

Augment human judgment

Surface insights from data

Ensure incidents drive improvements

Work with service teams

Ensure learnings drive corrective actions

Ensure follow-through happens

Close the loop between what broke and what got fixed

How You'll Work.

Team & Collaboration

Across AWS service teams

Communication Scope

Clear communication; Technical communication

Process & Methodology

Incident management
