Amazon UK Services Ltd.

Fulfillment Operations Management, Ops Engineering, fulfillment ops

Reliability Engineer, Global Reliability Intelligence Programs

£85–130k (AI estimate) · London, England, United Kingdom · Full time

Market Sentiment

HIGH DEMAND

Neural analysis suggests this role is optimal for Mid+ candidates.

The Brief

“Reliability Engineer, Global Reliability Intelligence Programs at Amazon UK Services Ltd. Skills: Reliability Engineering, Root Cause Analysis, Failure Modes Analysis, Data Analysis. Lead Root Cause Analysis. Develop Failure Modes and Effects Analysis”

What You'll Achieve.

Eliminate true causes of failures; Prevent failures from happening again; Drive measurable improvements in uptime; Drive measurable improvements in performance; Identify risks early; Build smarter systems; Build more reliable systems

Industry & Context.

Fulfillment Operations Management, Ops Engineering, fulfillment ops

Problems you'll solve

Root cause analysis; Failure modes analysis; Data analysis; Troubleshooting; Diagnostics; Problem solving

Eligibility Requirements

Up to 50% travel

What They're Looking For.

Must Have

Bachelor's degree, Advanced Microsoft Excel, Data scripting languages, BI analytics tools, Large-scale data mining, Data for root cause analysis, Predictive and preventative maintenance, DevOps, Serverless, Software development and design, CI/CD, AI/ML, Storage, Networking, Databases, Infrastructure automation, Agile development, Software architecture/patterns, Modern cloud services, Written and verbal communication

Nice to Have

API design, Cloud architecture/deployment, Service-oriented architecture, Mobile development, Performance optimization, Database design, Data modeling, Data pipeline design, Industry tools and scripting languages, Full software development lifecycle, Architecture and design, Software development, Automation, Version control tools, Network troubleshooting tools, System architecture, Scalability, Reliability, Performance in database environment, Research methodologies, Machine learning algorithms, Business-critical patterns, New metrics development, Improve business tools and processes

What You'll Do.

Lead Root Cause Analysis

Develop Failure Modes and Effects Analysis

Maintain Failure Modes and Effects Analysis

Improve Failure Modes and Effects Analysis

Analyze equipment and operational data

Identify systemic issues

Identify performance gaps

Translate findings into improvements

Maintain BI dashboards

Build automated reports

Maintain automated reports

Build performance metrics

Maintain performance metrics

Lead cross-functional execution

Partner with operations

Partner with engineering

Partner with maintenance

Partner with external vendors

Drive development of RCA/FMEA tools

Drive enhancement of RCA/FMEA tools

Work with DevOps teams

Work with technical teams

Test RCA/FMEA software

Collect user feedback

Establish reliability best practices

Standardize reliability best practices

Support policy creation

Support organizational adoption

Refine tools and systems

Improve tools and systems

Analyze failure trends

Identify recurring issues

Identify systemic gaps

Identify opportunities to improve reliability

Identify opportunities to improve performance

Support FMEA initiatives

Help teams identify risks

Help teams implement mitigation

Review high-impact events

Review completed RCAs

Ensure actionable outcomes

Collaborate with engineers

Collaborate with operators

Collaborate with vendors

Align corrective actions

Drive execution of corrective actions

Strengthen organizational learning

Strengthen failure prevention

How You'll Work.

Team & Collaboration

Cross-functional teams; DevOps teams; Technical teams; Operations teams; Engineering teams; Maintenance teams; External vendors; Across regions

Communication Scope

Present complex technical information; Clear and concise communication

Process & Methodology

Agile development

Full Job Description

A Reliability Engineer focused on RCA and FMEA hunts down the true causes of failures and eliminates them before they happen again. They lead high-impact investigations, turn data into clear actions, and drive measurable improvements in uptime and performance. This role also gets ahead of problems by identifying risks early through FMEA and building smarter, more reliable systems. If you enjoy solving complex problems, influencing decisions, and delivering real results at scale, this is where you do it. This role may require up to 50% travel.

Key job responsibilities

• Lead Root Cause Analysis (RCA) for high-impact and recurring failures, driving deep-dive investigations to identify true root causes and ensure effective, lasting corrective actions
• Develop, maintain, and continuously improve Failure Modes and Effects Analysis (FMEA) to proactively identify risks, prioritize mitigation, and prevent future failures
• Analyze equipment and operational data to identify trends, systemic issues, and performance gaps, translating findings into actionable reliability improvements
• Build and maintain BI dashboards, automated reports, and performance metrics (e.g., uptime, MTBF, failure rates) to enable data-driven decision-making
• Lead cross-functional execution of reliability improvements by partnering with operations, engineering, maintenance, and external vendors across multiple sites and regions
• Drive development and enhancement of RCA/FMEA tools and software by working closely with DevOps and technical teams, including requirements gathering, testing, and user feedback
• Establish and standardize reliability best practices, while supporting policy creation, training, and organizational adoption of RCA and FMEA methodologies

A day in the life

In this role, you will partner closely with DevOps teams to refine and improve tools and systems that support RCA and FMEA at scale. You will analyze failure trends to identify recurring issues, systemic gaps, and opportunities

SKILL SIGNAL 78 detected · ranked by frequency

Root Cause Analysis ×6

Data modeling ×4

Data pipeline design ×4

Network troubleshooting ×4

System architecture ×4

Scalability ×4

Reliability ×4

Research methodologies ×4

Trend analysis ×4

Reliability Engineering ×3

Failure Modes Analysis ×3

Data Analysis ×3

Advanced Excel ×3

Data scripting ×3

Statistical software ×3

BI analytics ×3

Data mining ×3

Predictive maintenance ×3

Preventative maintenance ×3

Troubleshooting ×3

Diagnostics ×3

Infrastructure automation ×3

Agile development ×3

Software architecture ×3

Cloud services ×3

Software development ×3

Version control ×3

Performance ×3

Database management ×3

Machine learning algorithms ×3

Forecasting ×3

SQL ×2

Role Details

Work Mode Onsite

Type FULL TIME

Salary Band 75k-100k

AI-Extracted Insights

Domain Areas

material-handling-equipment, automated-conveyor-systems, devops, serverless, software-development, design, ci-cd, ai-ml
