Jagex
Gaming
Senior Cloud Reliability Engineer
“Senior Cloud Reliability Engineer at Jagex. Skills: Cloud Reliability Engineering, AWS Expertise, Cloud-Native Architectures, Infrastructure as Code, Observability, Automation, Linux Systems Administration. Keep RuneScape reliable, scalable and high-performing for players around the world. Work across Game, Central Tech and Cloud Platform teams to improve reliability, observability, automation and cloud-native adoption on Jagex’s hybrid-cloud platform”
What You'll Achieve.
Improve reliability, observability, automation and cloud-native adoption; Improve resilience, security and cost efficiency across live environments; Modernise safely without compromising uptime; Make service reliability measurable and better understood across teams; Give teams faster insight into issues; Reduce time to detection; Reduce toil and improve recovery; Raise engineering standards
Industry & Context.
Root cause analysis; Reducing time to detection; Self-healing mechanisms
Based (or willing to relocate) within a comfortable commuting distance of our office to attend onsite as required
What They're Looking For.
Must Have
Proven experience owning reliability for large-scale, internet-facing services in production; Demonstrable AWS expertise across services such as VPC, EC2, ECS/EKS, ELB, ECR, Route53, KMS, IAM and Systems Manager; Proven capability in cloud-native design, workload modernisation and Infrastructure as Code delivery; Practical experience with SLIs, SLOs, incident response, root cause analysis and resilient system design; Demonstrable production experience with Debian-based Linux environments, virtual machine fleet management and configuration management tooling; Hands-on experience with observability platforms, CI/CD, containerisation and programming or scripting in Python or Java; Permanent right to work in the UK
Nice to Have
Cloud-native adoption on Jagex’s hybrid-cloud platform; Modernise services that directly affect player experience; Shape how Jagex delivers reliable live services at scale
What You'll Do.
Keep RuneScape reliable, scalable and high-performing for players around the world
Work across Game, Central Tech and Cloud Platform teams to improve reliability, observability, automation and cloud-native adoption on Jagex’s hybrid-cloud platform
Move services toward cloud-native architectures, improving resilience, security and cost efficiency across live environments
Support the migration of workloads from managed VPS environments onto Jagex’s cloud platform
Define, embed and improve SLIs, SLOs and error-budget thinking
Design and enhance observability and alerting across logs, metrics and traces
Automate operational tasks such as scaling, failover and deployments
Build self-healing mechanisms that reduce toil and improve recovery
Contribute hands-on reliability improvements across Linux-based production systems, reusable IaC modules and team codebases
How You'll Work.
Team & Collaboration
Partner with game and development teams; Work across Game, Central Tech and Cloud Platform teams; Help raise engineering standards across Cloud Tech; Embrace Fellowship by collaborating and sharing openly
Communication Scope
Share openly
Full Job Description
**Location: Cambridge, UK** – Applicants should be based (or willing to relocate) within a comfortable commuting distance of our office to attend onsite as required.

As a Senior Cloud Reliability Engineer in Cloud Tech, you’ll help keep RuneScape reliable, scalable and high-performing for players around the world. You’ll work across Game, Central Tech and Cloud Platform teams to improve reliability, observability, automation and cloud-native adoption on Jagex’s hybrid-cloud platform.

This is a role with real breadth: hands-on production engineering, architecture influence across multiple teams, and the chance to modernise services that directly affect player experience. You’ll join a highly experienced team and help shape how Jagex delivers reliable live services at scale.

**What you’ll be doing**

* Partner with game and development teams to move services toward cloud-native architectures, improving resilience, security and cost efficiency across live environments.
* Support the migration of workloads from managed VPS environments onto Jagex’s cloud platform, helping teams modernise safely without compromising uptime.
* Define, embed and improve SLIs, SLOs and error-budget thinking so service reliability is measurable and better understood across teams.
* Design and enhance observability and alerting across logs, metrics and traces, giving teams faster insight into issues and reducing time to detection.
* Automate operational tasks such as scaling, failover and deployments, while building self-healing mechanisms that reduce toil and improve recovery.
* Contribute hands-on reliability improvements across Linux-based production systems, reusable IaC modules and team codebases, while helping raise engineering standards across Cloud Tech.

**What we’re looking for**

* Proven experience owning reliability for large-scale, internet-facing services in production.
* Demonstrable AWS expertise across services such as VPC, EC2, ECS/EKS, ELB, ECR, Route53, KMS, IAM and Systems