bet365

online gambling

Site Reliability Engineer

Stoke-on-Trent, United Kingdom FULL TIME

The Brief

“Site Reliability Engineer at bet365. Skills: Site Reliability Engineering principles, Observability tools and techniques, Programming languages (Python, Golang, JavaScript), Infrastructure as Code (IaC), Automation. Enhance system reliability, observability and performance through an engineering approach. Assist with incident resolution and best practices.”

What You'll Achieve.

  • Ensure systems meet user demands.
  • Enhance overall service performance.
  • Reduce toil through automation.

Industry & Context.

online gambling

Problems you'll solve

Incident resolution; Effective remediation strategies

What They're Looking For.

Must Have

  • Excellent knowledge of Site Reliability Engineering principles, including the creation and management of effective Service Level Indicators (SLIs) and Service Level Objectives (SLOs) for reliability and customer satisfaction.
  • Knowledge of contemporary observability tools, techniques and best practice, including Splunk, New Relic, Grafana and PagerDuty.
  • Excellent knowledge of programming languages including Python, Golang and JavaScript.
  • Knowledge and experience of modern software development techniques and lifecycles.
  • Experience with Infrastructure as Code (IaC) automation and orchestration tools such as Ansible and Terraform.
  • Prior experience working in a large-scale, 24/7 enterprise where system uptime and stability are of paramount importance to the business.
  • Proficiency in shell scripting for automation and system management tasks.

Nice to Have

Keen interest in industry trends, particularly Platform Engineering.

What You'll Do.

Enhance system reliability, observability and performance through an engineering approach.

Assist with incident resolution and best practices.

performance and availability of critical systems.

Implement solutions that enhance reliability, including service instrumentation.

Improve logging practices.

Develop features for maintainability.

Engineer tools and automation for effective service management.

Write and contribute to code that enhances the reliability and observability of services.

Develop and maintain tools that facilitate effective management of systems.

Build sophisticated dashboards using a range of telemetry data and dashboarding technologies.

Maintain and administer existing monitoring and analytic toolsets.

Actively participate in live incident resolution and post-mortem analysis.

Drive initiatives to enhance system reliability and observability.

How You'll Work.

Team & Collaboration

  • Collaborate across multiple functions to integrate reliability and observability best practices into the software development life cycle.
  • Collaborate with the central Site Reliability Engineering and Observability teams to establish and uphold standards.
  • Assist teams in adhering to reliability and observability practices.
  • Work with IT Operations, providing and supporting the use of critical tooling.

How to Apply on SmartRecruiters

  • SmartRecruiters often includes a video screening step — check camera and mic permissions.
  • Link your GitHub or portfolio directly in the profile section for technical roles.
  • Applications may be reviewed by AI scoring before reaching a recruiter — use keywords from the job description.