Arbor Education

Education Technology

Site Reliability Engineer

£60–70k · Malaysia · FULL TIME · Remote Friendly

Market Sentiment

HIGH DEMAND

Neural analysis suggests this role is optimal for Mid+ candidates.

The Brief

“Site Reliability Engineer at Arbor Education. Skills: Site Reliability Engineering, Performance monitoring and analysis, Capacity planning, Infrastructure as Code, Observability. Proactively monitor and analyse platform performance. Ensure world-class resilience and performance across the platform”

What You'll Achieve.

Ensure world-class resilience and performance across the platform; Ensure rapid resolution of incidents and minimise downtime

Industry & Context.

Education Technology

Problems you'll solve

Troubleshooting of incidents; Identify root cause and corrective actions

Eligibility Requirements

Unable to provide visa sponsorship at this time

What They're Looking For.

Must Have

Experience in performance monitoring and analysis; Capacity planning experience; Scripting and automation skills, with experience in relevant technologies; Experience with Infrastructure as Code, in particular Terraform; Understanding of relational database technologies and their cloud versions (e.g. AWS Aurora); Experience with messaging and distributed asynchronous workloads; Experience with nginx or similar technologies; Familiarity with SRE processes; Awareness of DevOps principles such as the Three Ways and the Five Ideals

Nice to Have

Experience with other database technologies and cloud platforms; Past experience with Enterprise solutions running at scale; Familiarity with Kanban and Agile development processes; Experience with containerisation, for example Docker; Familiarity with software best practices such as Refactoring, Clean Code, Domain-Driven Design and Test-Driven Development

What You'll Do.

Proactively monitor and analyse platform performance

Ensure world-class resilience and performance across the platform

Advise on all aspects of site reliability including availability, scalability, observability and capacity planning

Continually improve observability through monitoring and alerting

Ensure the service is highly available and resilient

Champion best practices in design for high availability

Devise runbooks and run game sessions to test our DR plan

Conduct assessments of capacity and plan for scaling to meet current and future business needs

Strategize and implement scalable solutions

Ensure a good level of service is provided for our customers and embed SRE practices

Key player in the response and troubleshooting of incidents, ensuring rapid resolution and minimising downtime

Participate in blameless postmortems to identify root cause and corrective actions

Develop and maintain playbooks and documentation

How You'll Work.

Team & Collaboration

Collaborate with engineering teams to address performance bottlenecks and ensure scalability; Assist engineering teams with implementing and reviewing SLOs; Work with other teams to ensure monitoring and alerting is effective and provides full coverage; Work closely with the Head of Platform Engineering and Head of SRE to strategize and implement scalable solutions; Work closely with the Platform team, feature teams, 2nd line support and other stakeholders to ensure a good level of service is provided for our customers and embed SRE practices

Full Job Description

**Location:** Remote

**Salary:** £60,000 - £70,000

### About us

At Arbor, we’re on a mission to transform the way schools work for the better. We believe in a future of work in schools where being challenged doesn’t mean being burnt out and overworked. Where data guides progress without overwhelming staff. And where everyone working in a school is reminded why they got into education every day.

Our MIS and school management tools are already making a difference in over 7,000 schools and trusts. Giving time and power back to staff, turning data into clear, actionable insights, and supporting happier working days.

At the heart of our brand is a recognition that the challenges schools face today aren’t just about efficiency, outputs and productivity - but about creating happier working lives for the people who drive education every day: the staff. We want to make schools more joyful places to work, as well as learn.

### About the role

We are looking for an enthusiastic and proactive Site Reliability Engineer to join our SRE team and help us ensure we provide world-class resilience and performance across the platform. The remit and focus of the role is to advise on all aspects of site reliability including availability, scalability, observability and capacity planning. It’s a broad and exciting role, so we’re looking for someone up for a challenge - if you’re an energetic and collaborative Site Reliability Engineer, this is the role for you.

**Core responsibilities**

* Proactively monitor and analyse platform performance.
* Collaborate with engineering teams to address performance bottlenecks and ensure scalability.
* Assist engineering teams with implementing and reviewing SLOs.
* Continually improve observability through monitoring, alerting and dashboards, using tools such as DataDog or Prometheus.
* Work with other teams to ensure it is effective and provides full coverage.
* Ensure the service is highly available and resilient.
* Champion best pract


SKILL SIGNAL 25 detected · ranked by frequency

Performance monitoring and analysis ×5

Capacity planning ×5

Infrastructure as Code ×5

Site Reliability Engineering ×3

Scripting ×3

Automation ×3

Relational database technologies ×3

Messaging ×3

Distributed asynchronous workloads ×3

Containerisation ×3

Observability ×2

Terraform ×2

nginx ×2

Docker ×2

DataDog ×2

Prometheus ×2

AWS Aurora

DevOps principles

Kanban

Agile development processes

Software best practices

Refactoring

Clean Code

Domain-Driven Design

Test-Driven Development

BEHAVIOURAL

Enthusiastic, Proactive, Collaborative, Energetic

Role Details

Seniority mid

Work Mode Remote

Type FULL TIME

Salary Band 50k-75k

AI-Extracted Insights

Domain Areas

school-management-tools, education-sector
