Universal DX

Healthcare

Site Reliability Engineer

$150–210k (AI estimate) · United States · Remote Friendly
Market Sentiment
HIGH DEMAND

Neural analysis suggests this role is
optimal for Senior candidates.

The Brief

“Site Reliability Engineer at Universal DX. Skills: Kubernetes, EKS, AWS, Observability. Own reliability, performance, and uptime. Define and monitor SLIs/SLOs”

Industry & Context.

Healthcare
Problems you'll solve

Troubleshooting; Root cause analysis

What They're Looking For.

Must Have

Bachelor's degree in Computer Science, 5+ years of experience in SRE, Proven experience with Kubernetes, Skills in administering AWS, Expertise with observability tools, Proficiency with Infrastructure as Code, Programming or scripting ability

Nice to Have

Experience leading SRE practices, Familiarity with CI/CD systems, AWS Certified Solutions Architect

What You'll Do.

Define and monitor SLIs/SLOs

Lead incident response

Perform root cause analysis

Implement long-term fixes

Enhance observability

Automate operational tasks

Plan infrastructure changes

Execute infrastructure changes

Implement scaling strategies

Implement safer rollout processes

Collaborate with Dev teams

Collaborate with Cloud teams

Collaborate with Security teams

Improve operational practices

Improve platform maturity

How You'll Work.

Team & Collaboration

Dev teams; Cloud teams; Security teams

Communication Scope

Communication skills

Process & Methodology

GitOps workflows

Full Job Description

About our Company: Universal DX, Inc. is an international company with a highly experienced team focused on cracking cancer’s code. Through our multi-omics and bioinformatics models, we have figured out how to read the disease’s signals in blood with high accuracy to detect cancer in its earliest stages. Starting with a colorectal cancer screening liquid biopsy test, we are building a multi-cancer platform that can identify the unique DNA regions associated with different types of cancers.

The Opportunity: Universal DX is seeking an experienced Site Reliability Engineer to join our growing team. You will be a key technical leader responsible for the reliability, scalability, and operational excellence of our production platforms, with a strong focus on EKS/Kubernetes and cloud infrastructure. You will be part of a team that is passionate about developing novel diagnostic tests for the early detection of cancers. You will join a company that aims to be more than one of the leaders in the industry. We want to have a huge positive impact on society by achieving the ambitious purpose of “making cancer a curable disease by detecting it earlier”.

How you’ll contribute:

Own the reliability, performance, and uptime of Kubernetes (EKS) clusters and shared platform services.
Define and monitor service-level indicators and objectives (SLIs/SLOs) that drive reliability decisions.
Lead incident response and root cause analysis, and implement long-term fixes.
Build and enhance observability (monitoring, logging, tracing) to surface issues before they impact customers.
Automate operational tasks and reduce toil using software engineering principles.
Plan and execute safe infrastructure changes, including cluster upgrades, scaling strategies, and safer rollout processes.
Collaborate with Dev, Cloud, and Security teams to improve operational practices and platform maturity.
Mentor and coach less senior engineers on reliability engineering best pract
