Amazon.com Services LLC

Technology

Software Development Engineer (Elastic Kubernetes Service), EKS Scalability & Performance

$144–194k · Seattle, Washington, United States · FULL TIME
Market Sentiment
HIGH DEMAND

Neural analysis suggests this role is optimal for Mid+ candidates.

The Brief

“Software Development Engineer (Elastic Kubernetes Service), EKS Scalability & Performance at Amazon.com Services LLC. Skills: Kubernetes, distributed systems, autoscaling. Build and operate the Vertical Auto-Scaling Service.”

What You'll Achieve.

Customer outcomes measured; EKS uptime commitments enforced; Degradation detected before customers notice

Industry & Context.

Technology
Problems you'll solve

Root cause analysis

What They're Looking For.

Must Have

3+ years professional software development, 2+ years system design or architecture, 1+ years software development engineer experience, 1+ years designing distributed software applications, 1+ years Object Oriented Design experience, Bachelor's degree or foreign equivalent

Nice to Have

3+ years full software development life cycle experience, Bachelor's degree in computer science

What You'll Do.

Build Vertical Auto-Scaling Service

Operate Vertical Auto-Scaling Service

Build the service's next-generation successor (VAS 2.0)

Operate the service's next-generation successor (VAS 2.0)

Work on SLA measurement pipeline

Investigate SLA-breaching clusters weekly

Build automation to detect degradation

Build automation to mitigate degradation

Contribute to control plane architecture

Define API server scaling

Define component scaling

Engage with upstream Kubernetes community

Drive KEPs (Kubernetes Enhancement Proposals) for performance

Drive KEPs for resiliency

Work on workload identity systems

Work on Cluster Access Management

Work on EC2 capacity management

Work on grey failure detection

Work on Large-Scale Event response

Work on weight shifting

How You'll Work.

Team & Collaboration

Upstream community engagement

Full Job Description

We are looking for a Software Development Engineer to join the EKS KCP Scalability team and work on some of the hardest distributed systems problems at Amazon. You will design, build, and operate systems that directly determine whether EKS customers, from startups to the largest AI/ML workloads on the planet, experience a reliable, performant control plane.

This is not a role where you implement features in isolation. You will work across the full stack: from the Kubernetes API server process and upstream community engagement, through autoscaling services that right-size control planes in real time, to the SLA measurement pipelines that hold us accountable to our customers. You will own systems end-to-end, from design through production operations, and your work will be measured by customer outcomes, not lines of code.

Key job responsibilities

You will build and operate the Vertical Auto-Scaling Service (VAS) and its next-generation successor (VAS 2.0), which dynamically right-sizes EKS control planes by evaluating CPU/memory utilization, etcd throttle rates, node-count thresholds, and network utilization simultaneously.

You will work on the SLA measurement pipeline (MinutelySLA → DailySLA → MonthlySLA) that enforces EKS's uptime commitments, investigating breaching clusters weekly and building automation to detect and mitigate degradation before customers notice.

You will contribute to the control plane architecture for EKS Ultraclusters, defining how the API server, etcd, and associated components scale to support 100,000-node clusters running generative AI workloads.

You will maintain and extend version release qualification scale tests that gate every new Kubernetes version before it reaches customers.

You will engage with the upstream Kubernetes community, driving KEPs that work backwards from EKS customer requirements around performance, scale, and resiliency.

Depending on your interests and the team's priorities, you may also work on workload identity sy
