Company
Technology
Staff Software Engineer, Infrastructure
“Staff Software Engineer, Infrastructure. Skills: Infrastructure platforms, Kubernetes, Cloud infrastructure, Observability. Lead the design and evolution of internal infrastructure platforms.”
What You'll Achieve.
Reduce operational dependency on manual intervention
Industry & Context.
Technical challenges
On-call rotations
What They're Looking For.
Must Have
8+ years software engineering experience, Go programming experience, System design experience, Testing experience, Debugging experience, Long-term maintainability focus, Cloud infrastructure experience (build, scale, operate), Kubernetes expertise, Cloud platforms expertise, Networking expertise, Reliability engineering expertise, Developer platforms expertise, Linux systems understanding, Networking fundamentals understanding, Production operations understanding, Lead technical direction, Drive cross-team alignment, RFCs experience, Architecture reviews experience, Design documentation experience, Terraform familiarity, CI/CD pipelines familiarity, GitOps (Argo CD) familiarity, Observability stacks familiarity, Prometheus familiarity, OpenTelemetry familiarity, Grafana familiarity, Written communication skills, Verbal communication skills
Nice to Have
EKS experience, Service mesh experience, Ingress experience, Progressive delivery experience, Large-scale platform migrations experience, Large-scale platform adoption initiatives experience
What You'll Do.
Lead design of internal infrastructure platforms
Evolve internal infrastructure platforms
Turn technical challenges into architectural solutions
Drive solutions through RFCs
Drive solutions through cross-team alignment
Build self-service platform capabilities
Build self-service APIs
Provisioning workflows
Observability workflows
Operational workflows
Create documentation for platform capabilities
Focus on adoption of platform capabilities
Define delivery standards
Implement delivery standards
Ensure safe deployments
Ensure repeatable deployments
Architect multi-tenant Kubernetes infrastructure
Improve multi-tenant Kubernetes infrastructure
Manage Kubernetes networking
Manage Kubernetes ingress
Manage Kubernetes traffic routing
Manage multi-region connectivity
Manage cross-account connectivity
Enhance platform reliability
Improve incident response processes
Drive adoption of platform systems
Ensure solutions are intuitive
Ensure solutions are safe
Reduce operational dependency on manual intervention
Participate in on-call rotations
Improve operational health
Implement better alerts
Implement long-term reliability engineering practices
How You'll Work.
Team & Collaboration
Cross-team alignment; Engineering teams
Communication Scope
Written communication; Verbal communication
Process & Methodology
RFCs, Architecture reviews, Design documentation
Full Job Description
## Accountabilities
- Lead the design and evolution of internal infrastructure platforms by turning ambiguous technical challenges into scalable architectural solutions and driving them through RFCs and cross-team alignment.
- Build self-service platform capabilities and APIs (primarily in Go) for provisioning, onboarding, deployment, observability, and operational workflows with strong documentation and adoption focus.
- Define and implement delivery standards using Terraform, GitOps (Argo CD), CI/CD pipelines, and progressive delivery strategies to ensure safe and repeatable deployments.
- Architect and improve multi-tenant Kubernetes (EKS) infrastructure, including networking, ingress (Envoy Gateway), traffic routing, and multi-region, cross-account connectivity.
- Enhance platform reliability through improved SLOs, monitoring, alerting, and incident response processes using observability tooling such as Grafana Cloud.
- Drive adoption of platform systems across engineering teams, ensuring solutions are intuitive, safe, and measurably reduce operational dependency on manual intervention.
- Participate in on-call rotations while continuously improving operational health through better alerts, runbooks, and long-term reliability engineering practices.

## Requirements
- 8+ years of hands-on software engineering experience in backend, infrastructure, or platform engineering roles.
- Strong programming experience in Go or similar languages, with a focus on system design, testing, debugging, and long-term maintainability.
- Proven track record of building, scaling, and operating production-grade cloud infrastructure or platform systems.
- Deep expertise in at least one of: Kubernetes, cloud platforms, networking, reliability engineering, or developer platforms.
- Strong understanding of Linux systems, networking fundamentals, and production operations at scale.
- Experience leading technical direction and driving cross-team alignment through RFCs, architecture reviews, and design documentation.

F
How to Apply on Lever
- Lever uses a streamlined one-page form — apply in under 5 minutes.
- LinkedIn import works well; review parsed data before submitting.
- The cover letter field is optional but visible to reviewers — use it to differentiate.
- Referral codes from employees can significantly boost visibility of your application.