Gimlet
Technology
Infrastructure / Cluster Engineer
Neural analysis suggests this role is optimal for Mid+ candidates.
“Infrastructure / Cluster Engineer at Gimlet. Skills: Cluster infrastructure, Heterogeneous hardware, AI inference. Design clusters. Deploy clusters”
Industry & Context.
Debugging complex issues; Troubleshooting
What They're Looking For.
Must Have
Deep Linux systems experience
Experience operating Kubernetes
Automation skills using tools
Experience with GPU infrastructure
Familiarity with high-performance networking
Operational judgment
Nice to Have
Experience building AI inference
Experience with bare-metal provisioning
Experience with multi-tenant cluster isolation
Experience debugging distributed workload performance
Experience building observability platforms
Familiarity with heterogeneous hardware environments
What You'll Do.
Provision infrastructure
Scale provisioning systems
Operate cluster scheduling
Manage resource allocation
Debug production issues
Build networking infrastructure
Support inference workloads
Evaluate hardware platforms
Integrate accelerators
Integrate datacenter designs
Establish operational standards
Develop incident response
How You'll Work.
Team & Collaboration
Distributed systems teams; Runtime teams; Compiler teams; Hardware teams
Full Job Description
About Us
Gimlet is building the next generation of AI infrastructure: large-scale AI datacenters and the orchestration platform that coordinates them. The future of AI will require vastly more compute than exists today. But as AI workloads become more complex and new hardware architectures emerge, simply deploying more GPUs isn't enough. The challenge is making increasingly diverse compute work together.
Gimlet's platform intelligently partitions and routes workloads across heterogeneous hardware, enabling step-function improvements in performance and efficiency. Customers deploy through production-grade APIs without needing to think about hardware selection, placement, or optimization. We work with foundation labs, hyperscalers, and AI-native companies to power production workloads at massive scale and help define the infrastructure layer for the future of AI.
ABOUT THIS ROLE
We are looking for an Infrastructure / Cluster Engineer to design, build, and operate the cluster infrastructure behind Gimlet’s heterogeneous inference cloud. Unlike traditional cloud platforms built around a single hardware ecosystem, Gimlet's infrastructure spans multiple accelerator vendors and architectures. Infrastructure engineers play a key role in bringing new hardware platforms online, building the operational abstractions that make heterogeneous infrastructure manageable at scale, and ensuring new silicon can serve production workloads reliably from day one.
This role is highly hands-on. You will work across bare metal, Linux, Kubernetes or cluster schedulers, high-speed networking, observability, provisioning, and incident response. You will partner closely with distributed systems, runtime, compiler, and hardware teams to ensure Gimlet’s infrastructure can support demanding AI workloads at production scale.
WHAT YOU WILL WORK ON
- Design, deploy, and operate large-scale CPU, GPU, and accelerator clusters powering production AI inference.
- Build automation for provisioning, config
How to Apply on Ashby
- Ashby is a fast modern ATS — most applications take under 3 minutes.
- The resume parser is strong; verify parsed experience dates and job titles.
- Custom screening questions are often scored algorithmically — answer completely.
- Location field affects geo-based screening; use your actual metro area.