Bespoke Labs
Applied AI Research
Infrastructure Engineer
Neural analysis suggests this role is optimal for Mid+ candidates.
“Infrastructure Engineer at Bespoke Labs. Skills: Environment Execution, Performance & Scale, Environment Platform. Own sandboxing and execution layer. Build systems to snapshot and restore state”
What You'll Achieve.
make agents reliable; environments have to stay coherent; systems that run reliably in production; run far more environment rollouts per dollar
Industry & Context.
hard systems problem; systematic approach
What They're Looking For.
Must Have
Track record building production systems or research infrastructure at scale: distributed systems, execution engines, container/sandboxing infrastructure
Deep comfort with the systems layer: containers and isolation, filesystems, process and state management
Experience making systems fast and cheap: profiling, scheduling, resource utilization, cost optimization at scale
Cloud platforms (GCP, AWS) and distributed computing
Engineering fundamentals and a systematic approach to testing, validation, and reliability
Comfort operating in ambiguity
Excellent communication skills
Ability to translate between research needs and infrastructure requirements
Comfortable presenting technical work
Nice to Have
Python, plus comfort in a systems language (Rust, Go, or C++)
Experience with RL training or evaluation infrastructure
Experience with checkpoint/snapshot-restore systems, CRIU, distributed state management
Background in high-throughput, low-latency execution systems
Contributions to widely-used infrastructure, datasets, benchmarks, or open-source systems
Previous experience in a research engineering or infrastructure role at an AI or systems-heavy company
What You'll Do.
Own sandboxing and execution layer
Build systems to snapshot and restore state
Develop machinery to detect failure modes
Extend execution to long-horizon environments
Own platform performance characteristics
Drive utilization and scheduling
Profile and remove bottlenecks
Build and maintain framework for environments
Create tooling for debugging
Scale prototypes into production systems
Write documentation and tools
How You'll Work.
Team & Collaboration
work closely with research and data teams; work directly with frontier labs and enterprise customers; translate between research needs and infrastructure requirements; present technical work to diverse audiences
Communication Scope
Excellent communication skills; Ability to translate between research needs and infrastructure requirements; Comfortable presenting technical work to diverse audiences
Full Job Description
About Bespoke Labs
Bespoke Labs is an applied AI research lab pioneering data and RL environment curation for training and evaluating agents. Recently, we curated Open Thoughts, one of the best open reasoning datasets used by multiple frontier labs, trained SOTA specialized models such as Bespoke-MiniChart-7B and Bespoke-MiniCheck, and built the environment infrastructure that frontier labs and enterprises use to make their agents reliable. Bespoke is uniquely positioned to capture a large share of data and RL environment curation.
About the Role
We're looking for an Infrastructure Engineer to own the execution layer beneath our RL environments: the systems that let an agent operate inside a realistic, multi-tool world coherently for hours or days.
This is a hard systems problem disguised as an AI job. As the tasks agents can complete keep lengthening, the environments that train them have to stay coherent across far longer horizons than anything that exists today. That means sandboxing and isolation you can trust, execution that's fast and cheap enough to run at training scale, and the ability to snapshot, restore, inspect, and branch a running environment instead of treating every rollout as one-shot. You'll build the platform that makes all of this possible.
You'll work closely with our research and data teams, and directly with frontier labs and enterprise customers, to turn environment designs into infrastructure that runs reliably in production.
What You'll Do
1. Environment Execution & Sandboxing:
- Design and own the sandboxing and execution layer that environments run inside. Build systems to snapshot and restore environment state (disk, process, and where relevant memory and accelerator state) so runs can be paused, resumed, inspected, and branched rather than executed once.
- Develop the machinery to detect failure modes early in a rollout (reward hacks, infra faults, fairness issues) and to revert to a known-good state, patch, and continue.
- Extend exec
How to Apply on Ashby
- Ashby is a fast modern ATS — most applications take under 3 minutes.
- The resume parser is strong; verify parsed experience dates and job titles.
- Custom screening questions are often scored algorithmically — answer completely.
- Location field affects geo-based screening; use your actual metro area.