General Intuition

Infra Engineer - API

$250–400k · New York City, New York, United States; Stockholm, Sweden; London, United Kingdom; Paris, France; Geneva, Switzerland · FULL TIME

Market Sentiment

HIGH DEMAND

Neural analysis suggests this role is optimal for Mid+ candidates.

The Brief

“Infra Engineer - API at General Intuition. Skills: API, Kubernetes, GPU, Streaming. Own the video streaming protocol. Own the runtime layer of our API”

What You'll Achieve.

Turn models into a production API

Deliver a low-latency, highly available, billing-grade reliable API

Scale from hundreds to tens of thousands of concurrent users

Scale the GPU fleet without breaking the bank or the latency budget

Own the inference-performance backlog

What They're Looking For.

Must Have

A track record of personally scaling a high-traffic, low-latency API in production

Deep k8s experience, including multi-region deployments

Comfort with SLOs and capacity planning

Ownership instinct

Nice to Have

Experience deploying streaming video or audio inference models

Experience with low-latency game streaming or video streaming infra

Experience scaling GPU fleets across providers

Experience with frontier model inference

Experience with on-device / edge inference

What You'll Do.

Own the video streaming protocol

Own the runtime layer of our API

Scale our k8s footprint across regions

Own the GPU hosting strategy

Drive latency and throughput

Partner with product engineering

How You'll Work.

Team & Collaboration

Work directly with the founding team; Partner with product engineering

Full Job Description

ABOUT GENERAL INTUITION

We are the frontier research lab dedicated to building foundation models for environments that require deep spatial and temporal reasoning. For the past year, we've been pushing the forefront of AI across agents capable of navigating space and time, world models that provide training environments for those agents, and video understanding models with a focus on transfer to the real world. We raised a seed round of $133M from General Catalyst and Khosla to discover the next generation of intelligence.

THE ROLE

We're hiring an Infra Engineer to own General Intuition's API. Our research team builds frontier models: agents that reason about space and time, world models, video understanding. Your job is to turn those models into a production API that developers love: low-latency, highly available, billing-grade reliable, and able to scale from our first hundred users to tens of thousands of concurrent ones. You'll work directly with the founding team.

You'll own the API end to end: the client libraries developers integrate with, how we receive frames from clients and stream actions back, how requests route to the right GPU, how sessions spin up and tear down, how k8s clusters get stood up in new regions, and how our GPU fleet scales.

This is a true generalist infrastructure role. We are not looking for a pure API person or a pure GPU person; we are looking for someone who is exceptional at both, and who wants to own the entire surface end-to-end.

KEY RESPONSIBILITIES

- Own the video streaming protocol. Orchestrating how we receive frames from clients and route them to servers as efficiently as possible.
- Own the runtime layer of our API. Stateful request routing, GPU session lifecycle, inference orchestration: the whole runtime stack.
- Scale our k8s footprint across regions. Lead new regional deployments.
- Own the GPU hosting strategy. Move us from dozens of GPUs today to potentially thousands (and beyond) without breaking the bank or the lat

SKILL SIGNAL 28 detected · ranked by frequency

Kubernetes ×4

API scaling ×3

Low-latency API ×3

High-traffic API ×3

Multi-region deployments ×3

GPU fleet scaling ×3

Streaming video inference ×3

Audio inference ×3

Game streaming infra ×3

Video streaming infra ×3

Frontier model inference ×3

On-device inference ×3

Edge inference ×3

API ×2

GPU ×2

Streaming ×2

GCP ×2

Coreweave ×2

TypeScript ×2

Python ×2

Rust ×2

ExecuTorch ×2

Core ML ×2

Capacity planning

Developer-facing reliability

Observability

Metering

Billing-grade uptime

Role Details

Work Mode: Onsite

Type: FULL TIME

Category: general-intuition

Salary Band: 200k+

AI-Extracted Insights

Domain Areas

spatial-reasoning, temporal-reasoning, world-models, video-understanding, frontier-models, foundation-models

How to Apply on Ashby

Ashby is a fast modern ATS — most applications take under 3 minutes.
The resume parser is strong; verify parsed experience dates and job titles.
Custom screening questions are often scored algorithmically — answer completely.
Location field affects geo-based screening; use your actual metro area.
