Company

Technology

Engineer, Production Engineering

$155–215k (AI estimate) · San Francisco, California, United States · Full Time · Remote Friendly

Market Sentiment
HIGH DEMAND

Neural analysis suggests this role is optimal for Mid+ candidates.

The Brief

“Engineer, Production Engineering. Skills: Production Engineering, Infrastructure Security, Cloud Infrastructure, Kubernetes, Terraform, Compliance. Manage production and staging infrastructure.”

Industry & Context.

Technology

Problems you'll solve

Troubleshooting; Root cause analysis

What They're Looking For.

Must Have

  • 5+ years in Production Engineering
  • 5+ years in Platform Engineering
  • 5+ years in a security-focused infrastructure role
  • Hands-on experience with Kubernetes
  • Hands-on experience with GCP
  • Comfortable with Terraform
  • Programming skills (Python, Go, TypeScript, etc.)
  • Hands-on experience with compliance frameworks (SOC2)
  • Hands-on experience with vulnerability management
  • Hands-on experience with secure system design

Nice to Have

  • Background with multi-tenant SaaS
  • Background with enterprise security requirements
  • Background with enterprise procurement requirements
  • Exposure to AI/ML infrastructure
  • Experience building security-sensitive product features
  • Experience supporting pentests
  • Experience supporting bug bounties
  • Experience deploying in customer VPCs
  • Experience operating in customer VPCs
  • Experience navigating enterprise networking constraints
  • Experience navigating enterprise security constraints
  • Experience navigating enterprise access constraints

What You'll Do.

Manage production infrastructure

Manage staging infrastructure

Evolve production infrastructure

Evolve staging infrastructure

Own environment configuration

Deploy within customer VPCs

Operate within customer VPCs

Adapt to infrastructure constraints

Adapt to security requirements

Adapt to enterprise networking

Build agent sandboxing

Maintain agent sandboxing

Ensure agents operate within boundaries

Route agents through API gateway

Own observability stack

Integrate observability tools

Provide system performance visibility

Provide agent runtime visibility

Lead infrastructure work for SOC2

Lead operational work for SOC2

Collect audit evidence

Implement controls for SOC2

Manage HackerOne engagement

Coordinate penetration tests

Triage bug bounty reports

Drive remediation of vulnerabilities

Audit application code for vulnerabilities

Contribute security-sensitive product features

Ensure product security coherence

Ensure infrastructure security coherence

Own device management

Manage access controls

Design CI/CD workflows

Maintain CI/CD workflows

Support canary deployments

Support blue-green deployments

Automate shipping to production

How You'll Work.

Team & Collaboration

Cross-functional teams. Work across Kubernetes infrastructure, cloud delivery, agent sandboxing, SOC2 compliance, IT systems, and production observability; contribute to the product.

Full Job Description

ENGINEER, PRODUCTION ENGINEERING

Location: San Francisco Bay Area (Hybrid/Onsite)
Type: Full-time
Stage: Early-stage startup

ABOUT THE ROLE

We are building the control plane for AI agents in teams and companies. As a Production Engineer, you will own the infrastructure, security, and compliance systems that allow our platform to ship fast and run reliably at scale. This is not a traditional ops role: you will write real code, contribute directly to the product, and own the full security and compliance surface of an early-stage company.

You'll work across Kubernetes infrastructure, cloud delivery, agent sandboxing, SOC2 compliance, IT systems, and production observability, and you'll contribute to the product itself, building security-sensitive features and auditing application code for vulnerabilities.

If you want to own the production backbone for the agent-native era, from a Terraform module to a pentest to an API key implementation, we want to talk.

WHAT YOU'LL OWN

1. Cloud & Kubernetes Infrastructure

- Our Stack: Manage and evolve our production and staging infrastructure on GCP (GKE) using Terraform. Own DNS, networking, and environment configuration end-to-end.
- Customer Environments: Deploy and operate within customer VPCs across AWS, Azure, and GCP, adapting to varied infrastructure constraints, security requirements, and enterprise networking configurations.
- Agent Sandboxing: Build and maintain Kubernetes-based sandboxing for agent execution, ensuring agents operate within strict network boundaries and route through our API gateway rather than having unfettered internet access.
- Observability: Own our observability stack, including OpenTelemetry instrumentation and integrations with New Relic and Splunk, to give the team deep visibility into system performance and agent runtime behavior.

2. Security, Compliance & IT

- SOC2 & Audits: Lead infrastructure and operational work to support SOC2 compliance, including audit preparation, evidence col

How to Apply on Ashby

  • Ashby is a fast modern ATS — most applications take under 3 minutes.
  • The resume parser is strong; verify parsed experience dates and job titles.
  • Custom screening questions are often scored algorithmically — answer completely.
  • Location field affects geo-based screening; use your actual metro area.
