NVIDIA

Senior Staff Software Engineer - AI Agent Platform

$200k–391k · Santa Clara, California, United States · Full Time · Remote Friendly
Market Sentiment
HIGH DEMAND

Neural analysis suggests this role is optimal for Senior candidates.

The Brief

“Senior Staff Software Engineer - AI Agent Platform at NVIDIA. Skills: AI agent platform infrastructure, distributed systems, Kubernetes, CI/CD, Python, Kafka, Redis, Auth/Identity Management, Secrets Management. Design, build, and scale the infrastructure powering NVIDIA’s AI agent ecosystem. Work at the intersection of distributed systems, developer platforms, and agentic AI.”

What You'll Achieve.

Enable teams across the company to develop, deploy, orchestrate, and operate autonomous AI agents at production scale.

What They're Looking For.

Must Have

  • Bachelor's or Master's degree in Computer Science, Engineering, or related field (or equivalent experience)
  • 12+ years in software engineering
  • Experience building and scaling AI agents in production using frameworks like Claude Code, Codex, or LangGraph
  • Deep Kubernetes expertise including pod orchestration, persistent storage, RBAC, and multi-cluster management
  • Python skills with production API experience using FastAPI, Flask, or similar async frameworks
  • Proven track record designing distributed systems with Kafka, Redis, and MongoDB or PostgreSQL
  • Expertise building and managing robust CI/CD pipelines using GitLab CI and ArgoCD for continuous delivery to Kubernetes
  • Experience designing AI data platform components (ingestion pipelines, vector stores, retrieval APIs, data preprocessing workflows) and building developer-facing platform APIs consumed by multiple engineering teams
  • Solid grasp of auth and identity: OAuth 2.0, JWT, token exchange, and secrets management with Vault
  • History of leading sophisticated technical projects such as migrations or greenfield platform builds

Nice to Have

  • Experience building or operating AI agent platforms or agentic workflow systems, with hands-on expertise in agent protocols and frameworks like MCP, A2A, LangChain, or LangGraph
  • Hands-on experience with RAG architectures, embedding pipelines, and vector databases (Milvus, Pinecone, or Weaviate)
  • Full-stack skills with React or Vue for building developer portals and dashboards
  • Contributions to open-source infrastructure or platform tooling

What You'll Do.

  • Design, build, and scale the infrastructure powering NVIDIA’s AI agent ecosystem
  • Work at the intersection of distributed systems, developer platforms, and agentic AI
  • Build foundational services that enable teams across the company to develop, deploy, orchestrate, and operate autonomous AI agents at production scale
  • Build and develop platform services that own the full agent lifecycle from registration through deployment, execution, and teardown
  • Architect Kubernetes-based execution environments with pod lifecycle management, namespace isolation, persistent storage, and identity propagation
  • Develop and maintain automated CI/CD pipelines using GitLab CI and ArgoCD, including reusable pipeline templates and deployment blueprints that standardize how agents are built across teams
  • Build framework-agnostic infrastructure supporting multiple agent SDKs (Claude Code, OpenAI Codex, LangGraph), with hands-on experience using harnesses, lifecycle hooks, skills configurability, observability (OTEL), and memory services
  • Build and operate Kafka-based message pipelines and real-time event streaming using Redis PubSub and SSE
  • Develop data ingestion pipelines, access interfaces, and storage layers that power AI agent knowledge and context
  • Implement session management for state persistence, conversation history, and agent recovery across sessions
  • Develop multi-layer auth using OAuth 2.0, JWT validation, token exchange, and gateway integration, and manage secrets lifecycle with Vault (provisioning, rotation, container injection)
  • Partner with security teams on compliance, access controls, and approval workflows for agent operations

How You'll Work.

Team & Collaboration

Enable teams across the company to develop, deploy, orchestrate, and operate autonomous AI agents at production scale; partner with security teams on compliance, access controls, and approval workflows for agent operations; drive alignment across teams.

Communication Scope

Write clear design documents.

Process & Methodology

Lead sophisticated technical projects such as migrations or greenfield platform builds.

Full Job Description

We are looking for a Sr. Engineer to design, build, and scale the infrastructure powering NVIDIA’s AI agent ecosystem. You will work at the intersection of distributed systems, developer platforms, and agentic AI, building the foundational services that enable teams across the company to develop, deploy, orchestrate, and operate autonomous AI agents at production scale.

What you will be doing:

  • Build and develop platform services that own the full agent lifecycle from registration through deployment, execution, and teardown
  • Architect Kubernetes-based execution environments with pod lifecycle management, namespace isolation, persistent storage, and identity propagation
  • Develop and maintain automated CI/CD pipelines using GitLab CI and ArgoCD, including reusable pipeline templates and deployment blueprints that standardize how agents are built across teams
  • Build framework-agnostic infrastructure supporting multiple agent SDKs (Claude Code, OpenAI Codex, LangGraph), with hands-on experience using harnesses, lifecycle hooks, skills configurability, observability (OTEL), and memory services
  • Build and operate Kafka-based message pipelines and real-time event streaming using Redis PubSub and SSE
  • Develop data ingestion pipelines, access interfaces, and storage layers that power AI agent knowledge and context
  • Implement session management for state persistence, conversation history, and agent recovery across sessions
  • Develop multi-layer auth using OAuth 2.0, JWT validation, token exchange, and gateway integration, and manage secrets lifecycle with Vault (provisioning, rotation, container injection)
  • Partner with security teams on compliance, access controls, and approval workflows for agent operations

What we need to see:

  • Bachelor's or Master's degree in Computer Science, Engineering, or related field (or equivalent experience), with 12+ years in software engineering, ideally in platform engineering, infrastructure, or developer tools
  • Exp

How to Apply on Workday

  • Workday has a multi-step form — save your progress after every section.
  • "Apply With LinkedIn" can fail or lose data; manual entry is more reliable.
  • Watch for the "Submit for Review" final step — hitting "Save" alone does not submit.
  • Job requisition numbers are useful when following up with HR by email.
