Coinbase
Staff Site Reliability Engineer, Core AI Infrastructure
Neural analysis suggests this role is optimal for Senior candidates.
“Staff Site Reliability Engineer, Core AI Infrastructure at Coinbase. Skills: Site Reliability Engineering, AI Infrastructure, Cloud Infrastructure, Automation. Own reliability lifecycle. Own monitoring lifecycle”
What You'll Achieve.
Improve workflow efficiency; Improve cost; Improve quality
Industry & Context.
Root cause analysis; Troubleshooting
On-call support
What They're Looking For.
Must Have
8+ years automating cloud infrastructure, 8+ years supporting cloud infrastructure, 8+ years supporting network environments, Hands-on use of infrastructure-as-code tools, Deploying containerized workloads, Managing containerized workloads, Troubleshooting containerized workloads, Proficiency in at least one scripting language, Proficiency in at least one programming language, Version control workflows using Git-based CI/CD pipelines, Leading incident response, Root cause analysis, Blameless retros, Utilizes generative AI responsibly, Maintaining human oversight
Nice to Have
Expertise with Linux, Expertise with Bash, Expertise with Ruby, Expertise with Python, Expertise with Go, Automating EC2 deployment, Automating container deployment, Terraform, Network security fundamentals, Experience managing log aggregation, Experience leveraging log aggregation, Experience working in a highly regulated environment, Experience in a fast-paced company, Experience in a high-growth company, Experience in a remote-first IT environment
What You'll Do.
Own reliability lifecycle
Own monitoring lifecycle
Own incident response lifecycle
On-call support for AWS deployment pipelines
Perform root cause analysis
Conduct blameless retros
Streamline operational IT workflows
Eliminate manual tasks
Improve deployment velocity
Extend CI/CD frameworks
Support enterprise network platforms
Integrate surveillance tooling
Strengthen observability standards
Strengthen documentation standards
Implement monitoring solutions
Maintain technical documentation
Develop full-stack applications
Power internal AI products
How You'll Work.
Team & Collaboration
Partner with Infrastructure team; Partner with Security; Partner with Compliance
Communication Scope
Technical documentation
Process & Methodology
CI/CD frameworks
Full Job Description
Ready to do the most impactful work of your career? At Coinbase, we are uncompromising on our mission to increase economic freedom. The bar is high, the environment is intense, and we like it that way. This isn't a place for complacency; it’s a place to be pushed past your perceived limits. If you're ready to build the future of finance alongside people who refuse to settle for "good enough," you belong here.
Coinbase is a remote-first, but not remote-only, company. Expect to get together quarterly for intense in-person working sessions called “surges.” Learn more about working at Coinbase.
You'll join a high-performing team of engineers driving AI transformation at Coinbase as a Staff Site Reliability Engineer on the IT Operations team. This team builds and scales the infrastructure powering Coinbase's AI products, with direct exposure to senior leadership in a fast-paced, incubator-style environment. You'll own the reliability and automation of critical AI infrastructure, ensuring our systems are resilient, observable, and secure at scale.
What you’ll be doing (i.e., job duties):
Own the reliability, monitoring, and incident response lifecycle for AI infrastructure services, including on-call support for AWS deployment pipelines, root cause analysis, and blameless retros.
Build automation and tooling to streamline operational IT workflows, eliminate manual tasks, and improve deployment velocity across CI/CD frameworks and Kubernetes environments.
Partner with the Coinbase Infrastructure team to extend CI/CD frameworks supporting IT services and enterprise network platforms, and with Security and Compliance to integrate surveillance tooling into deployment pipelines.
Strengthen observability and documentation standards across IT engineering by defining metrics, implementing monitoring solutions, and maintaining technical documentation that sets a standard of excellence.
Develop full-stack applications that power internal AI products and infrastructure with Go or Pytho