OKX

Financial Services

Staff/Senior Staff Engineer, Kubernetes

S$160–240k (AI estimate) · Singapore, Singapore
Market Sentiment
HIGH DEMAND

Neural analysis suggests this role is optimal for Senior candidates.

The Brief

“Staff/Senior Staff Engineer, Kubernetes at OKX. Skills: Kubernetes operations, cloud governance, infrastructure automation. Manage K8s cluster lifecycle. Ensure 7×24 high availability.”

Industry & Context.

Financial Services
Problems you'll solve

Fault diagnosis; Performance tuning; Incident response; Root cause analysis

Eligibility Requirements

Right to work in Singapore

What They're Looking For.

Must Have

4+ years of experience operating production Kubernetes, Bachelor's degree or above in a computer-related field, Proficient in K8s core principles, Able to resolve complex cluster failures, Proficient in Linux systems, Familiar with mainstream container runtimes, Understand K8s networking, Understand K8s storage, Understand multi-cluster management, Experienced with CI/CD pipelines, Experienced with IaC tools

Nice to Have

Experience in large-scale public cloud environments, Multi-cloud cost optimization experience, Kubernetes security hardening experience, CKA/CKS certification, AI/LLM workload scheduling experience

What You'll Do.

Manage K8s cluster lifecycle

Ensure 7x24 high availability

Support continuous business iteration

Manage Alibaba Cloud configuration changes

Optimize Alibaba Cloud costs

Manage Alibaba Cloud disaster recovery

Achieve unified multi-cloud governance

Lead containerization operational optimization

Lead microservices operational optimization

Optimize Pod scheduling

Optimize resource quotas

Optimize network policies

Optimize image management

Optimize log monitoring

Resolve cluster resource fragmentation

Resolve business adaptation challenges

Resolve network interoperability challenges

Build K8s cluster monitoring

Build K8s cluster alerting

Build K8s cluster logging

Build distributed tracing

Define operations runbooks

Define change processes

Define incident response

Strengthen cluster security controls

Disable high-risk permissions

Harden container runtime environments

Ensure infrastructure data security

Ensure business data security

Develop operations automation scripts

Build automated release

Build automated inspection

Build automated backup

Implement Infrastructure as Code

Lead online incident response

Conduct root cause analysis

Produce post-mortem reports

Optimize cluster architecture

Optimize resource allocation

Optimize monitoring strategy

Track Cloud Native technology

Track public cloud technology

Document operations best practices

Document technical knowledge

Help the team improve multi-cloud K8s capabilities

How You'll Work.

Team & Collaboration

Cross-functional teams

Full Job Description

OKX will be prioritising applicants who have a current right to work in Singapore and do not require OKX's sponsorship of a visa.

Who We Are

At OKX, we believe that the future will be reshaped by crypto, and ultimately contribute to every individual's freedom. OKX is a leading crypto exchange and the developer of OKX Wallet, giving millions access to crypto trading and decentralized crypto applications (dApps). OKX is also a brand trusted by hundreds of large institutions seeking access to crypto markets. We are safe and reliable, backed by our Proof of Reserves.

Across our multiple offices globally, we are united by our core principles: We Before Me, Do the Right Thing, and Get Things Done. These shared values drive our culture, shape our processes, and foster a friendly, rewarding, and diverse environment for every OK-er.

OKX is part of OKG, a group that brings the value of Blockchain to users around the world through our leading products OKX, OKX Wallet, OKLink and more.

What You’ll Be Doing

  • K8s cluster lifecycle management: Own the build, scaling, version upgrades, daily operations, fault diagnosis, and performance tuning of large-scale production Kubernetes clusters; ensure 7×24 high availability and stable operations; support continuous business iteration.
  • Alibaba Cloud: Manage configuration changes, cost optimization, and disaster recovery to achieve unified multi-cloud governance.
  • Cloud-native architecture and optimization: Lead containerization and microservices operational rollout; optimize Pod scheduling, resource quotas, network policies, image management, and log monitoring systems; resolve cluster resource fragmentation, business adaptation, and network interoperability challenges.
  • Stability and security: Build comprehensive K8s cluster monitoring, alerting, logging, and distributed tracing systems; define operations runbooks, change processes, and incident response plans; strengthen cluster security controls, disable high-risk permissions, harden co

How to Apply on Greenhouse

  • Create a Greenhouse profile before applying — it saves time across multiple applications.
  • Upload your resume as a PDF; the parser handles it better than Word.
  • Answer all knockout questions carefully — wrong answers auto-reject before a human sees you.
  • Enable email notifications to track application status in real time.
