OKX
Financial Services
Staff/Senior Staff Engineer, Kubernetes
“Staff/Senior Staff Engineer, Kubernetes at OKX. Skills: Kubernetes, Cloud-native architecture, DevOps, Alibaba Cloud, AWS. Manage K8s cluster lifecycle. Ensure 7x24 high availability”
Industry & Context.
Fault diagnosis; Performance tuning; Incident response; Root cause analysis; Troubleshooting
Right to work in Singapore
What They're Looking For.
Must Have
4+ years experience operating production Kubernetes, Bachelor's degree or above in computer-related field, Proficient in K8s core principles, Able to resolve complex cluster failures, Proficient in Linux system, Familiar with mainstream container runtimes, Understand K8s networking, Understand K8s storage, Understand multi-cluster management, Experienced with CI/CD pipelines, Experienced with IaC tools
Nice to Have
Experience in large-scale public cloud environments, Multi-cloud cost optimization experience, Kubernetes security hardening experience, CKA/CKS certification, Experience with AI/LLM workload scheduling
What You'll Do.
Manage K8s cluster lifecycle
Ensure 7x24 high availability
Support continuous business iteration
Manage Alibaba Cloud configuration changes
Optimize Alibaba Cloud costs
Manage Alibaba Cloud disaster recovery
Lead containerization and microservices operations
Optimize Pod scheduling
Optimize resource quotas
Optimize network policies
Optimize image management
Optimize log monitoring
Resolve cluster resource fragmentation
Resolve business adaptation challenges
Resolve network interoperability challenges
Build K8s cluster monitoring
Build K8s cluster alerting
Build K8s cluster logging
Build distributed tracing
Define operations runbooks
Define change processes
Define incident response
Strengthen cluster security controls
Disable high-risk permissions
Harden container runtime environments
Ensure infrastructure data security
Ensure business data security
Develop operations automation scripts
Build automated release
Build automated inspection
Build automated backup
Implement Infrastructure as Code
Lead online incident response
Conduct root cause analysis
Produce post-mortem reports
Optimize cluster architecture
Optimize resource allocation
Optimize monitoring strategy
Assure long-term stability
Track Cloud Native technology
Track public cloud technology
Document operations best practices
Document technical knowledge
Assist the team in improving multi-cloud K8s operations
How You'll Work.
Team & Collaboration
Cross-functional teams; Technical knowledge sharing
Process & Methodology
CI/CD pipelines, IaC principles
Full Job Description
OKX will be prioritising applicants who have a current right to work in Singapore and do not require OKX's sponsorship of a visa.
Who We Are
At OKX, we believe that the future will be reshaped by crypto, and ultimately contribute to every individual's freedom. OKX is a leading crypto exchange, and the developer of OKX Wallet, giving millions access to crypto trading and decentralized crypto applications (dApps). OKX is also a brand trusted by hundreds of large institutions seeking access to crypto markets. We are safe and reliable, backed by our Proof of Reserves.
Across our multiple offices globally, we are united by our core principles: We Before Me, Do the Right Thing, and Get Things Done. These shared values drive our culture, shape our processes, and foster a friendly, rewarding, and diverse environment for every OK-er.
OKX is part of OKG, a group that brings the value of Blockchain to users around the world, through our leading products OKX, OKX Wallet, OKLink and more.
What You'll Be Doing
- K8s cluster lifecycle management: Own the build, scaling, version upgrades, daily operations, fault diagnosis, and performance tuning of large-scale production Kubernetes clusters; ensure 7×24 high availability and stable operations; support continuous business iteration.
- Alibaba Cloud: Manage configuration changes, cost optimization, and disaster recovery to achieve unified multi-cloud governance.
- Cloud-native architecture and optimization: Lead containerization and microservices operational rollout; optimize Pod scheduling, resource quotas, network policies, image management, and log monitoring systems; resolve cluster resource fragmentation, business adaptation, and network interoperability challenges.
- Stability and security: Build comprehensive K8s cluster monitoring, alerting, logging, and distributed tracing systems; define operations runbooks, change processes, and incident response plans; strengthen cluster security controls, disable high-risk permissions, harden co
How to Apply on Greenhouse
- Create a Greenhouse profile before applying — it saves time across multiple applications.
- Upload your resume as a PDF; the parser handles it better than Word.
- Answer all knockout questions carefully — wrong answers auto-reject before a human sees you.
- Enable email notifications to track application status in real time.