Bybit
FinTech
SRE Leader
“SRE Leader at Bybit. Skills: Site Reliability Engineering, infrastructure automation, cloud operations, cost optimization. Responsibilities: construct a reliability engineering system and establish an SLO/SLA system.”
What You'll Achieve.
Alarm accuracy > 95%; hourly deployments; toil reduced to < 30%
Industry & Context.
Problem-solvers; Engineering methods
On-call system
What They're Looking For.
Must Have
More than 10 years of experience, More than 5 years of team leadership, Deep understanding of SRE methodology, Large-scale cost management experience, Systematic FinOps experience, Capacity modeling capability, Automated operations and maintenance practice, Successful toil reduction cases, Proficient in IaC tools, Experience in writing systems
Nice to Have
SRE management experience at crypto exchanges, SRE management experience at traditional securities firms, SRE management experience at payment companies, Large-scale Kubernetes cluster experience, High availability architecture experience, Experience building internal cost platforms, Experience building FinOps tools, Practical chaos engineering experience, Infrastructure preparation for compliance audits
What You'll Do.
Construct reliability engineering system
Establish SLO/SLA system
Define reliability indicators
Drive change based on Error Budget
Construct MTTD/MTTR measurement system
Optimize on-call system
Establish automated Runbook execution
Measure on-call quality
Deploy financial cloud isolation
Design network isolation architecture
Manage security groups
Implement Zero Trust Network architecture
Build compliance-site infrastructure
Standardize compliance-site templates
Automate inter-site isolation verification
Abstract cloud operations
Design cross-regional disaster recovery
Guarantee data sovereignty
Safeguard the core wallet/transaction path
Operate hot and cold wallet isolation
Achieve zero-downtime changes for transactions
Perform multi-active/disaster recovery failover
Drive the team's SRE transformation
Establish SRE competency model
Establish knowledge-retention mechanisms
Eliminate single-point personnel risk
Cultivate senior SREs
How You'll Work.
Team & Collaboration
Global team collaboration; Cross-functional teams
Process & Methodology
Capacity Planning, Incident Management
Full Job Description
About Us
Established in 2018, Bybit is one of the world’s leading cryptocurrency exchanges and digital financial platforms, serving over 80 million users across more than 200 countries and regions. Powered by world-class technology and a user-first mindset, Bybit delivers a seamless ecosystem across trading, payments, wealth management, custody, institutional services, and Web3, connecting users to the future of digital finance.
Our core values define how we build. We listen, care and improve to create products and experiences that put users first. Backed by a global team of ambitious builders, problem-solvers, and innovators, we foster a high-performance and fast-moving environment where talent is empowered to drive real impact at global scale.
Supported by 24/7 multilingual customer service and a strong commitment to innovation, we are shaping the future of finance through technology, collaboration, and bold execution.
Today, Bybit is recognized as one of the most trusted and transparent platforms in the digital asset industry, continuing to expand its global presence while building the infrastructure for the next generation of financial services.
Core responsibilities
Construction of a reliability engineering system
- Establish a company-wide SLO/SLA system: define quantifiable reliability indicators (availability, latency, error rate) for each Line of Business, and drive change rhythm and investment decisions based on Error Budget
- Construct an MTTD/MTTR measurement system, set graded goals and continuously optimize: P-1 target MTTD 60%
- On-call system optimization:
  - Alarm accuracy > 95% (eliminating alarm fatigue)
  - Establish automated Runbook execution capability
  - On-call quality measurement and continuous improvement
Financial cloud isolation and multi-compliance site deployment (key)
- Financial-grade network isolation architecture design and operations: design and implementation of network isolation strategies for multiple accounts, multiple VPCs,
How to Apply on Greenhouse
- Create a Greenhouse profile before applying — it saves time across multiple applications.
- Upload your resume as a PDF; the parser handles it better than Word.
- Answer all knockout questions carefully — wrong answers auto-reject before a human sees you.
- Enable email notifications to track application status in real time.