Stellar Development Foundation

Blockchain

Director of Site Reliability Engineering

$275–450k (AI estimate) · San Francisco, California, United States · Full Time · Remote Friendly

Market Sentiment

HIGH DEMAND

Neural analysis suggests this role is optimal for Director candidates.

The Brief

“Director of Site Reliability Engineering at Stellar Development Foundation. Skills: Site Reliability Engineering, Infrastructure Engineering, Platform Engineering, Cloud Infrastructure. Lead SRE team. Set team vision”

What You'll Achieve.

Improve production services; Improve service ownership; Improve service operation; Improve service reliability; Make reliability measurable; Make operational maturity measurable; Make infrastructure health measurable; Make developer productivity measurable; Improve deployment automation; Improve resilience; Improve self-healing patterns; Improve disaster recovery readiness; Mature incident response; Mature escalation; Mature postmortems; Mature on-call health; Reduce toil; Lower cognitive load; Help teams move faster

Industry & Context.

Blockchain

Problems you'll solve

Technical judgment; Pragmatic leadership; Influence through trust; Clarity; Execution; Ownership; Bias toward solving problems; Leverage; Operational intelligence; Root cause analysis

Eligibility Requirements

Geographically distributed team

What They're Looking For.

Must Have

10+ years SRE experience, 5+ years leading engineers, Define team charters, Deep technical judgment, 3+ years AWS experience, 3+ years Kubernetes experience, Experience with observability, Experience with incident response, Experience with postmortems, Experience with on-call practices, Experience improving service ownership, Pragmatic approach to tooling, Operate effectively in small organization, Clear executive communication skills

Nice to Have

Experience leading SRE in lean organization, Experience supporting distributed teams, Experience improving developer productivity, Experience with infrastructure security, Experience in financial services, Experience evaluating vendors, Practical AI-assisted workflows experience

What You'll Do.

Define operating model

Define success measures

Roll out Service Ownership Framework

Improve core infrastructure services

Improve cloud foundations

Improve Kubernetes patterns

Improve observability

Improve secrets management

Improve GitHub workflows

Improve infrastructure automation

Help teams improve ownership

Help teams improve operations

Establish better standards

Establish better dashboards

Establish better runbooks

Establish better alerting

Establish better escalation paths

Establish better operational readiness

Establish better deployment practices

Make reliability measurable

Make operational maturity measurable

Make infrastructure health measurable

Make developer productivity measurable

Improve deployment automation

Improve self-healing patterns

Improve disaster recovery readiness

Improve service reliability

Mature incident response

Mature on-call health

Build self-service infrastructure

Partner with Security

Partner with Compliance

Partner with Procurement

Partner with Corporate IT

Evaluate AI-assisted workflows

Evaluate agentic workflows

How You'll Work.

Team & Collaboration

Distributed SRE team; Engineering teams; Security; Compliance; Legal; Finance; Procurement; Corporate IT; CTO; Senior engineering leaders

Communication Scope

Executive communication

Process & Methodology

Roadmaps, Success measures

Full Job Description

Interested in working on cutting-edge blockchain technology and creating equitable access to the global financial system? Since 2014, the mission-driven team at the Stellar Development Foundation (SDF) has helped fuel the tremendous growth of the Stellar blockchain network, an open-source platform that operates at high scale today. Developers and companies around the world build on it, and the SDF team is expanding to support the rapidly growing and changing Stellar ecosystem.

SDF is looking for a Director of Site Reliability Engineering to lead a small, high-leverage SRE team and help shape how engineering teams own, operate, and improve production services. This is a senior engineering leadership role reporting to the CTO. You will set the vision, operating model, and culture for SRE while owning the core infrastructure services that help SDF engineering teams build, deploy, observe, and operate software with confidence. Engineering teams at SDF own the services they build. SRE provides the frameworks, standards, shared infrastructure, tooling, observability practices, and enablement model that make strong service ownership possible across engineering.

You will be successful here if you bring strong technical judgment, pragmatic leadership, and the ability to influence through trust, clarity, and execution. SDF is a small, mission-driven foundation with a broad technical surface area, so this role requires leverage, ownership, and a bias toward solving the right problems over creating processes for its own sake.

In this role, you will:

- Lead, coach, and develop a distributed SRE team, setting a clear vision, charter, operating model, priorities, and success measures.
- Define and roll out a Service Ownership & Maturity Framework across engineering, with expectations that vary appropriately by service criticality.
- Own and improve core engineering infrastructure services, including cloud foundations, Kubernetes and compute patterns, CI/CD, observability

SKILL SIGNAL 52 detected · ranked by frequency

Cloud Infrastructure ×6

Kubernetes ×5

Incident response ×5

Container orchestration ×4

Infrastructure-as-code ×4

Declarative systems ×4

CI/CD ×4

Observability ×4

Monitoring ×4

Alerting ×4

Logging ×4

SLOs ×4

SLIs ×4

Postmortems ×4

On-call practices ×4

Service ownership ×4

Resilience ×4

Self-healing patterns ×4

Disaster recovery ×4

On-call health ×4

Access management ×4

Cloud operations ×4

Vendor review ×4

Infrastructure controls ×4

AI-assisted workflows ×4

Agentic workflows ×4

Production operations ×3

Distributed systems ×3

Automation ×3

Operational readiness ×3

Production accountability ×3

Secrets management ×3

BEHAVIOURAL

Leadership, Judgment, Influence, Clarity, Execution, Ownership, Pragmatism

Role Details

Level Director

Work Mode Hybrid

Type FULL TIME

Category site-reliability

Salary Band 200k+

AI-Extracted Insights

Domain Areas

blockchain-technology, global-financial-system, stellar-blockchain-network, distributed-systems, cloud-infrastructure, production-operations, reliability-tradeoffs, operational-risk

How to Apply on Ashby

Ashby is a fast modern ATS — most applications take under 3 minutes.
The resume parser is strong; verify parsed experience dates and job titles.
Custom screening questions are often scored algorithmically — answer completely.
Location field affects geo-based screening; use your actual metro area.
