SimSpace
AI
Staff Site Reliability Engineer
“Staff Site Reliability Engineer at SimSpace. Skills: Site Reliability Engineering, Kubernetes, Infrastructure Architecture, Observability. Define technical vision. Lead architecture”
What You'll Achieve.
Deliver software seamlessly; Improve developer velocity; Ensure deployments are robust, secure, repeatable; Achieve unparalleled visibility into system health; Reduce operational toil
Industry & Context.
Problem-solvers; Solve complex infrastructure challenges
Remote - U.S.
What They're Looking For.
Must Have
8+ years of experience in Site Reliability, Platform, or DevOps engineering; Deep software engineering skills; Architect complex, production-quality systems; Design clean interfaces; Build maintainable tooling; Dictate technical direction of internal toolchain; Highly proficient in at least one modern language (e.g., Go, Python); Deep, architectural understanding of Kubernetes in multi-tenant and multi-cluster production environments; Expert-level knowledge of Jsonnet and Grafana Tanka; Extensive experience architecting sophisticated CI/CD pipelines; Extensive experience architecting GitOps workflows using GitHub Actions, ArgoCD; Infrastructure-as-code principles at an enterprise scale; Systems-level thinking; Design architectures that support self-hosted, on-premises, VMware-based, and air-gapped deployment models; Deep expertise with observability platforms; Proven ability to design alerting and monitoring strategies for complex distributed systems; Background in infrastructure security architecture; Container hardening; Network security; Vulnerability management; Delivering software to heavily regulated or customer-managed environments; Exceptional communication and stakeholder management skills; Service-oriented mindset; Ability to influence cross-functional leadership; Negotiate reliability tradeoffs; Align engineering teams behind a unified technical vision
Nice to Have
Kubernetes a plus
What You'll Do.
Define technical vision
Secure infrastructure
Architect resilient systems
Drive engineering standards
Solve complex infrastructure challenges
Provide technical leadership
Bridge site reliability
Architect systems and strategies
Design long-term automation frameworks
Lead CI/CD and Kubernetes platform evolution
Drive application packaging strategies
Architect multi-cluster deployment frameworks
Balance feature delivery with stability
Architect enterprise observability strategy
Design proactive monitoring frameworks
Design complex anomaly detection
Design distributed tracing frameworks
Drive infrastructure security posture
Embed container security
Embed zero-trust network segmentation
Embed automated compliance policies
Serve as strategic partner to development teams
Advocate for SRE culture
Design self-service tooling
Establish paved roads for developers
Reduce operational toil
Act as Incident Commander
Drive blameless post-mortems
Engineer systemic fixes
Act as technical mentor
Raise engineering excellence baseline
How You'll Work.
Team & Collaboration
Collaborate with product and engineering leadership; Partner with development teams; Align engineering teams
Communication Scope
Exceptional communication; Stakeholder management; Influence cross-functional leadership; Negotiate reliability tradeoffs; Align engineering teams
Full Job Description
SimSpace serves as an AI Proving Ground where organizations can confidently train, test, and outmaneuver adversaries in any environment. Trusted by allied governments, militaries, enterprises, and research institutions worldwide, SimSpace enables adaptive, AI-ready defenses that stay ahead of evolving threats. Founded in 2015 by experts from U.S. Cyber Command and MIT Lincoln Laboratory, the platform unifies training, testing, and validation in a realistic, live-fire simulation, helping teams evaluate security investments, optimize performance, and compress cyber readiness cycles from months to days.

Why join SimSpace?
We are an organization that is focused on building our culture and mindfully enhancing our atmosphere every day which is why we have collaborated on an integral value system. Our governing philosophy of being Human Centered is deeply embedded within our value system. We apply this philosophy to every one of our internal team members, external clients, and their customers.

How Do We Work?
We believe that people are at the center of everything we do. SimSpace fosters a culture of continuous learning, curiosity, and professional growth. That belief shows up in action: in-house training, internal and external learning platforms, cyber conferences, industry events, and dedicated time for skill development. Our people are empowered to shape their careers - and it shows. Year over year, SimSpace consistently outperforms industry benchmarks in internal mobility, promotions, and total rewards growth.

Who Thrives Here?
We are a team of innovators, protectors, and problem-solvers. We believe diversity of thought and experience fuels better solutions, and we’re committed to building teams that reflect the communities we serve. Whether you’re remote or office-based, you’ll collaborate with talented colleagues across departments and time zones, united by the mission to create a safer digital world. We invite you to apply today!

About the Role
We are looking for a St
How to Apply on Ashby
- Ashby is a fast modern ATS — most applications take under 3 minutes.
- The resume parser is strong; verify parsed experience dates and job titles.
- Custom screening questions are often scored algorithmically — answer completely.
- Location field affects geo-based screening; use your actual metro area.