Wikimedia Foundation
Senior Site Reliability Engineer
“Senior Site Reliability Engineer at Wikimedia Foundation. Skills: Site Reliability Engineering, DevOps, Infrastructure automation, Incident response. Perform day-to-day operational/DevOps tasks. Deploy infrastructure”
What You'll Achieve.
Improve reliability of website; Improve delivery of website
Industry & Context.
Troubleshooting; Root cause analysis
Travel 1-2 times a year, 24/7 on-call rotation
What They're Looking For.
Must Have
6+ years experience SRE/Operations/DevOps, Shell scripting experience, Python scripting experience, Bash scripting experience, Puppet configuration management, Distributed caching systems experience, Linux system-level troubleshooting, Automating tasks and processes, English language skills
Nice to Have
Linux kernel tuning experience, Monitoring infrastructure experience, Metrics infrastructure experience, Logging infrastructure experience, Prometheus experience, Grafana experience, Free and Open Source software development, Open-source community participation, LAMP stack technologies experience, MediaWiki experience, Cross-team SLOs definition, On-premise filesystem operation, Object store operation at scale, OpenStack Swift experience, Ceph experience, Advanced distributed storage systems, Advanced database systems experience, Cassandra experience, MariaDB experience
What You'll Do.
Perform day-to-day operational/DevOps tasks
Deploy infrastructure
Maintain infrastructure
Configure infrastructure
Troubleshoot infrastructure
Implement configuration management tools
Utilize deployment tools
Lead continuous improvement
Automate installation of services
Automate configuration of services
Automate maintenance of services
Assist in architectural design
Make services operate at scale
Participate in on-call rotation
Diagnose system outages
Collaborate with global team
Work in asynchronous environment
How You'll Work.
Team & Collaboration
Cross-functional team; Asynchronous communication environment
Communication Scope
Verbal communication; Written communication
Full Job Description
Summary
The Wikimedia Foundation is looking for a Senior Site Reliability Engineer to support and develop the platform serving the world’s favorite encyclopedia, Wikipedia, to millions of people around the globe. Wikimedia’s Site Reliability Engineering (SRE) team is principally responsible for ensuring our global top-10 website and its underlying infrastructure is healthy and developing further in support of Wikimedia’s mission: to help everyone share in the sum of all knowledge.
The SRE team at Wikimedia is a globally distributed and diverse team of engineers with a drive to explore, experiment, and embrace new technologies. We work in the open by publishing all of our documentation, code, and configuration as open source, and all our production systems are powered by open source software. We invite you to go through our documentation and code -- no login required.
If you find what we do interesting, if you are up to the challenge of improving the reliability and delivery of one of the Internet’s top websites, and you enjoy the idea of working in a remote-first role, we may just be the right place for you. If you are interested in this role we’d expect you to be able to travel 1-2 times a year for in-person events and team meetings. Most importantly, share our values and work in accordance with them!
You are responsible for:
- Performing day-to-day operational/DevOps tasks on Wikimedia’s public facing infrastructure (deployment, maintenance, configuration, troubleshooting)
- Implementing and utilizing configuration management and deployment tools (Puppet, Kubernetes)
- Leading continuous improvement, by automating the installation, configuration and maintenance of services on our platform
- Working closely with product teams helping them bring scalable functionality to our users by assisting in the architectural design of new services and making them operate at scale
- Participating in a 24/7 on-call rotation shared across the broader SRE team. This includes taking part in