Airbnb
Senior Software Engineer, Reliability Engineering
Neural analysis suggests this role is optimal for Senior candidates.
“Senior Software Engineer, Reliability Engineering at Airbnb. Skills: Site Reliability Engineering, developing and maintaining tools and systems, incident management, distributed systems, cloud computing. Responsibilities: developing and maintaining the tools and systems that enable our engineering teams to operate our services reliably and at scale, and ensuring our services are properly instrumented and able to scale with our growing business.”
What You'll Achieve.
Ensure our services are properly instrumented and able to scale with our growing business; bolster our services' reliability; improve how the company manages incidents broadly; ensure that our services remain resilient as our business continues to expand; ensure timely resolution of incidents, minimizing the impact on our customers and business
Industry & Context.
on-call
What They're Looking For.
Must Have
5+ years of experience in software engineering or SRE roles
Coding skills in at least one programming language, such as Java, Python, or Go
Experience with distributed systems and service-oriented architectures
Experience with cloud computing platforms such as AWS or Google Cloud Platform
Conviction in software development best practices, including version control, automated testing, and continuous integration and delivery
Experience with containerization technologies such as Docker and Kubernetes
Excellent problem-solving and analytical skills
Attention to detail
Ability to work effectively in a fast-paced and dynamic environment
Communication and interpersonal skills
Fluent in English (Professional Level)
Nice to Have
Kubernetes a plus
What You'll Do.
Develop and maintain the tools and systems that enable our engineering teams to operate our services reliably and at scale
Ensure our services are properly instrumented and able to scale with our growing business
Bolster our services' reliability
Improve how the company manages incidents broadly
Establish a culture of reliability throughout the organization by providing a comprehensive incident management platform that is used for instrumentation, operability, and incidents
Identify opportunities for improvement and drive their implementation
Ensure that our services remain resilient as our business continues to expand
Respond to and manage high severity incidents
Serve as an active member of the Production SRE team
Implement and maintain the tools and systems that support service reliability
Collaborate with other engineering teams to ensure services are designed with reliability in mind, and provide guidance on the appropriate use of tooling and automation
Identify opportunities to improve the reliability and efficiency of our services and drive their implementation
Work with infrastructure engineers to understand the challenges they face in operating our services and develop tools and systems to help them manage these challenges
Participate in incident response and post-mortems to identify and address systemic issues
Continuously evaluate new technologies and industry best practices to improve our SRE tooling and incident response procedures
Gain and maintain an intimate understanding of how the critical parts of the site work (services
Lead high-urgency incidents
Mentor less-experienced engineers in effectively handling incidents
How You'll Work.
Team & Collaboration
Work closely with our SREs and other engineering teams; guide cross-functional teams during crisis situations
Communication Scope
Excellent communication, coordination, and interpersonal skills
Full Job Description
Airbnb was born in 2007 when two hosts welcomed three guests to their San Francisco home, and has since grown to over 5 million hosts who have welcomed over 2 billion guest arrivals in almost every country across the globe. Every day, hosts offer unique stays and experiences that make it possible for guests to connect with communities in a more authentic way.
The Community You Will Join: We are looking for a Senior Software Engineer to join our Site Reliability Engineering team. As a Senior Software Engineer in Production SRE, you will be responsible for developing and maintaining the tools and systems that enable our engineering teams to operate our services reliably and at scale. You will work closely with our SREs and other engineering teams to ensure our services are properly instrumented and able to scale with our growing business.
The Difference You Will Make: In this role, your expertise in developing and maintaining tools and systems will be instrumental in bolstering our services' reliability and improving how the company manages incidents broadly. By collaborating closely with other engineering teams you will help establish a culture of reliability throughout the organization by providing a comprehensive incident management platform that is being used for instrumentation, operability, and around incidents. Your ability to identify opportunities for improvement and drive their implementation will contribute significantly to our overall operational efficiency and growth, ensuring that our services remain resilient as our business continues to expand.
Additionally, as an essential part of this role, you will serve as an active member of the Production SRE team, responding to and managing high severity incidents. Your vast technical experience and leadership skills will be invaluable as you step into the role of Incident Commander during these critical events. You will guide cross-functional teams during crisis situations and ensure timely resolution, minimizing the impact on our customers and business.