Perform Reliably Amid Shifting Priorities

MongoDB Site Reliability Engineer

Pune, India FULL TIME
Market Sentiment
HIGH DEMAND

Neural analysis suggests this role is optimal for Senior candidates.

The Brief

“MongoDB Site Reliability Engineer at Perform Reliably Amid Shifting Priorities. Skills: MongoDB administration, Site Reliability Engineering, automation, performance tuning, incident response, observability. Responsibilities: ensure the availability, performance, and scalability of systems and services through proactive monitoring, maintenance, and capacity planning; resolve, analyse, and respond to system outages and disruptions; and implement measures to prevent similar incidents from recurring.”

What You'll Achieve.

Ensure the reliability, availability, and scalability of the systems, platforms, and technology; reduce manual workload, increase efficiency, and improve system resilience; reduce downtime.

Industry & Context.

Problems you'll solve

Resolve, analyse, and respond to system outages and disruptions; identify and address bottlenecks; troubleshoot production incidents and performance degradation; solve problems proactively, creatively, and effectively, with an ownership mentality.

Eligibility Requirements

On-call rotation experience and production support

What They're Looking For.

Must Have

  • 5+ years of MongoDB administration in production environments
  • Deep expertise in MongoDB architecture: replica sets, sharding, backup/recovery strategies, and disaster recovery
  • Performance tuning and optimization: query analysis, indexing strategies, and capacity planning
  • Proficiency in MongoDB shell, JavaScript scripting, and aggregation pipelines
  • Troubleshooting skills for production incidents and performance degradation
  • Security best practices: authentication, encryption at rest/in transit, audit logging
  • Scripting expertise in Python and/or Bash for automation and operational tasks
  • CI/CD pipeline development and maintenance
  • Version control with Git and collaborative development practices
  • Database migration and upgrade strategies with zero/minimal downtime
  • Experience with observability platforms: Prometheus, Grafana, ELK/EFK stack, or similar
  • Incident management, root cause analysis, and post-mortem documentation
  • On-call rotation experience and production support
  • Communication skills for cross-functional collaboration
  • Proactive problem-solving and ownership mentality
  • Documentation and knowledge-sharing practices

Nice to Have

  • Percona Server for MongoDB or MongoDB Enterprise experience
  • API development with FastAPI, Flask, or similar frameworks
  • Infrastructure as Code (IaC) using Terraform, Ansible, or Chef
  • Container orchestration with Kubernetes and Docker

What You'll Do.

  • Ensure the availability, performance, and scalability of systems and services through proactive monitoring, maintenance, and capacity planning.
  • Resolve, analyse, and respond to system outages and disruptions, and implement measures to prevent similar incidents from recurring.
  • Develop tools and scripts to automate operational processes, reducing manual workload, increasing efficiency, and improving system resilience.
  • Monitor and optimise system performance and resource usage, identify and address bottlenecks, and implement best practices for performance tuning.
  • Collaborate with development teams to integrate best practices for reliability, scalability, and performance into the software development lifecycle, and work closely with other teams to ensure smooth and efficient operations.
  • Stay informed of industry technology trends and innovations, and actively contribute to the organization's technology communities to foster a culture of technical excellence and growth.
  • Automate operations, enhance system observability, and drive continuous improvements that reduce downtime and improve efficiency.

How You'll Work.

Team & Collaboration

Collaborate with development teams to integrate best practices for reliability, scalability, and performance into the software development lifecycle; work closely with other teams to ensure smooth and efficient operations; collaborate closely with other functions and business divisions; guide team members through structured assignments; identify the need to include other areas of specialisation to complete assignments; collaborate with other areas of work, for business-aligned support areas, to keep up to speed with business activity and the business strategy.

Communication Scope

communication skills for cross-functional collaboration; Communicate complex information

Process & Methodology

Lead collaborative assignments; identify new directions for assignments and/or projects; identify a combination of cross-functional methodologies or practices to meet required outcomes.

Full Job Description

# **Job Description**

**Purpose of the role**

To apply software engineering techniques, automation, and best practices in incident response, to ensure the reliability, availability, and scalability of the systems, platforms, and technology through them.

**Accountabilities**

* Availability, performance, and scalability of systems and services through proactive monitoring, maintenance, and capacity planning.
* Resolution, analysis and response to system outages and disruptions, and implement measures to prevent similar incidents from recurring.
* Development of tools and scripts to automate operational processes, reducing manual workload, increasing efficiency, and improving system resilience.
* Monitoring and optimisation of system performance and resource usage, identify and address bottlenecks, and implement best practices for performance tuning.
* Collaboration with development teams to integrate best practices for reliability, scalability, and performance into the software development lifecycle, and work closely with other teams to ensure smooth and efficient operations.
* Stay informed of industry technology trends and innovations, and actively contribute to the organization's technology communities to foster a culture of technical excellence and growth.

**Assistant Vice President Expectations**

* To advise and influence decision making, contribute to policy development and take responsibility for operational effectiveness. Collaborate closely with other functions/ business divisions.
* Lead a team performing complex tasks, using well developed professional knowledge and skills to deliver on work that impacts the whole business function. Set objectives and coach employees in pursuit of those objectives, appraisal of performance relative to objectives and determination of reward outcomes
* If the position has leadership responsibilities, People Leaders are expected to demonstrate a clear set of leadership behaviours to create an environment for colleagues to thri

How to Apply on Workday

  • Workday has a multi-step form — save your progress after every section.
  • "Apply With LinkedIn" can fail or lose data; manual entry is more reliable.
  • Watch for the "Submit for Review" final step — hitting "Save" alone does not submit.
  • Job requisition numbers are useful when following up with HR by email.
