Monotype

Fonts

Manager, Site Reliability Engineering

Noida, India FULL TIME Remote Friendly

Market Sentiment

HIGH DEMAND

Neural analysis suggests this role is optimal for Senior candidates.

The Brief

“Manager, Site Reliability Engineering at Monotype. Skills: Site Reliability Engineering (SRE), Incident management, Automation, Observability, Team leadership. Own end-to-end reliability of production systems, ensuring uptime within defined SLAs. Lead and govern a 24x7x365 incident management team, ensuring quick response and resolution”

What You'll Achieve.

Ensuring high system availability; Fast incident response; Continuous improvement of platform reliability; Maintaining uptime; Reducing incidents; Improving response times; Building a more proactive and self-sufficient SRE function; Ensuring uptime within defined SLAs; Quick response and resolution; Prevent repeat issues; Reduce alert noise; Improve signal-to-noise ratio; Reduce production issues caused by releases; Ensure visibility, stability, and cost awareness for AI-driven systems; Build team maturity; Reduce dependency on senior members; Develop ownership and accountability within the team; Optimize cloud usage and reduce unnecessary spend; Balance reliability improvements with cost efficiency

Industry & Context.

Fonts

Problems you'll solve

Structured problem-solving; Analytical and problem-solving skills for handling complex production issues

What They're Looking For.

Must Have

10+ years of experience in SRE, with proven experience managing production systems and 24x7 operations teams; Hands-on experience with AWS and Kubernetes (EKS preferred); Understanding of incident management, RCA, and production support models; Experience with monitoring/observability tools (Datadog, CloudWatch, ELK, Prometheus, Grafana); Experience driving automation and reducing operational toil; Understanding of microservices-based architectures; Knowledge of release processes and production readiness practices; Understanding of SLAs, SLIs, SLOs, and reliability metrics; Good understanding of cloud cost optimization (FinOps basics); Exposure to or experience supporting AI/ML workloads; Leadership skills with experience managing and mentoring teams; Ability to stay calm and lead during high-severity incidents; Communication and stakeholder management skills; Structured problem-solving and decision-making ability; Analytical and problem-solving skills for handling complex production issues; Understanding of security best practices across infrastructure and applications; Ability to standardize processes and improve operational consistency

Nice to Have

Certification in relevant technologies (e.g., AWS, Kubernetes) is a plus; Strategic mindset with ability to align reliability initiatives with business goals

What You'll Do.

Own end-to-end reliability of production systems, ensuring uptime within defined SLAs

Lead and govern a 24x7x365 incident management team, ensuring quick response and resolution

Act as escalation point during critical incidents and drive coordination across teams

Ensure proper incident tracking and status page updates

Drive a blameless RCA culture across the team

Ensure all customer-impacting incidents are analysed with clear root causes

Track and drive closure of RCA action items to prevent repeat issues

Identify recurring patterns and push for permanent fixes

Own and improve observability using tools like Datadog

Guide teams on effective logging and monitoring practices

Reduce alert noise and improve signal-to-noise ratio

Drive proactive monitoring and early detection of issues

Drive automation to reduce manual effort and operational toil

Identify repetitive issues and build solutions to eliminate them

Ensure runbooks and playbooks are created and followed for recurring incidents

Work closely with Engineering & Platform teams to improve release quality and stability

Ensure proper readiness checks (including monitoring) before production deployments

Reduce production issues caused by releases

Support reliability and monitoring of AI/ML workloads in production and experimentation environments

Ensure visibility, stability, and cost awareness for AI-driven systems

Bring structure and best practices as AI adoption grows

Lead and mentor a team of ~14 engineers across operations and SRE excellence

Build team maturity and reduce dependency on senior members

Develop ownership and accountability within the team

Partner with teams to optimize cloud usage and reduce unnecessary spend

Balance reliability improvements with cost efficiency

Ensure security best practices are followed across infrastructure and applications in collaboration with security teams

How You'll Work.

Team & Collaboration

Lead and mentor a team of ~14 engineers across operations and SRE excellence; Work closely with Engineering, Product and Platform teams; Ensure smooth coordination during incidents and releases; Communicate effectively with stakeholders during high-severity situations; Collaborate with stakeholders to align reliability and platform strategies with business goals; Ensure security best practices are followed across infrastructure and applications in collaboration with security teams

Communication Scope

Communicate effectively with stakeholders during high-severity situations

Process & Methodology

Manage production systems, Manage 24x7 operations teams, Drive automation, Reduce operational toil, Improve release quality and stability, Ensure production readiness, Build team maturity, Develop ownership and accountability, Optimize cloud usage, Balance reliability improvements with cost efficiency

Full Job Description

Are you our “**_TYPE_**”?

**Monotype Global**

Named "One of the Most Innovative Companies in Design" by Fast Company, Monotype brings brands to life through type and technology that consumers engage with every day. The company's rich legacy includes a library that can be traced back hundreds of years, featuring famed typefaces like Helvetica, Futura, Times New Roman and more. Monotype also provides a first-of-its-kind service that makes fonts more accessible for creative professionals to discover, license, and use in our increasingly digital world. We work with the biggest global brands, and with individual creatives, offering a wide set of solutions that make it easier for them to do what they do best: **design beautiful brand experiences.**

**Monotype Solutions India**

Monotype Solutions India is a strategic center of excellence for Monotype and is a certified Great Place to Work® three years in a row. The focus of this fast-growing center spans Product Development, Product Management, Experience Design, User Research, Market Intelligence, Research in areas of Artificial Intelligence and Machine Learning, Innovation, Customer Success, Enterprise Business Solutions, and Sales.

Headquartered in the Boston area of the United States and with offices across 4 continents, Monotype is the world’s leading company in fonts. It’s a trusted partner to the world’s top brands and was named “One of the Most Innovative Companies in Design” by Fast Company.

We are looking for an experienced and hands-on Site Reliability Engineering (SRE) Manage

SKILL SIGNAL 34 detected · ranked by frequency

Site Reliability Engineering (SRE) ×5

Automation ×5

Observability ×5

Incident management ×3

Managing production systems ×3

24x7 operations teams ×3

Reducing operational toil ×3

Microservices-based architectures ×3

Monitoring ×3

Team leadership ×2

AWS ×2

Kubernetes ×2

Datadog ×2

CloudWatch ×2

ELK ×2

Prometheus ×2

Grafana ×2

EKS

AI/ML workloads

RCA

Production support models

Release processes

Production readiness practices

SLAs

SLIs

SLOs

Reliability metrics

Cloud cost optimization

FinOps basics

Security best practices

Process standardization

Operational consistency

BEHAVIOURAL

Leadership; Mentoring; Ability to stay calm and lead during high-severity incidents; Structured problem-solving; Decision-making ability; Analytical skills

Role Details

Seniority manager

Experience 10+ yrs

Level Senior

Work Mode Remote Friendly

Type FULL TIME

Education Bachelor’s degree in computer science, Engineering, or relat

AI-Extracted Insights

Domain Areas

fonts; type-and-technology; ai-driven-workloads; ai-ml-workloads

Certifications

AWS; Kubernetes

How to Apply on Workday

Workday has a multi-step form — save your progress after every section.
"Apply With LinkedIn" can fail or lose data; manual entry is more reliable.
Watch for the "Submit for Review" final step — hitting "Save" alone does not submit.
Job requisition numbers are useful when following up with HR by email.
