Mizuho

Financial Services

Site Reliability Engineer

New York, New York, United States · FULL TIME · Remote Friendly
Market Sentiment
HIGH DEMAND

Neural analysis suggests this role is optimal for Mid candidates.

The Brief

“Site Reliability Engineer at Mizuho. Skills: Site Reliability Engineering, Grafana, Automation, Infrastructure as Code, Cloud Providers (AWS, Azure, GCP), Containerization (Docker, Kubernetes), Monitoring, Alerting, CI/CD. Design, implement, and manage automated deployment, monitoring, and alerting solutions. Build and support scalable infrastructure through Infrastructure as Code (IaC) tools”

What You'll Achieve.

Maintain the reliability, scalability, and overall performance of our production systems, and minimize downtime.

Industry & Context.

Financial Services

Problems you'll solve

Exceptional problem-solving and troubleshooting capabilities

Eligibility Requirements

Participate in on-call rotations to ensure round-the-clock support for critical systems

What They're Looking For.

Must Have

  • Demonstrated experience as a Site Reliability Engineer (SRE) or in a similar capacity
  • Strong background in automation tools and methodologies such as Ansible, Terraform, or Jenkins
  • Advanced skills in monitoring and visualization with Grafana
  • Experience working with cloud providers like AWS, Azure, or Google Cloud
  • In-depth knowledge of containerization and orchestration tools (e.g., Docker, Kubernetes)
  • Familiarity with CI/CD pipelines and associated tools
  • Proficient scripting or programming abilities in languages like Python, Bash, or Go
  • Exceptional problem-solving and troubleshooting capabilities
  • Excellent communication and teamwork skills
  • Comfortable working in a fast-paced, ever-changing environment

Nice to Have

  • Hands-on experience with Prometheus or comparable time-series databases
  • Solid understanding of networking and security best practices
  • Knowledgeable in database administration and optimization strategies

What You'll Do.

Design, implement, and manage automated deployment, monitoring, and alerting solutions

Build and support scalable infrastructure through Infrastructure as Code (IaC) tools

Use Grafana and other monitoring platforms to track system reliability and performance

Diagnose and resolve production issues quickly to minimize downtime

Create and maintain best practices and guidelines for SRE processes

Enhance observability by improving logging, monitoring, and alert systems

Lead post-incident reviews and put preventative measures in place

How You'll Work.

Team & Collaboration

Collaborate closely with development, operations, and product teams to automate workflows, monitor system health, and maintain robust services. Partner with development and operations for ongoing improvements to system reliability and efficiency. Mentor and educate team members on SRE methodologies and technologies.

Communication Scope

Excellent communication and teamwork skills

Full Job Description

Join Mizuho as a Site Reliability Engineer! In this role, you will play a crucial part in maintaining the reliability, scalability, and overall performance of our production systems. This position collaborates closely with development, operations, and product teams to automate workflows, monitor system health, and maintain robust services. Expertise in Grafana is vital for creating insightful visualizations and analyzing performance metrics.

Key Responsibilities:

  • Design, implement, and manage automated deployment, monitoring, and alerting solutions.
  • Build and support scalable infrastructure through Infrastructure as Code (IaC) tools.
  • Use Grafana and other monitoring platforms to track system reliability and performance.
  • Partner with development and operations for ongoing improvements to system reliability and efficiency.
  • Diagnose and resolve production issues quickly to minimize downtime.
  • Create and maintain best practices and guidelines for SRE processes.
  • Enhance observability by improving logging, monitoring, and alert systems.
  • Participate in on-call rotations to ensure round-the-clock support for critical systems.
  • Lead post-incident reviews and put preventative measures in place.
  • Mentor and educate team members on SRE methodologies and technologies.

Qualifications:

  • Bachelor’s degree (or equivalent experience) in Computer Science, Engineering, or a related area.
  • Demonstrated experience as a Site Reliability Engineer (SRE) or in a similar capacity.
  • Strong background in automation tools and methodologies such as Ansible, Terraform, or Jenkins.
  • Advanced skills in monitoring and visualization with Grafana.
  • Experience working with cloud providers like AWS, Azure, or Google Cloud.
  • In-depth knowledge of containerization and orchestration tools (e.g., Docker, Kubernetes).
  • Familiarity with CI/CD pipelines and associated tools.
  • Proficient scripting or programming abilities in languages like Python, Bash, or Go.
  • Exceptional proble

How to Apply on Workday

  • Workday has a multi-step form — save your progress after every section.
  • "Apply With LinkedIn" can fail or lose data; manual entry is more reliable.
  • Watch for the "Submit for Review" final step — hitting "Save" alone does not submit.
  • Job requisition numbers are useful when following up with HR by email.
