Hitachi Digital Services

digital solutions and transformation

SRE/DevOps Engineer

Toronto, Ontario, Canada · FULL TIME · Remote Friendly

Market Sentiment

HIGH DEMAND

Neural analysis suggests this role is optimal for mid-level candidates.

The Brief

“SRE/DevOps Engineer at Hitachi Digital Services. Skills: System & Infrastructure Monitoring, Runbook Execution, Incident Triage & Communication, Kubernetes operations knowledge, Scripting (Python, Bash, PowerShell), Networking & Security Awareness, Documentation & Knowledge Capture. Monitoring system health, alerts, dashboards, and logs across cloud and on-prem infrastructure. Isolating functional issues with application versus platform”

What You'll Achieve.

minimize downtime

Industry & Context.

digital solutions and transformation

Problems you'll solve

troubleshooting mindset; ability to follow structured workflows; 5 Whys; Fishbone

What They're Looking For.

Must Have

2–5 years in IT operations, NOC, or SRE/DevOps engineer role, Kubernetes 101, Linux 101, Networking 101, Understanding of cloud-ready applications, Understanding of observability tools (Prometheus, Grafana, ELK, Splunk, etc.), troubleshooting mindset, ability to follow structured workflows

Nice to Have

Cloud Platform Familiarity (AWS, Azure, GCP), Database Basics (SQL/NoSQL), Automation & Self-Service Mindset, Exposure to Incident Management Tools (xMatters, ServiceNow, Jira, etc.), AI/Chatbot-Assisted Ops (emerging skill)

What You'll Do.

Monitoring system health, alerts, dashboards, and logs across cloud and on-prem infrastructure

Isolating functional issues with application versus platform

Executing standardized runbooks for incident resolution

Performing initial triage of incidents and escalating to L2/L2+ as needed

Documenting new issues, gaps in runbooks, and automation opportunities

Supporting onboarding of new applications into the operations framework

How You'll Work.

Team & Collaboration

Provide excellent communication to stakeholders during incidents; escalate appropriately to minimize downtime; escalate to L2/L2+ as needed to mitigate the issue; notify stakeholders in clear, concise language; provide a detailed incident note to L2 before escalation; working alongside talented people you enjoy sharing knowledge with

Communication Scope

Provide excellent communication to stakeholders during incidents; notify stakeholders in clear, concise language; provide a detailed incident note to L2 before escalation

Full Job Description

**Function** Cloud & Data Engineering

# **Our Company**

We’re Hitachi Digital Services, a global digital solutions and transformation business with a bold vision of our world’s potential. We’re people-centric and here to power good. Every day, we future-proof urban spaces, conserve natural resources, protect rainforests, and save lives. This is a world where innovation, technology, and deep expertise come together to take our company and customers from what’s now to what’s next. We make it happen through the power of acceleration.

Imagine the sheer breadth of talent it takes to bring a better tomorrow closer to today. We don’t expect you to ‘fit’ every requirement – your life experience, character, perspective, and passion for achieving great things in the world are equally as important to us.

# **Job description**

L1 SRE Operations Engineer

The L1 SRE is the first line of defense in monitoring, triaging, and executing standardized operational tasks for all enterprise applications running on standard patterns and platforms like Kubernetes, APIs, WAF, databases, API Proxy (Gloo, APIGEE), Kafka, and Cloud (AWS/Azure/GCP). They will follow runbooks, leverage automation, and escalate appropriately to minimize downtime.

Responsibilities

- Monitor system health, alerts, dashboards, and logs across cloud and on-prem infrastructure.
- Isolate functional issues with the application versus the platform.
- Execute standardized runbooks for incident resolution, deployments, and routine tasks.
- Perform initial triage of incidents and escalate to L2/L2+ as needed to mitigate the issue to get to bypass.
- Document new issues, gaps in runbooks, and automation opportunities.
- Provide excellent communication to stakeholders during incidents.
- Support onboarding of new applications into the operations framework.

Skills

Mandatory Skills (Must-Have)

1. System & Infrastructure Monitoring

Expectation: Ability to use monitoring dashboards (e.g., Grafana, Datadog, Splunk, Argos, AIOps) to identify anom

SKILL SIGNAL 116 detected · ranked by frequency

monitoring ×3

triaging ×3

executing standardized operational tasks ×3

runbooks ×3

automation ×3

escalate appropriately ×3

system health monitoring ×3

alert monitoring ×3

dashboard monitoring ×3

log monitoring ×3

cloud infrastructure monitoring ×3

on-prem infrastructure monitoring ×3

functional issue isolation ×3

incident resolution ×3

deployments ×3

routine tasks ×3

initial triage of incidents ×3

mitigate the issue ×3

documenting new issues ×3

identifying gaps in runbooks ×3

identifying automation opportunities ×3

onboarding of new applications ×3

using monitoring dashboards ×3

identifying anomalies ×3

following alert workflows ×3

escalating when thresholds are breached ×3

Kubernetes pod crash-loop validation ×3

checking pod logs ×3

performing restart attempts ×3

strictly following documented steps ×3

resolving standard incidents ×3

escalating when steps do not apply or fail ×3

BEHAVIOURAL

people-centric, passion for achieving great things, life experience, character, perspective, holistic health and wellbeing, sense of belonging, autonomy, freedom, ownership, enjoy sharing knowledge

Role Details

Seniority mid

Experience 2–5 yrs

Level Mid

Type FULL TIME

AI-Extracted Insights

Domain Areas

cloud-ready-applications, observability-tools

How to Apply on Workday

Workday has a multi-step form — save your progress after every section.
"Apply With LinkedIn" can fail or lose data; manual entry is more reliable.
Watch for the "Submit for Review" final step — hitting "Save" alone does not submit.
Job requisition numbers are useful when following up with HR by email.
