Emburse

Financial Services

Site Reliability Engineer III

CA$135–195k (AI estimate) · Toronto, Ontario, Canada · Full Time · Remote Friendly
Market Sentiment
HIGH DEMAND

Neural analysis suggests this role is optimal for Senior candidates.

The Brief

“Site Reliability Engineer III at Emburse. Skills: Site Reliability Engineering, Cloud Infrastructure, Kubernetes, Automation. Develop and maintain infrastructure as code.”

Industry & Context.

Financial Services
Problems you'll solve

Analytical; Reasoning; Troubleshooting; Problem-solving

What They're Looking For.

Must Have

  • Minimum of 3 years of direct experience in a similar role with a Bachelor’s degree in Computer Science or a related STEM field, or a minimum of 7 years of direct experience in a similar role without a Bachelor’s degree
  • Experience with infrastructure as code
  • Experience with the full lifecycle of SaaS implementations
  • Kubernetes administration experience; experience with containers, EKS, and Kubernetes
  • Experience with GitHub Actions, Jenkins, ArgoCD, Kustomize, and cloud-native deployment practices
  • AWS proficiency: basic IAM management, autoscaling, AMIs, and cloud infrastructure operations
  • Intermediate to advanced Linux and Unix skills
  • Understanding of TCP/IP, the OSI model, stateless architecture, infrastructure, and system architecture
  • Ability to write SQL and ELK queries
  • Experience with monitoring applications, APM tools, logs, alerts, and incident diagnostics
  • Experience with secure delivery practices, vulnerability scanning, static analysis, and least-privilege infrastructure patterns
  • Ability to effectively use AI-assisted engineering tools
  • Ability to merge and apply pull requests for Ansible, Terraform, and OpenTofu
  • Deep understanding of release cycles, the SDLC, infrastructure, and architecture
  • Analytical, reasoning, troubleshooting, and problem-solving skills
  • Excellent written and verbal communication skills in English

Nice to Have

  • Kubernetes administration experience, including networking, security, troubleshooting, monitoring, and day-2 operations
  • AI-assisted engineering tools, such as Cursor, GitHub Copilot, Claude Code, or similar technologies, while applying sound engineering judgment, code review practices, and security awareness

What You'll Do.

Develop infrastructure as code

Maintain infrastructure as code

Administer Kubernetes environments

Administer EKS environments

Manage Kubernetes networking

Manage Kubernetes security

Troubleshoot Kubernetes

Manage Kubernetes clusters

Build containerized workloads

Manage containerized workloads

Support containerized workloads

Support GitOps workflows

Build self-service platform capabilities

Help engineering teams provision infrastructure

Help engineering teams onboard services

Help engineering teams deploy applications

Monitor site availability

Investigate production issues

Provide remediation for incidents

Create monitoring systems

Maintain monitoring systems

Create logging systems

Maintain logging systems

Create alerting systems

Maintain alerting systems

Create incident reporting

Maintain incident reporting

Support secure-by-default platform practices

Implement least-privilege Kubernetes configurations

Perform container vulnerability scanning

Perform static analysis

Perform infrastructure-as-code security checks

Troubleshoot infrastructure issues

Troubleshoot application platform issues

Troubleshoot Linux issues

Troubleshoot networking issues

Troubleshoot IAM issues

Troubleshoot cloud-related issues

Support production investigations

Create infrastructure pull requests

Review infrastructure pull requests

Merge infrastructure pull requests

Apply infrastructure pull requests

Support infrastructure optimization

Serve as technical lead

Drive platform projects

Leverage AI-assisted engineering tools

Improve development speed

Improve documentation

Improve troubleshooting workflows

How You'll Work.

Team & Collaboration

Engineering teams; Cross-functional teams

Communication Scope

Written communication; Verbal communication

Process & Methodology

SDLC, Release cycles

Full Job Description

## Description

Who We Are: At Emburse, you’ll not just imagine the future – you’ll build it. As a leader in travel and expense solutions, we are creating a future where technology drives business value and inspires extraordinary results. Our AI-powered platform helps organizations modernize financial operations, increase visibility, and optimize spend across the enterprise.

The Site Reliability Engineer III (SRE III) plays a critical role in ensuring Emburse’s systems are highly available, scalable, and performant. This role blends deep technical expertise with strong collaboration and leadership skills to drive operational excellence across distributed systems. The ideal candidate is passionate about automation, cloud infrastructure, observability, and continuous improvement, while mentoring junior engineers and driving reliability culture across the organization.

## What you will do

Responsibilities

  • Develop and maintain infrastructure as code using Terraform, OpenTofu, Ansible, and related automation tooling.
  • Administer Kubernetes and EKS environments, including installation, networking, security, troubleshooting, monitoring, autoscaling, upgrades, and cluster management.
  • Build, manage, and support containerized workloads using Kubernetes, EKS, and related cloud-native technologies.
  • Support GitOps-based deployment workflows using ArgoCD, Kustomize, GitHub Actions, Jenkins, and Kubernetes manifests.
  • Build and improve self-service platform capabilities that help engineering teams provision infrastructure, onboard services, and deploy applications efficiently.
  • Monitor site availability, investigate production issues, and provide remediation for incidents.
  • Create and maintain monitoring, logging, alerting, dashboards, APM configuration, and incident reporting.
  • Support secure-by-default platform practices, including least-privilege Kubernetes configurations, container vulnerability scanning, static analysis, and infrastructure-as-code security checks.
  • Troubleshoot infra

How to Apply on Lever

  • Lever uses a streamlined one-page form — apply in under 5 minutes.
  • LinkedIn import works well; review parsed data before submitting.
  • The cover letter field is optional but visible to reviewers — use it to differentiate.
  • Referral codes from employees can significantly boost visibility of your application.
