RBC Borealis

Financial Services

Staff AI/ML Engineer

Calgary, Alberta, Canada FULL TIME Remote Friendly

The Brief

“Staff AI/ML Engineer at RBC Borealis. Skills: ML model serving, ML lifecycle management, Python backend development, ML pipeline engineering, Streaming/event-driven architectures, Containerized ML workloads, CI/CD for ML, Observability for ML systems, Scalable distributed backend systems, Site reliability practices. Own the end-to-end lifecycle of machine learning systems—from experimentation and validation all the way to high-throughput production serving. Technical anchor for model operational”

What You'll Achieve.

  • Shape the foundation on which Canada's largest financial institution runs its most critical AI workloads
  • Ensure low-latency, high-throughput inference in production
  • Support online model serving and event-driven ML workflows
  • Expose ML capabilities to downstream consumers
  • Embed quality gates and automated testing throughout CI/CD pipelines
  • Drive incident response and blameless post-mortems
  • Operate reliably under high load in hybrid cloud environments
  • Revolutionize finance through world-class research, solutions, and a resilient data platform
  • Solve critical challenges in the financial industry
  • Build intelligent, scalable, data-driven solutions that help communities thrive and drive innovation for our customers across the bank

Industry & Context.

Financial Services

What They're Looking For.

Must Have

  • Production-proven experience with ML model serving and lifecycle management using SageMaker, MLflow, or comparable platforms
  • Expert-level Python skills for backend service development, ML pipeline engineering, and automation scripting
  • Deep hands-on experience with Apache Kafka and streaming/event-driven architectures for real-time feature pipelines and model inference
  • In-depth knowledge of OpenShift Container Platform (OCP4) / Kubernetes for deploying and operating containerized ML workloads
  • Proven experience building and maintaining CI/CD pipelines with GitHub Actions or equivalent tools for ML model delivery
  • Hands-on expertise with observability platforms such as Datadog, Dynatrace, or Prometheus applied to distributed ML systems
  • Demonstrated ability to design scalable distributed backend systems that operate reliably under high load in hybrid cloud environments (AWS / Azure / on-prem)
  • Experience with site reliability practices: SLOs/SLIs, alerting, incident management, and capacity planning for ML services

Nice to Have

  • Proficiency with MongoDB in production environments for storing model metadata, feature stores, or application state
  • Experience with Elasticsearch for log aggregation, search, and ML-adjacent analytics use cases
  • Familiarity with JavaScript or Go for building lightweight platform tooling or internal developer portals
  • Background in audio processing pipelines (speech recognition, audio feature extraction, or real-time audio streaming) for multimodal AI applications
  • Exposure to agentic AI systems, LLM orchestration frameworks, or self-hosted large language model infrastructure

What You'll Do.

Own the end-to-end lifecycle of machine learning systems, from experimentation and validation all the way to high-throughput production serving

Technical anchor for model operationalization at scale, setting the bar for reliability, observability, and engineering excellence across our AI platform

Designing, building, and operating scalable ML model-serving infrastructure using SageMaker, MLflow, or equivalent platforms, ensuring low-latency, high-throughput inference in production, without involvement in upstream model training

Architecting and maintaining real-time data and feature pipelines using Kafka and streaming frameworks to support online model serving and event-driven ML workflows

Developing and maintaining robust backend services in Python that expose ML capabilities to downstream consumers via reliable, well-documented APIs

Owning containerized deployment of ML workloads on OpenShift Container Platform (OCP4) / Kubernetes, including resource optimization, autoscaling, and rollout strategies

Building and maintaining CI/CD pipelines (GitHub Actions) for model validation, packaging, and deployment, embedding quality gates and automated testing throughout

Instrumenting ML services with comprehensive observability (metrics and traces) using Datadog or equivalent, and driving incident response and blameless post-mortems

How You'll Work.

Team & Collaboration

Works directly with leading researchers in machine learning; Works collaboratively

Communication Scope

Well-documented APIs

Full Job Description

**_Job Description_**

**Staff AI/ML Engineer**

**What's the opportunity?**

We're looking for a seasoned Staff AI/ML Engineer to join the RBC Borealis AI Platform team. In this role you will own the end-to-end lifecycle of machine learning systems, from experimentation and validation all the way to high-throughput production serving. You will be the technical anchor for model operationalization at scale, setting the bar for reliability, observability, and engineering excellence across our AI platform. This is a rare opportunity to shape the foundation on which Canada's largest financial institution runs its most critical AI workloads.

At RBC Borealis, you’ll be joining a team that works directly with leading researchers in machine learning, has access to rich and massive datasets, and offers the computational resources to support ongoing development in areas such as reinforcement learning, unsupervised learning and computer vision. You can find out more about our research areas at rbcborealis.com.

**Your responsibilities include:**

* Designing, building, and operating scalable ML model-serving infrastructure using SageMaker, MLflow, or equivalent platforms, ensuring low-latency, high-throughput inference in production, without involvement in upstream model training.
* Architecting and maintaining real-time data and feature pipelines using Kafka and streaming frameworks to support online model serving and event-driven ML workflows.
* Developing and maintaining robust backend services in Python that expose ML capabilities to downstream consumers via reliable, well-documented APIs.
* Owning containerized deployment of ML workloads on OpenShift Container Platform (OCP4) / Kubernetes, including resource optimization, autoscaling, and rollout strategies.
* Building and maintaining CI/CD pipelines (GitHub Actions) for model validation, packaging, and deployment, embedding quality gates and automated testing throughout.
* Instrumenting ML services with comprehensive observa


How to Apply on Workday

  • Workday has a multi-step form — save your progress after every section.
  • "Apply With LinkedIn" can fail or lose data; manual entry is more reliable.
  • Watch for the "Submit for Review" final step — hitting "Save" alone does not submit.
  • Job requisition numbers are useful when following up with HR by email.
