Huawei Canada

Technology

Intern Engineer – RL Post-Training for LLMs

CA$58–104k · Vancouver, British Columbia, Canada · Internship

The Brief

“Intern Engineer – RL Post-Training for LLMs at Huawei Canada. Skills: reinforcement learning, LLMs, post-training. Develop and optimize RL post-training pipelines.”

What You'll Achieve.

Enhance algorithm performance; Enhance training efficiency

Industry & Context.

Technology
Problems you'll solve

Problem-solving skills

What They're Looking For.

Must Have

Master's or Ph.D. student, Machine learning background, Reinforcement learning background, Deep learning background, Familiarity with LLMs, Familiarity with transformer architectures, Familiarity with post-training methods, Proficiency in Python, Proficiency in PyTorch, Proficiency in LLM frameworks

Nice to Have

Hands-on experience with LLMs, Hands-on experience with RL training algorithms, Familiarity with RL frameworks, Experience with Hugging Face, Experience with DeepSpeed, Experience with vLLM, Experience with SGLang, Experience with distributed training frameworks, Experience with large-scale experimentation, Experience with LLM infrastructure

What You'll Do.

Develop RL post-training pipelines

Optimize RL post-training pipelines

Improve model performance

Improve model reasoning

Improve model alignment

Build scalable training systems

Build evaluation systems

Build data generation systems

Collaborate with researchers

Collaborate with engineers

Stay current with advancements

How You'll Work.

Team & Collaboration

Cross-functional teams

Communication Scope

Communication skills

Full Job Description

Huawei Canada has an immediate opening for a 6-12 month internship as an Intern Researcher.

About the team:

The Computing Data Application Acceleration Lab aims to create a leading global data analytics platform organized into three specialized teams using innovative programming technologies. This team focuses on full-stack innovations, including software-hardware co-design and optimizing data efficiency at both the storage and runtime layers. This team also develops next-generation GPU architecture for gaming, cloud rendering, VR/AR, and Metaverse applications. One of the goals of this lab is to enhance algorithm performance and training efficiency across industries, fostering long-term competitiveness.

About the job:

* Develop and optimize RL post-training pipelines for LLMs (e.g., GRPO, reward modeling).
* Conduct experiments to improve model performance, reasoning, and alignment.
* Build scalable training, evaluation, and data generation systems.
* Collaborate with researchers and engineers on cutting-edge LLM projects.
* Stay current with advancements in RL, LLMs, and post-training research.

The total target annual compensation (based on 2,080 hours per year) ranges from $58,000 to $104,000 depending on education, experience, and demonstrated expertise.

## Requirements

About the ideal candidate:

* Enrolled as a Master's or Ph.D. student in Computer Science, AI, or a related field.
* Strong background in machine learning, reinforcement learning, and deep learning. Familiarity with Large Language Models, transformer architectures, and post-training methods.
* Proficiency in Python, PyTorch, and LLM frameworks.
* Hands-on experience with LLMs and RL training algorithms (e.g., GRPO) is an asset.
* Familiarity with RL frameworks, such as VeRL.
* Experience with open-source LLM frameworks such as Hugging Face, DeepSpeed, vLLM, or SGLang is an asset.
* Knowledge of domain-specific languages used with AI accelerators.
* Experience with distributed training frameworks, large-scale
