Cantina Labs

social AI

Research Scientist

Singapore, Singapore FULL TIME

Market Sentiment

HIGH DEMAND

Neural analysis suggests this role is optimal for mid-level candidates.

The Brief

“Research Scientist at Cantina Labs. Skills: foundational research on video generation models, post-training research, large-scale data systems or pipelines for machine learning workflows, distributed data pipelines, workflow orchestration, containerized pipeline infrastructure, cloud-based data storage and compute optimization, deduplication workflows, distillation methods for large-scale diffusion and flow-based video generation models, reward modeling and preference-based fine-tuning pipelines”

What You'll Achieve.

Translate research findings into durable model improvements; preserve or improve generation quality while reducing inference cost; align video generation quality with human judgments across dimensions such as aesthetics, motion quality, and prompt adherence; inform pretraining decisions accordingly

Industry & Context.

social AI

Problems you'll solve

Analyze the relationship between base model behavior and post-training outcomes

What They're Looking For.

Must Have

Hands-on experience building or scaling large-scale data systems or pipelines for machine learning workflows

Experience with distributed data processing frameworks such as PySpark or Ray

Experience with orchestration tools such as Airflow or equivalent

Familiarity with containerization and container orchestration, including Docker and Kubernetes

Experience working with cloud-based data storage and compute (AWS, GCS, and/or Azure)

Familiarity with video and media processing tools such as FFmpeg, PyAV, DALI, or OpenCV

Familiarity with multimodal or media data, including video, image, text, and audio

Research background in post-training methods for large-scale diffusion or flow-based generative models

Deep hands-on experience in distillation across both inference efficiency and quality preservation

Experience with reward modeling or preference-based fine-tuning for generative models, including RLHF, DPO, or equivalent alignment approaches

Solid understanding of the interplay between pretraining and post-training, and how base model properties affect distillation and fine-tuning outcomes

Proficiency in Python

Proficiency in modern machine learning frameworks

Nice to Have

Publications at top-tier venues (NeurIPS, ICML, ICLR, CVPR, ICCV, ECCV)

PyTorch or JAX as the machine learning framework

What You'll Do.

Drive foundational research on video generation models, taking ownership across the full research cycle and driving post-training research

Collaborate closely with data, infrastructure, and adjacent modeling teams to translate research findings into durable model improvements

Build and maintain scalable systems for ingesting, preprocessing, and delivering large-scale video data for model training

Design and scale distributed data pipelines for preprocessing, dataset generation, and repeated dataset refreshes

Own workflow orchestration, job scheduling, monitoring, and failure recovery for large-scale data processing jobs

Implement and maintain containerized pipeline infrastructure using Kubernetes or equivalent orchestration systems

Optimize cloud-based data storage and movement across providers (AWS, GCS, or Azure) for cost, throughput, and operational efficiency

Define and implement best practices for dataset storage layout, versioning, caching, retention, and access patterns

Build tooling to support deduplication workflows at scale, including near-dedup pipelines over large video corpora

Research and develop distillation methods for large-scale diffusion and flow-based video generation models, including guidance distillation and adversarial distillation, with a focus on preserving or improving generation quality while reducing inference cost

Develop reward models and preference-based fine-tuning pipelines that align video generation quality with human judgments across dimensions such as aesthetics, motion quality, and prompt adherence

Analyze the relationship between base model behavior and post-training outcomes, and work with the foundation model team to inform pretraining decisions accordingly

How You'll Work.

Team & Collaboration

collaborate closely with data, infrastructure, and adjacent modeling teams

Process & Methodology

Track record of independent research, with the ability to drive projects from initial idea through experimental validation

Full Job Description

About Cantina: Cantina Labs is a social AI company, developing a suite of advanced real-time models that push the boundaries of expression, personality, and realism. We bring characters to life, transforming how people tell stories, connect, and create. We build and power ecosystems. Cantina, our flagship social AI platform, is just the beginning.

About the Role: Cantina is expanding, and we're looking for a Research Scientist to join our growing Singapore team! In this role, you will drive foundational research on video generation models, taking ownership across the full research cycle and driving post-training research. Furthermore, you'll collaborate closely with data, infrastructure, and adjacent modeling teams to translate research findings into durable model improvements.

What You’ll Do:

- Build and maintain scalable systems for ingesting, preprocessing, and delivering large-scale video data for model training
- Design and scale distributed data pipelines for preprocessing, dataset generation, and repeated dataset refreshes
- Own workflow orchestration, job scheduling, monitoring, and failure recovery for large-scale data processing jobs
- Implement and maintain containerized pipeline infrastructure using Kubernetes or equivalent orchestration systems
- Optimize cloud-based data storage and movement across providers (AWS, GCS, or Azure) for cost, throughput, and operational efficiency
- Define and implement best practices for dataset storage layout, versioning, caching, retention, and access patterns
- Build tooling to support deduplication workflows at scale, including near-dedup pipelines over large video corpora
- Research and develop distillation methods for large-scale diffusion and flow-based video generation models, including guidance distillation and adversarial distillation, with a focus on preserving or improving generation quality while reducing inference cost
- Develop reward models and preference-based fine-tuning pipelines that align video genera

SKILL SIGNAL 51 detected · ranked by frequency

workflow orchestration ×3

deduplication workflows ×3

building or scaling large-scale data systems or pipelines for machine learning workflows ×3

designing and scaling distributed data pipelines for preprocessing, dataset generation, and repeated dataset refreshes ×3

implementing and maintaining containerized pipeline infrastructure ×3

optimizing cloud-based data storage and movement ×3

building tooling to support deduplication workflows at scale ×3

research and develop distillation methods for large-scale diffusion and flow-based video generation models ×3

develop reward models and preference-based fine-tuning pipelines ×3

foundational research on video generation models ×2

post-training research ×2

large-scale data systems or pipelines for machine learning workflows ×2

distributed data pipelines ×2

containerized pipeline infrastructure ×2

cloud-based data storage and compute optimization ×2

distillation methods for large-scale diffusion and flow-based video generation models ×2

reward modeling and preference-based fine-tuning pipelines ×2

PySpark ×2

Ray ×2

Airflow ×2

Docker ×2

Kubernetes ×2

FFmpeg ×2

PyAV ×2

DALI ×2

OpenCV ×2

PyTorch ×2

JAX ×2

AWS

GCS

Azure

Python

BEHAVIOURAL

collaboration

independent research

ability to drive projects from initial idea through experimental validation

Role Details

Experience 2–5 yrs

Level Mid

Type FULL TIME

Category research

AI-Extracted Insights

Domain Areas

social-ai, video-generation-models, diffusion-and-flow-based-video-generation-models, multimodal-or-media-data (video, image, text, and audio)

How to Apply on Ashby

Ashby is a fast modern ATS — most applications take under 3 minutes.
The resume parser is strong; verify parsed experience dates and job titles.
Custom screening questions are often scored algorithmically — answer completely.
Location field affects geo-based screening; use your actual metro area.
