Reka

Technology

Member of Technical Staff (Data Intelligence)

$170–250k (AI est.) · United States; United Kingdom; Singapore · FULL TIME · Remote Friendly

Market Sentiment

HIGH DEMAND

Neural analysis suggests this role is optimal for Mid+ candidates.

The Brief

“Member of Technical Staff (Data Intelligence) at Reka. Skills: Data Intelligence, ML fundamentals, Large-scale systems. Define data quality metrics. Define validation checks”

Industry & Context.

Technology

Problems you'll solve

Run analyses; Dig into data

What They're Looking For.

Must Have

ML and deep learning fundamentals, Experience building large-scale systems, Experience operating large-scale systems, Solid Python skills

Nice to Have

Experience with large video datasets, Experience with dataset curation, Experience building internal tooling

What You'll Do.

Define data quality metrics

Define validation checks

Define acceptance thresholds

Explore open source datasets

Create internal datasets

Build algorithms for data quality

Build algorithms for data domain mixtures

Build algorithms for domain adaptation

Own CI/CD for data stack

Own development tooling for data stack

Automate repetitive workflows

Track compute utilization

How You'll Work.

Team & Collaboration

Model researchers; Data infrastructure engineers; Cross-functional partners

Full Job Description

In this role, you’ll work closely with model researchers, data infrastructure engineers, and cross-functional partners to make sure our data is high quality and can be produced at petabyte scale in a reliable, efficient way. From understanding how data choices show up in model behavior, to building processing pipelines and running the compute behind them, you’ll help ensure our models are trained on the best data we can get.

WHAT YOU’LL DO

- Work with model researchers to define what “good data” means for our models, including quality metrics, validation checks, and acceptance thresholds
- Explore open source datasets and create internal ones most suitable to build fundamental World Models
- Build algorithms for automated data quality assessment, data domain mixtures, and domain adaptation from synthetic to real data
- Track datasets, metadata, provenance, and versions so experiments are reproducible and it’s clear what data went into which training and evaluation runs
- Own CI/CD and development tooling for the data stack (GitHub, Python, PyTorch), and automate repetitive workflows to reduce friction
- Track and optimize throughput, storage, and compute utilization across pipelines and related assets

WHAT WE’RE LOOKING FOR

- Strong ML and deep learning fundamentals with experience building and operating large-scale data and/or compute systems
- Comfortable moving between research questions and production engineering: you can dig into data, run analyses, and also ship reliable systems
- Demonstrated research experience with data compositions, quality, and dataset releases
- Ability to design and execute experiments with convincing unbiased outcomes
- Practical experience with distributed processing and orchestration (Spark, Ray, Airflow, or equivalents)
- Solid Python skills, and familiarity with the tooling around modern model training workflows (datasets, checkpoints, experiment tracking)
- Strong instincts around data quality: how to measure it, how to monitor i


How to Apply on Ashby

Ashby is a fast, modern ATS; most applications take under 3 minutes.
The resume parser is strong; verify parsed experience dates and job titles.
Custom screening questions are often scored algorithmically — answer completely.
Location field affects geo-based screening; use your actual metro area.
