Scaled Cognition

Engineering

AI QA Engineer (Multilingual)

Mountain View, California, United States · FULL TIME · Remote Friendly

Market Sentiment

HIGH DEMAND

Neural analysis suggests this role is optimal for Mid+ candidates.

The Brief

“AI QA Engineer (Multilingual) at Scaled Cognition. Skills: AI QA, LLM training data quality, evaluation sets, data inspection, automation, cross-lingual datasets. Meticulously inspect, review, and grade LLM training data, evaluation test cases, and model outputs to ensure maximum quality and accuracy. Maintain local development environments to run test pipelines, investigate edge cases, and submit PRs via Git/GitHub to update our training repositories”

What You'll Achieve.

Ensure our LLM training data and evaluation sets are flawless, and take ownership of the data quality that directly drives model performance.

Industry & Context.

Engineering

Problems you'll solve

Act as a technical data detective, diving deep into training data to spot error cases.

What They're Looking For.

Must Have

Technical background with hands-on coding experience (Python preferred)

Proficiency with Git/GitHub

Fluency in English and native or near-native proficiency in at least one other language

Deep understanding of Large Language Models, their failure modes (hallucinations, formatting errors), and effective prompting techniques

Proven experience in Quality Assurance, Data Quality, or Data Engineering, with a track record of auditing and maintaining large datasets

Exceptional written communication skills across multiple languages

What You'll Do.

Meticulously inspect, review, and grade LLM training data, evaluation test cases, and model outputs to ensure maximum quality and accuracy

Maintain local development environments to run test pipelines, investigate edge cases, and submit PRs via Git/GitHub to update our training repositories

Act as a technical data detective, diving deep into training data to spot error cases

Leverage LLMs as internal tools to translate, verify, and maintain our cross-lingual datasets

Collaborate closely with the engineering team to refine our evaluation criteria and improve our data pipelines

How You'll Work.

Team & Collaboration

Collaborate closely with the engineering team to refine our evaluation criteria and improve our data pipelines

Communication Scope

Exceptional written communication skills across multiple languages

Full Job Description

AI QA Engineer (Multilingual)

As an AI QA Engineer (Multilingual) at Scaled Cognition, you will be the final line of defense for our model's quality. You'll sit at the critical intersection of data engineering, quality assurance, and linguistics, ensuring our LLM training data and evaluation sets are flawless. You'll be getting your hands dirty, meticulously inspecting data, and making direct code contributions to fix issues. If you love the idea of turning messy, imperfect data into gold and have the technical chops to automate parts of that cleanup, you will thrive here.

What you'll do:
- Meticulously inspect, review, and grade LLM training data, evaluation test cases, and model outputs to ensure maximum quality and accuracy.
- Maintain local development environments to run test pipelines, investigate edge cases, and submit PRs via Git/GitHub to update our training repositories.
- Act as a technical data detective, diving deep into training data to spot error cases.
- Leverage LLMs as internal tools to translate, verify, and maintain our cross-lingual datasets.
- Collaborate closely with the engineering team to refine our evaluation criteria and improve our data pipelines.

You might be the right person for the job if you:
- Have an obsessive attention to detail and get a dopamine hit from finding the one edge case or bad translation that broke a prompt.
- Are a builder who doesn't mind the weeds. You understand that high-quality AI is built on rigorous, sometimes repetitive data inspection, and you embrace that reality.
- Are technically self-sufficient. You’re comfortable navigating a terminal, running Python scripts locally, and managing your own version control.
- Love languages and understand the linguistic nuances required for high-quality translation and cross-lingual model evaluation.
- Thrive in a fast-paced environment where you can take ownership of the data quality that directly drives model performance.

Key Qualifications:
- Strong technical background

SKILL SIGNAL 24 detected · ranked by frequency

coding experience ×3

running Python scripts locally ×3

managing version control ×3

translation ×3

verification ×3

prompting techniques ×3

AI QA ×2

LLM training data quality ×2

evaluation sets ×2

data inspection ×2

automation ×2

cross-lingual datasets ×2

Git ×2

GitHub ×2

Python

LLMs

Quality Assurance

Data Quality

Data Engineering

auditing and maintaining large datasets

linguistic nuances

cross-lingual model evaluation

terminal

Python scripts

BEHAVIOURAL

Obsessive attention to detail · Technically self-sufficient · Love languages · Thrive in a fast-paced environment · Take ownership

Role Details

Type FULL TIME

Category engineering

AI-Extracted Insights

Domain Areas

Large language models · LLM failure modes (hallucinations, formatting errors) · Effective prompting techniques · Linguistic nuances required for high-quality translation and cross-lingual model evaluation

How to Apply on Ashby

Ashby is a fast modern ATS — most applications take under 3 minutes.
The resume parser is strong; verify parsed experience dates and job titles.
Custom screening questions are often scored algorithmically — answer completely.
Location field affects geo-based screening; use your actual metro area.
