Company

Research

Senior Research Engineer

$175–250k San Francisco, California, United States FULL TIME

Market Sentiment

HIGH DEMAND

Neural analysis suggests this role is optimal for Senior candidates.

The Brief

“Senior Research Engineer. Skills: memory features lifecycle, model fine-tuning, research implementation, evaluation at scale, customer pain point analysis, productionization. Own the end-to-end lifecycle of memory features—from research to production. Fine-tune models for extraction, updates, consolidation/forgetting, and conflict resolution”

What You'll Achieve.

Ship with Engineering to SOTA latency, reliability, and cost; Continuously improve quality; Productionize what wins; Validate solutions through field trials; Maintain SOTA latency, reliability, and cost at scale

Industry & Context.

Research

Problems you'll solve

Turn customer pain points into research hypotheses; implement and benchmark ideas from papers

What They're Looking For.

Must Have

Experience in RAG or information retrieval (retrieval, ranking, query understanding) for real products; Model training/fine-tuning experience (LLMs/encoders) with a strong footing in experimental design and iteration; Strong Python, deep experience with PyTorch, and familiarity with vLLM and modern serving frameworks; Built evaluation for complex vision-and-language tasks (gold sets, offline metrics, online tests); Able to orchestrate data pipelines to run these models in production with low-latency SLAs (batch + streaming); Clear, concise communication with stakeholders (engineering, product, GTM, and customers)

Nice to Have

Publications at venues like CVPR, NeurIPS, ICML, ACL, etc.; Experience with privacy-preserving ML (redaction, differential privacy, data governance); Deep familiarity with memory/retrieval literature or prior work on memory systems; Expertise with embeddings, vector-DB internals, deduplication, and contradiction detection

What You'll Do.

Own the end-to-end lifecycle of memory features—from research to production

Fine-tune models for extraction, updates, consolidation/forgetting, and conflict resolution

Turn customer pain points into research hypotheses; implement and benchmark ideas from papers

Ship with Engineering to SOTA latency, reliability, and cost

Build evaluation at scale (offline metrics + online A/Bs)

Close the loop with real-world feedback to continuously improve quality

Read, reproduce, and implement research

Quickly prototype paper ideas

Benchmark against baselines

Productionize what wins

Work closely with customers to uncover pain points

Turn pain points into research hypotheses

Validate solutions through field trials

Design APIs and data contracts

Maintain SOTA latency, reliability, and cost at scale

How You'll Work.

Team & Collaboration

Partner with Engineering to ship; Work closely with customers; Clear, concise communication with stakeholders (engineering, product, GTM, and customers)

Communication Scope

Clear, concise communication with stakeholders (engineering, product, GTM, and customers)

Full Job Description

Role Summary: Own the end-to-end lifecycle of memory features, from research to production. You’ll fine-tune models for extraction, updates, consolidation/forgetting, and conflict resolution; turn customer pain points into research hypotheses; implement and benchmark ideas from papers; and ship with Engineering to SOTA latency, reliability, and cost. You’ll also build evaluation at scale (offline metrics + online A/Bs) and close the loop with real-world feedback to continuously improve quality.

What You'll Do:

- Fine-tune and train models for memory extraction, updates, consolidation/forgetting, and conflict resolution; iterate based on data and outcomes.
- Read, reproduce, and implement research: quickly prototype paper ideas, benchmark against baselines, and productionize what wins.
- Build evaluation at scale: automated relevance/accuracy/consistency metrics, gold sets, online A/B & interleaving, and clear dashboards.
- Work closely with customers to uncover pain points, turn them into research hypotheses, and validate solutions through field trials.
- Partner with Engineering to ship: design APIs and data contracts, plan safe rollouts, and maintain SOTA latency, reliability, and cost at scale.

Minimum Qualifications

- Experience in RAG or information retrieval (retrieval, ranking, query understanding) for real products.
- Model training/fine-tuning experience (LLMs/encoders) with a strong footing in experimental design and iteration.
- Strong Python; deep experience with PyTorch and familiarity with vLLM and modern serving frameworks.
- Built evaluation for complex vision-and-language tasks (gold sets, offline metrics, online tests).
- Able to orchestrate data pipelines to run these models in production with low-latency SLAs (batch + streaming).
- Clear, concise communication with stakeholders (engineering, product, GTM, and customers).

Nice to Have:

- Publications at venues like CVPR, NeurIPS, ICML, ACL, etc.
- Experience with privacy-preserving ML (redaction, d

SKILL SIGNAL 39 detected · ranked by frequency

model training ×3

fine-tuning ×3

prototyping ×3

benchmarking ×3

productionizing ×3

evaluation ×3

data pipelines orchestration ×3

API design ×3

data contract design ×3

safe rollouts ×3

privacy-preserving ML ×3

redaction ×3

differential privacy ×3

memory features lifecycle ×2

model fine-tuning ×2

research implementation ×2

evaluation at scale ×2

customer pain point analysis ×2

productionization ×2

PyTorch ×2

vLLM ×2

modern serving frameworks ×2

RAG

information retrieval

retrieval

ranking

query understanding

LLMs

encoders

vision-and-language tasks

data pipelines

embeddings

BEHAVIOURAL

communication

Role Details

Experience 5–10 yrs

Level Senior

Type FULL TIME

Category research

Salary Band 150k-200k

AI-Extracted Insights

Domain Areas

memory-systems

memory-retrieval-literature

How to Apply on Ashby

Ashby is a fast modern ATS — most applications take under 3 minutes.
The resume parser is strong; verify parsed experience dates and job titles.
Custom screening questions are often scored algorithmically — answer completely.
Location field affects geo-based screening; use your actual metro area.
