Reducto

AI

Machine Learning Infra Engineer

$150–300k San Francisco, California, United States; New York City, New York, United States FULL TIME
Market Sentiment
HIGH DEMAND

Neural analysis suggests this role is optimal for mid-level candidates.

The Brief

“Machine Learning Infra Engineer at Reducto. Skills: Machine Learning Infra Engineering, training and inference stack development, distributed systems, scalability, observability. Build and maintain our training and inference stack with an emphasis on fast iteration on training + flexibility for exploring new methods and high performance in inference. Develop benchmarks for both sets of stacks to identify bottlenecks.”

What You'll Achieve.

Deliver results at scale; define what high-performance ML training and inference look like at Reducto; help ML engineers move faster from experiment to production

Industry & Context.

AI
Problems you'll solve

Enjoy solving complex problems; building from first principles; getting your hands dirty with real-world implementation challenges

Eligibility Requirements

This is an in-person role at our office in SF, with an emphasis on working hard and moving quickly

What They're Looking For.

Must Have

Python skills, background in systems engineering, Kubernetes, distributed training frameworks

Nice to Have

Experience at an early-stage or high-growth startup; meaningful development work on open-source training/inference stacks; excitement about setting up distributed inference across 100s-1000s of GPUs; combining technical excellence with business impact

What You'll Do.

Build and maintain our training and inference stack with an emphasis on fast iteration on training + flexibility for exploring new methods and high performance in inference

Develop benchmarks for both sets of stacks to identify bottlenecks

Explore SOTA advances in training and inference and work to apply them

Design systems for scaling model training across multi-node, multi-GPU environments with strong reliability and observability

Scale distributed training and inference workloads across large GPU clusters while improving utilization, reliability, and cost efficiency

Build the tooling, abstractions, and observability that help ML engineers move faster from experiment to production

How You'll Work.

Team & Collaboration

Collaborate closely with our ML and Platform teams; collaborate effectively across technical and non-technical teams

Process & Methodology

Take full ownership from strategy through execution

Full Job Description

ABOUT REDUCTO

Reducto helps AI teams ingest real-world enterprise data with state-of-the-art accuracy. The vast majority of enterprise data, from financial statements to health records, is locked in unstructured file formats like PDFs and spreadsheets. We train vision models to read those documents the way a human would, and make it possible to build products, train models, and automate processes at scale.

We’ve grown incredibly quickly, growing revenue by 7x YOY, and now work with hundreds of companies ranging from leading AI teams (Harvey, Vanta, Scale) through to enterprise (FAANG, top 3 trading firm). We’ve raised over 100M from world-class investors like A16z, Benchmark, and First Round Capital, and are hiring a Machine Learning Engineer to help us train and deploy the models critical to the performance of our core product.

THE OPPORTUNITY

As an ML Infra Engineer, you’ll play a key role in building the inference and training frameworks that make it possible to deliver results at scale. You’ll collaborate closely with our ML and Platform teams to scale training across nodes, develop faster and more efficient serving, and create observability across the stack. This is a high-impact role where you’ll help define what high-performance ML training and inference look like at Reducto.

WHAT YOU’LL DO

- Build and maintain our training and inference stack with an emphasis on fast iteration on training + flexibility for exploring new methods and high performance in inference.
- Develop benchmarks for both sets of stacks to identify bottlenecks.
- Explore SOTA advances in training and inference and work to apply them.
- Design systems for scaling model training across multi-node, multi-GPU environments with strong reliability and observability.
- Scale distributed training and inference workloads across large GPU clusters while improving utilization, reliability, and cost efficiency.
- Build the tooling, abstractions, and observability that help ML engineers move faster from experiment to production.

How to Apply on Ashby

  • Ashby is a fast, modern ATS; most applications take under 3 minutes.
  • The resume parser is strong; verify parsed experience dates and job titles.
  • Custom screening questions are often scored algorithmically — answer completely.
  • Location field affects geo-based screening; use your actual metro area.
