Mach9

Engineering

Data Engineer

$135–185k (AI estimate) · San Francisco, California, United States · Full time
Market Sentiment
HIGH DEMAND

Neural analysis suggests this role is optimal for Mid+ candidates.

The Brief

“Data Engineer at Mach9. Skills: data engineering, geospatial data, ML pipelines. Develop and maintain workflows.”

Industry & Context.

Engineering

Problems you'll solve

Problem-solving; Debugging; Puzzle-hunting

Eligibility Requirements

Occasional travel

What They're Looking For.

Must Have

  • Hands-on experience building production systems in Python
  • Solid foundation in distributed systems and parallel computing
  • Comfort operating with ambiguity
  • Experience building agentic systems and setting up agent harnesses
  • Communication and collaboration skills

Nice to Have

  • Experience building agentic systems, setting up agent harnesses, and orchestrating LLM-driven workflows
  • Understanding of geospatial data formats
  • Experience with LAS/LAZ, COPC, E57, GeoTIFF, and Shapefiles
  • Experience with GDAL, PDAL, untwine, and laz-perf
  • Expertise designing and managing data schemas and storage systems
  • Experience with Postgres/PostGIS and AWS S3
  • Experience with large-scale data processing frameworks and cloud platforms
  • Experience with Spark and AWS Batch
  • Familiarity with coordinate reference systems, CRS transforms, WKT, pyproj, and affine transforms
  • Experience building data versioning, data lineage, and artifact-tracking systems
  • Experience operating data pipelines
  • Familiarity with C++

What You'll Do.

Build CI/CD pipelines

Optimize processing performance

Optimize storage efficiency

Resolve customer issues

Unblock customer projects

Build an agentic harness

Automate dataset triage

Automate code patching

Facilitate data integration

Work with data formats

How You'll Work.

Team & Collaboration

ML teams; Product teams; Customer success team; Customers; Data-provider partners

Full Job Description

THE ROLE

We're seeking a Data Engineer to transform large-scale geospatial datasets into structured, reliable, and accessible formats that power Mach9's ML and product pipelines. You'll work with high-volume data sources (laser scan point clouds, imagery, and a long tail of geospatial formats) and own the systems that get them ingested, standardized, stored, and made available for training, perception, and production use in a consistent and efficient way. This role sits at the front of everything we do: our models are only as good as the data feeding them, and you'll be the one making that data trustworthy at scale.

RESPONSIBILITIES

- Develop and maintain scalable, reproducible workflows for ingesting and processing large volumes of point cloud, imagery, and geospatial data.
- Convert datasets from various sensor providers into Mach9's standardized internal formats.
- Build CI/CD pipelines and automated checks that guarantee the correctness and consistency of data pipelines, including regression detection on dataset processing.
- Optimize processing performance, query speed, and storage efficiency across large geospatial datasets.
- Work closely with the customer success team to efficiently resolve issues and unblock customer projects.
- Build and maintain an agentic harness for automated dataset triage and code patching that automatically proposes or applies fixes and escalates when human judgment is needed.
- Work closely with ML and product teams to make data readily usable for training, inference, and visualization.
- Work closely with customers and data-provider partners to facilitate data integration (with occasional travel).
- Puzzle-hunting: work with data formats that have sparse or missing documentation.

REQUIREMENTS

- Strong software development, problem-solving, and debugging skills, with hands-on experience building production systems in Python.
- Solid foundation in distributed systems and parallel computing.
- Comfort operating with ambiguity: able to dig into un


How to Apply on Ashby

  • Ashby is a fast modern ATS — most applications take under 3 minutes.
  • The resume parser is strong; verify parsed experience dates and job titles.
  • Custom screening questions are often scored algorithmically — answer completely.
  • Location field affects geo-based screening; use your actual metro area.
