itD Tech

Data Engineer III

Menlo Park, California, United States · CONTRACT · Remote Friendly
Market Sentiment
HIGH DEMAND

Neural analysis suggests this role is optimal for Senior candidates.

The Brief

“Data Engineer III at itD Tech. Skills: Data Engineering, ML Engineering, AI Data Infrastructure, Image Generation Models. Design and build AI-augmented data pipelines.”

Industry & Context.

Problems you'll solve

Failure handling; Troubleshooting

What They're Looking For.

Must Have

5+ years of experience in Data Engineering, ML Engineering, or hybrid role; Software engineering fundamentals; Python; Data structures; Concurrency; Asynchronous programming; Advanced SQL expertise; Complex query development; Query optimization; Large-scale data processing; Pipeline orchestration frameworks; Integrating machine learning models into production data pipelines; Inference endpoint management; Model versioning; Batching; Failure recovery; Building and operating production-scale data pipelines; Invoking machine learning models at scale; AI-assisted coding tools; Written communication skills; Verbal communication skills; Collaborate effectively across technical and business teams; Bachelor’s degree or higher in Computer Science, Data Engineering, Machine Learning, or related STEM field

Nice to Have

Experience generating, storing, indexing, and querying vector embeddings; Familiarity with content understanding models; Image classification; Object detection; OCR; NSFW detection; Aesthetic scoring systems; Leveraging LLMs for data annotation, data cleaning, evaluation, or prompt engineering workflows; Knowledge of generative AI technologies; Diffusion models; Image generation systems; Evaluation metrics; FID; CLIP Score; Previous experience at leading AI-focused technology companies; Experience supporting large-scale image generation or multimodal AI initiatives

What You'll Do.

Design AI-augmented data pipelines

Build AI-augmented data pipelines

Maintain AI-augmented data pipelines

Combine data transformations with ML model inference

Develop systems for remote model inference orchestration

Optimize systems for remote model inference orchestration

Build scalable embedding generation pipelines

Build scalable embedding storage pipelines

Build scalable embedding indexing pipelines

Build scalable embedding retrieval pipelines

Curate large-scale image datasets

Manage large-scale image datasets

Design LLM-assisted annotation workflows

Operate LLM-assisted annotation workflows

Automate data labeling

Measure annotation quality

Improve annotation quality

Develop reusable frameworks

Develop pipeline components

Partner with engineers

Partner with researchers

Partner with stakeholders

Support image generation model development

Support image generation model evaluation

How You'll Work.

Team & Collaboration

Engineers; Researchers; Cross-functional stakeholders; Technical teams; Business teams

Communication Scope

Written communication; Verbal communication

Full Job Description

Data Engineer III

itD is seeking a Senior AI Data Engineer III to build and scale AI-augmented data infrastructure that powers next-generation image generation models. This role sits at the intersection of Data Engineering and Machine Learning Systems, driving the development of large-scale data curation, annotation, and evaluation pipelines that improve model quality across visual quality, prompt adherence, identity preservation, naturalness, and visual text generation. The ideal candidate will bring deep expertise in AI-focused data engineering and a proven track record of building production-scale pipelines that integrate machine learning inference into data workflows.

Location: Hybrid Onsite – Menlo Park, CA (required onsite collaboration with engineers and researchers)

Pay Rate: $35 - $39 per hour, depending on experience.

Duration: 5+ months

We provide comprehensive medical benefits, a 401k plan, paid holidays, and more. Please note that we are only considering direct W2 candidates at this time, as we are unable to offer sponsorship.

Responsibilities

  • Design, build, and maintain AI-augmented data pipelines that combine traditional data transformations with machine learning model inference at billion-row scale.
  • Develop and optimize systems for remote model inference orchestration, including batching, asynchronous execution, retry logic, throughput management, and graceful failure handling.
  • Build and maintain scalable embedding generation, storage, indexing, and retrieval pipelines to support AI model training and evaluation.
  • Curate and manage large-scale image datasets using SQL and model-derived signals, ensuring data quality, governance, compliance, and operational efficiency.
  • Design and operate LLM-assisted annotation workflows that automate data labeling while measuring and improving annotation quality.
  • Develop reusable frameworks, tooling, and pipeline components that enable broader engineering teams to efficiently build AI-powered data workflows.
  • Partner c

How to Apply on Greenhouse

  • Create a Greenhouse profile before applying — it saves time across multiple applications.
  • Upload your resume as a PDF; the parser handles it better than Word.
  • Answer all knockout questions carefully — wrong answers auto-reject before a human sees you.
  • Enable email notifications to track application status in real time.
