Company
Technology
Sr. AI Data Engineer
“Sr. AI Data Engineer. Skills: AI data engineering, data pipelines, ML model inference, generative AI. Design and maintain AI-augmented data pipelines.”
What They're Looking For.
Must Have
Bachelor's degree or higher
5+ years of experience
Expertise in SQL
Expertise in data pipeline orchestration tools
Expertise in large-scale distributed systems
Hands-on experience integrating ML models
Programming and debugging skills
Nice to Have
Experience with embeddings
Experience with vector databases
Experience with similarity search systems
Familiarity with content understanding models
Exposure to LLM-based workflows
Knowledge of generative AI concepts
What You'll Do.
Design AI-augmented data pipelines
Maintain AI-augmented data pipelines
Own remote inference orchestration
Build embedding pipelines
Manage embedding pipelines
Curate training datasets
Govern training datasets
Develop automated annotation systems
Contribute to shared engineering frameworks
Contribute to reusable tooling
Ensure pipeline reliability
Ensure pipeline compliance
Collaborate with ML researchers
Collaborate with ML engineers
How You'll Work.
Team & Collaboration
Cross-functional technical environments
Full Job Description
## Accountabilities

- Design and maintain large-scale, AI-augmented data pipelines that combine SQL transformations with ML model invocations for data cleaning, labeling, and enrichment.
- Own end-to-end remote inference orchestration, including batching, asynchronous execution, retry logic, failure handling, and performance optimization.
- Build and manage scalable embedding pipelines, including vector generation, storage, indexing, and similarity search infrastructure.
- Curate and govern large-scale training datasets for image generation models using model-driven signals such as classifiers, aesthetic scoring, and content filters.
- Develop automated annotation systems using LLMs and vision models, including evaluation frameworks to measure annotation quality and model performance.
- Contribute to shared engineering frameworks and reusable tooling for AI-driven data workflows and pipeline orchestration.
- Ensure pipeline reliability, compliance, and data quality across billions of records in distributed production systems.
- Collaborate with ML researchers and engineers to improve dataset quality, evaluation metrics, and generative model performance.

## Requirements

- Bachelor’s degree or higher in Computer Science, Data Engineering, Machine Learning, or a related STEM field.
- 5+ years of experience in data engineering, ML engineering, or hybrid roles involving data pipelines and model inference systems.
- Strong expertise in SQL, data pipeline orchestration tools (e.g., Airflow, Dataswarm), and large-scale distributed systems.
- Hands-on experience integrating ML models into production pipelines, including inference APIs, batching, and failure handling.
- Experience with AI-assisted development tools (e.g., Copilot, Cursor, Codex) to accelerate engineering workflows.
- Strong programming and debugging skills with a focus on scalable data systems and production reliability.
- Experience with embeddings, vector databases, or similarity search systems (e.g., FAISS, Milvus) is highly desirable.

F
How to Apply on Lever
- Lever uses a streamlined one-page form — apply in under 5 minutes.
- LinkedIn import works well; review parsed data before submitting.
- The cover letter field is optional but visible to reviewers — use it to differentiate.
- Referral codes from employees can significantly boost visibility of your application.