Company
Technology
Lead Data Engineer with AI experience
“Lead Data Engineer with AI experience. Skills: Data Engineering, AI Infrastructure, LLM Systems, Agentic Systems. Build and optimize batch data pipelines.”
Industry & Context.
Problem-solving mindset; Design scalable systems
What They're Looking For.
Must Have
7+ years of experience in data engineering
2+ years of experience building production AI/ML or LLM-related data infrastructure
Expertise in Python, SQL, PySpark, Snowflake, Delta Lake, Kafka, and Spark Structured Streaming
Hands-on experience with vector databases, embedding pipelines, and retrieval systems in production RAG environments
Solid understanding of MLOps practices
Knowledge of data governance, security, compliance, and data quality frameworks
Experience working with cloud ecosystems such as AWS or Azure
Experience with containerized environments (Docker, Kubernetes)
Nice to Have
Familiarity with AI/LLM tooling such as LangChain, LlamaIndex, OpenAI/Claude/Bedrock APIs, and FastAPI
What You'll Do.
Build batch data pipelines
Optimize batch data pipelines
Maintain batch data pipelines
Build streaming data pipelines
Optimize streaming data pipelines
Maintain streaming data pipelines
Design retrieval systems
Implement retrieval systems
Develop entity mappings
Develop knowledge graphs
Maintain semantic contracts
Maintain metadata systems
Maintain lineage tracking
Support ML lifecycle workflows
Support LLM lifecycle workflows
Build APIs for agents
Build context stores for agents
Build tool interfaces for agents
Implement data governance frameworks
Implement PII handling
Implement schema validation
Implement data quality monitoring
Implement compliance-ready audit logging
How You'll Work.
Team & Collaboration
Global engineering teams; Enterprise-scale AI transformation projects
Process & Methodology
Agile
Full Job Description
## Accountabilities
- Data Pipeline Engineering: Build, optimize, and maintain robust batch and streaming data pipelines using modern cloud-native tools such as Snowflake, PySpark, Delta Lake, and Kafka, ensuring reliability, scalability, and performance.
- RAG & Retrieval Infrastructure: Design and implement end-to-end retrieval systems including embedding pipelines, vector databases, hybrid search, chunking strategies, and ranking mechanisms to optimize AI context relevance.
- Semantic & Knowledge Layer Development: Develop ontologies, entity mappings, and knowledge graphs while maintaining semantic contracts, metadata systems, and lineage tracking for AI and ML use cases.
- ML/LLMOps Enablement: Support ML and LLM lifecycle workflows including dataset curation, feature engineering, model evaluation, experiment tracking, and production monitoring.
- Agentic Data Systems: Build APIs, context stores, and tool interfaces that enable autonomous agents, including observability for reasoning traces, tool calls, and contextual outputs.
- Governance & Data Quality: Implement robust data governance frameworks including RBAC, PII handling, schema validation, data quality monitoring, and compliance-ready audit logging systems.

## Requirements
This role requires a highly experienced data engineering professional with strong cloud, distributed systems, and AI infrastructure expertise. The ideal candidate combines deep technical execution with architectural thinking and hands-on experience building production-grade AI-enabled data systems.
- 7+ years of experience in data engineering with strong exposure to cloud-based data platforms.
- 2+ years of experience building production AI/ML or LLM-related data infrastructure at scale.
- Strong expertise in Python, SQL, PySpark, Snowflake, Delta Lake, Kafka, and Spark Structured Streaming.
- Hands-on experience with vector databases, embedding pipelines, and retrieval systems in production RAG environments.
- Solid understanding of MLOps practices including M
How to Apply on Lever
- Lever uses a streamlined one-page form — apply in under 5 minutes.
- LinkedIn import works well; review parsed data before submitting.
- The cover letter field is optional but visible to reviewers — use it to differentiate.
- Referral codes from employees can significantly boost visibility of your application.