Software Mind
Technology
Senior Data Engineer (AI Ingestion Platform)
“Senior Data Engineer (AI Ingestion Platform) at Software Mind. Skills: Data Engineering, AI Ingestion, ETL/ELT, Vector Databases. Build historical email ingestion pipeline. Implement SharePoint document ingestion pipeline”
Industry & Context.
What They're Looking For.
Must Have
6+ years of data pipeline experience, 6+ years of ETL/ELT experience, Proficiency in Python, Experience with Microsoft Graph API, Experience with AWS data services, Familiarity with PII detection, Familiarity with data minimisation, Experience with vector store indexing, Experience with semantic search pipelines
Nice to Have
Prior experience building ingestion pipelines for AI/ML, Prior experience building ingestion pipelines for NLP, Prior experience building ingestion pipelines for LLM-based platforms, OCR tooling experience, Understanding of per-tenant data isolation, Understanding of tenant-scoped encryption, Understanding of row-level security, Familiarity with LangChain document loaders, Familiarity with LangChain embedding pipelines, Familiarity with LangChain vector index management
What You'll Do.
Build historical email ingestion pipeline
Implement SharePoint document ingestion pipeline
Implement OneDrive document ingestion pipeline
Design PII minimisation pre-processing layer
Implement PII minimisation pre-processing layer
Build vector store indexing workflow
Define data processing schema
Implement data processing schema
Build OCR routing orchestrator
Integrate OCR service
Implement raw text extraction layer
Implement content extraction layer
Define push ingestion strategy
Define pull ingestion strategy
Prototype ingestion pipeline
Ensure data lineage is built in
Ensure audit traceability is built in
How You'll Work.
Team & Collaboration
Cross-functional team
Full Job Description
We are Software Mind, an awesome team of engineers who are ready to ramp up any top-notch company’s projects! Our aim? To always be one step ahead. Become part of a multicultural company in constant growth with an excellent work environment certified by Great Place To Work!
About the Project
Software Mind is building a private, tenant-isolated AI assistant for the real estate title and settlement industry. The platform is a retrieval-first (RAG) system that ingests historical email, documents, and structured metadata into a per-tenant vector index, and serves grounded, cited, expert-weighted answers through a chat-style Q&A interface with single sign-on and full audit logging. The platform is AWS-native with a Python/FastAPI backend, Vue.js frontend, OpenSearch/Pinecone vector store, and OpenAI/Anthropic/Bedrock as LLM provider.
You will join a senior, cross-functional LATAM-based team where hands-on AI delivery experience, not just familiarity, is the baseline expectation. You own the ingestion and processing backbone of the platform: the pipelines that transform raw email and document corpora into clean, PII-minimised, chunked, and indexed data in the per-tenant vector store. This is the foundational layer the AI extraction gateway depends on; quality here directly determines system accuracy.
Your Responsibilities
* Build and own the historical email ingestion pipeline via Microsoft Graph API
* Implement SharePoint / OneDrive document ingestion pipeline with scoped folder access
* Design and implement the PII minimisation pre-processing layer
* Build the vector store indexing workflow (OpenSearch/Pinecone) with per-tenant data isolation
* Define and implement the data processing schema; produce and maintain schema documentation
* Build the OCR routing orchestrator and integrate OCR service for scanned documents
* Implement the raw text / content extraction layer for all supported document types
* Define and prototype push vs. pull ingestion strategy, from one-time Po
How to Apply on SmartRecruiters
- SmartRecruiters often includes a video screening step — check camera and mic permissions.
- Link your GitHub or portfolio directly in the profile section for technical roles.
- Applications may be reviewed by AI scoring before reaching a recruiter — use keywords from the job description.