robusta
Computer Software
Senior Data Engineer
“Senior Data Engineer at robusta. Skills: Data Lakehouse platform design and implementation, Data pipeline engineering, Distributed systems, Cloud infrastructure deployment, Technical leadership. Lead the technical design, implementation, and delivery of an enterprise-grade, AI-ready Data Lakehouse platform. Build the foundational data layer for a large-scale digital transformation initiative”
What You'll Achieve.
Deliver an enterprise-grade, AI-ready Data Lakehouse platform; Support AI agents, digital workers, and knowledge graph (ontology) systems; Enable advanced analytics and AI capabilities; Ensure long-term operational ownership; Ensure the client’s team is fully ready to operate the platform independently
Industry & Context.
Cloud infrastructure hosted within Saudi Arabia
What They're Looking For.
Must Have
5+ years of proven experience in Data Engineering, Distributed Systems, or Big Data Architecture
2+ years specifically leading Data Lakehouse or Cloud Data Platform implementations
Proficiency in programming languages such as Python, Java, or Scala
Expertise in designing data-intensive applications, complex data modeling, and schema design for enterprise environments
Deep, hands-on experience with distributed processing engines (e.g., Apache Spark, Kafka, Hadoop)
Experience designing and building robust data pipelines using modern transformation and orchestration tools (e.g., Apache Airflow, Prefect, dbt)
Proven track record with relational databases (e.g., PostgreSQL, MySQL), NoSQL platforms (e.g., MongoDB, Cassandra), and distributed SQL query engines like Hive and Trino
Proven experience deploying enterprise data solutions on major cloud providers
Strong problem-solving skills with a keen eye for detail and a passion for data
Strong understanding of enterprise network security, Private Endpoints, Identity & Access Management (IAM), and cryptographic key management
Excellent written and verbal communication skills
Demonstrated ability to lead technical teams, manage stakeholder expectations, and successfully transition complex systems to internal IT/Data teams
Nice to Have
Experience with AI-ready Data Lakehouse platforms
Experience supporting AI agents, digital workers, and knowledge graph (ontology) systems
Experience with modern open table formats (e.g., Delta Lake, Apache Iceberg, Apache Hudi)
Experience with Oracle Cloud Infrastructure (OCI) or Google Cloud Platform (GCP)
Prior experience building data pipelines optimized for Machine Learning, Natural Language Processing (NLP), vector embeddings, or Knowledge Graphs/Ontologies (highly desirable)
Familiarity with Saudi Arabian data compliance frameworks (NCA CCC, NDMO, SDAIA) (highly preferred)
What You'll Do.
Lead the technical design, implementation, and delivery of an enterprise-grade, AI-ready Data Lakehouse platform
Build the foundational data layer for a large-scale digital transformation initiative
Design and deploy a unified Data Lakehouse utilizing the Medallion architecture (Bronze, Silver, Gold) and open table formats on cloud infrastructure hosted within Saudi Arabia
Build reusable, automated ingestion frameworks (batch and streaming) capable of processing both structured and unstructured data
Implement automated data quality 'circuit breakers' and end-to-end data lineage tracking frameworks
Optimize data processing workflows for performance
Monitor and maintain data systems, responding to SEVs or other urgent issues to ensure continuous operations
Ensure the platform adheres strictly to NCA and NDMO standards
Implement AES-256 encryption at rest, robust Key Management Systems (KMS), and centralized audit logging
Design and deploy granular Role-Based Access Control (RBAC) and Attribute-Based Access Control (ABAC), integrating seamlessly with existing enterprise Identity Providers
Lead hands-on knowledge transfer sessions, pair-programming with client engineers, creating operational runbooks, and conducting 'Game Day' failure simulations
How You'll Work.
Team & Collaboration
Mentoring engineering teams in a collaborative co-building model; Pair-programming with client engineers; Ensuring the client’s team is fully ready to operate the platform independently
Communication Scope
Excellent written and verbal communication skills; Ability to articulate complex technical concepts to non-technical stakeholders
Process & Methodology
Manage stakeholder expectations; Successfully transition complex systems to internal IT/Data teams
Full Job Description
Robusta assists organizations in transitioning to a digital-first approach, crafting unforgettable experiences for their customers. We provide strategy, design, product, and technology services to prominent businesses and brands, utilizing our go-to-market expertise to facilitate seamless customer experiences and enhance conversion rates.

### **About the Role**

We are seeking a highly experienced **Senior Data Engineer** to lead the technical design, implementation, and delivery of an enterprise-grade, AI-ready **Data Lakehouse platform**. This role is critical in building the foundational data layer for a large-scale digital transformation initiative that will support **AI agents, digital workers, and knowledge graph (ontology) systems**.

The ideal candidate will have strong software engineering experience with a focus on **data pipeline development, data architecture, and scalable distributed systems**. You will play a key role in designing and maintaining robust data infrastructure that enables advanced analytics and AI capabilities.

This position also involves **technical leadership**, mentoring engineering teams in a collaborative co-building model, and ensuring long-term operational ownership.

### Key Responsibilities

* **Lakehouse Architecture & Implementation:** Design and deploy a unified Data Lakehouse utilizing the Medallion architecture (Bronze, Silver, Gold) and open table formats (e.g., Delta Lake, Apache Iceberg) on cloud infrastructure hosted within Saudi Arabia.
* **Data Ingestion & Pipeline Engineering:** Build reusable, automated ingestion frameworks (batch and streaming) capable of processing both structured data (RDBMS, APIs) and unstructured data (PDFs, policy documents) to feed downstream AI models and semantic reasoning engines.
* **Data Quality & Governance:** Implement automated data quality "circuit breakers" (completeness, uniqueness, referential integrity) and end-to-end data lineage tracking frameworks.
* **Optimization:** Optimize d