DISQO
Tech / AI / Software
Staff Data Engineer (Scala, Spark, & Gen AI)
Neural analysis suggests this role is
optimal for Lead candidates.
“Staff Data Engineer (Scala, Spark, & Gen AI) at DISQO. Skills: Scala, Spark, Generative AI, Data Engineering, AWS. Architect and Lead data pipelines. Design, build, and maintain highly scalable, fault-tolerant data pipelines using expert-level Scala and Apache Spark”
Industry & Context.
solve complex, real-world problems at scale; tackle our hardest scalability challenges; resolve complex performance bottlenecks, memory issues, and data skew
What They're Looking For.
Must Have
- 8+ years of experience building, architecting, and supporting complex production data pipelines, distributed systems, and backend infrastructure
- Deep, hands-on expertise in Scala and Apache Spark
- Proven experience integrating Gen AI / LLMs (e.g., OpenAI APIs, Anthropic, Bedrock) into data products or data engineering workflows
- Hands-on experience developing with AI dev tools such as Claude Code
- Proficiency in Python, specifically to interface with modern AI ecosystems, data APIs, and orchestration tools
- Extensive architectural experience within the AWS ecosystem (EMR, Glue, Athena, S3, Bedrock, etc.)
- Deep understanding of advanced ETL/ELT concepts, complex data modeling, and performance-tuning SQL
- Expert-level experience with workflow orchestration tools such as Airflow
- Proven track record of leading technical initiatives, making architectural decisions, and mentoring teams in an agile, fast-moving environment
Nice to Have
- Experience with Snowflake or other modern cloud data warehouses
- Deep exposure to streaming or real-time event processing (Kafka, Flink, Kinesis, etc.)
- Experience utilizing AI for automated data observability, anomaly detection, or data quality tooling
- Background in ad tech, measurement, attribution modeling, or specialized analytics platforms
What You'll Do.
- Architect and lead data pipelines: design, build, and maintain highly scalable, fault-tolerant data pipelines using expert-level Scala and Apache Spark
- Pioneer the use of Generative AI within our data ecosystem: incorporate LLMs to enrich datasets, extract value from unstructured data, automate metadata generation, and build intelligent data products
- Partner with Product and Engineering leadership to translate complex business requirements into forward-looking data and AI-augmented architectures
- Architect and aggressively optimize large-scale ETL/ELT workflows
- Dive deep into Spark internals to resolve complex performance bottlenecks, memory issues, and data skew
- Implement and manage infrastructure to support AI integration, including vector databases and Retrieval-Augmented Generation (RAG) architectures
- [...] and maintainable code
- Establish standards for code quality and system architecture across the organization
- Champion data quality and system health to consistently meet enterprise SLAs and customer commitments
- Actively mentor engineers, lead technical design reviews, and foster a culture of continuous learning and technical rigor
How You'll Work.
Team & Collaboration
Work closely with engineering leadership, product managers, and analysts in a collaborative environment; partner with Product and Engineering leadership; lead cross-functional technical initiatives; mentor senior and mid-level engineers; lead technical design reviews
Process & Methodology
agile development practices, leading technical initiatives
Full Job Description
## Description

DISQO’s mission is to build the world’s most trusted ad measurement platform that fuels brand growth. The world’s largest brands, agencies, and media companies trust DISQO for expert insight and AI-driven intelligence about their advertising performance across all platforms. We capture people’s sentiments and journeys, connecting them with the brands they value and the media they consume. With this identity-based approach, brands gain more accurate and authentic insight so they can create more meaningful interactions.

Joining DISQO Nation means becoming part of a community that champions speed, innovation, and continuous growth. We invest deeply in our talent, empowering our teams to reach their highest potential. Together, we are shaping the future of work at DISQO, defined by performance, purpose, and impact. We show up each day with curiosity and ambition, committed to learning, accelerating growth, and making a lasting difference. Grounded in our values and principles, we lead and collaborate to elevate performance, accountability, and excellence at every level of the organization. And through it all, we make sure to have fun along the way.

This is a great opportunity to join a fun, highly motivated team and lead the development of intelligent data products that directly power how brands measure advertising effectiveness. At DISQO, we use modern cloud infrastructure, Generative AI, and expert-level data engineering to solve complex, real-world problems at scale.

We are looking for a visionary technical leader who is a master of distributed data processing (Scala/Spark) and passionate about the intersection of data engineering and Artificial Intelligence. You’ll serve as a force multiplier, working closely with engineering leadership, product managers, and analysts in a collaborative environment where rapid innovation and systemic impact matter. We believe the best software is built by highly aligned, autonomous teams that take ownership and move qu
How to Apply on Lever
- Lever uses a streamlined one-page form — apply in under 5 minutes.
- LinkedIn import works well; review parsed data before submitting.
- The cover letter field is optional but visible to reviewers — use it to differentiate.
- Referral codes from employees can significantly boost visibility of your application.