DISQO

Tech / AI / Software

Staff Data Engineer (Scala, Spark, & Gen AI)

Los Angeles, California, United States FULL TIME

Market Sentiment

HIGH DEMAND

Neural analysis suggests this role is optimal for Lead candidates.

The Brief

“Staff Data Engineer (Scala, Spark, & Gen AI) at DISQO. Skills: Scala, Spark, Generative AI, Data Engineering, AWS. Architect and lead data pipelines: design, build, and maintain highly scalable, fault-tolerant data pipelines using expert-level Scala and Apache Spark.”

Industry & Context.

Tech / AI / Software

Problems you'll solve

Solve complex, real-world problems at scale; tackle our hardest scalability challenges; resolve complex performance bottlenecks, memory issues, and data skew

What They're Looking For.

Must Have

8+ years of experience building, architecting, and supporting complex production data pipelines, distributed systems, and backend infrastructure

Deep, hands-on expertise in Scala and Apache Spark

Proven experience integrating Gen AI / LLMs (e.g., OpenAI APIs, Anthropic, Bedrock) into data products or data engineering workflows

Hands-on experience developing with AI dev tools such as Claude Code

Proficiency in Python specifically to interface with modern AI ecosystems, data APIs, and orchestration tools

Extensive architectural experience within the AWS ecosystem (EMR, Glue, Athena, S3, Bedrock, etc.)

Deep understanding of advanced ETL/ELT concepts, complex data modeling, and performance-tuning SQL

Expert-level experience with workflow orchestration tools such as Airflow

Proven track record of leading technical initiatives, making architectural decisions, and mentoring teams in an agile, fast-moving environment

Nice to Have

Experience with Snowflake or other modern cloud data warehouses

Deep exposure to streaming or real-time event processing (Kafka, Flink, Kinesis, etc.)

Experience utilizing AI for automated data observability, anomaly detection, or data quality tooling

Background in ad tech, measurement, attribution modeling, or specialized analytics platforms

What You'll Do.

Architect and lead data pipelines: design, build, and maintain highly scalable, fault-tolerant data pipelines using expert-level Scala and Apache Spark

Pioneer the use of Generative AI within our data ecosystem: incorporate LLMs to enrich datasets, extract value from unstructured data, automate metadata generation, and build intelligent data products

Partner with Product and Engineering leadership to translate complex business requirements into forward-looking data and AI-augmented architectures

Architect and aggressively optimize large-scale ETL/ELT workflows

Dive deep into Spark internals to resolve complex performance bottlenecks, memory issues, and data skew

Implement and manage infrastructure to support AI integration, including vector databases, embeddings pipelines, and Retrieval-Augmented Generation (RAG) architectures

and maintainable code

Establish standards for code quality and system architecture across the organization

Champion data quality and system health to consistently meet enterprise SLAs and customer commitments

Actively mentor engineers, lead technical design reviews, and foster a culture of continuous learning and technical rigor

How You'll Work.

Team & Collaboration

Work closely with engineering leadership, product managers, and analysts in a collaborative environment; partner with Product and Engineering leadership; lead cross-functional technical initiatives; mentor senior and mid-level engineers; lead technical design reviews

Process & Methodology

Agile development practices; leading technical initiatives

Full Job Description

Description

DISQO’s mission is to build the world’s most trusted ad measurement platform that fuels brand growth. The world’s largest brands, agencies, and media companies trust DISQO for expert insight and AI-driven intelligence about their advertising performance across all platforms. We capture people’s sentiments and journeys, connecting them with the brands they value and the media they consume. With this identity-based approach, brands gain more accurate and authentic insight so they can create more meaningful interactions.

Joining DISQO Nation means becoming part of a community that champions speed, innovation, and continuous growth. We invest deeply in our talent, empowering our teams to reach their highest potential. Together, we are shaping the future of work at DISQO, defined by performance, purpose, and impact. We show up each day with curiosity and ambition, committed to learning, accelerating growth, and making a lasting difference. Grounded in our values and principles, we lead and collaborate to elevate performance, accountability, and excellence at every level of the organization. And through it all, we make sure to have fun along the way.

This is a great opportunity to join a fun, highly motivated team and lead the development of intelligent data products that directly power how brands measure advertising effectiveness. At DISQO, we use modern cloud infrastructure, Generative AI, and expert-level data engineering to solve complex, real-world problems at scale. We are looking for a visionary technical leader who is a master of distributed data processing (Scala/Spark) and passionate about the intersection of data engineering and Artificial Intelligence. You’ll serve as a force multiplier, working closely with engineering leadership, product managers, and analysts in a collaborative environment where rapid innovation and systemic impact matter. We believe the best software is built by highly aligned, autonomous teams that take ownership and move qu

SKILL SIGNAL 59 detected · ranked by frequency

Scala ×7

AWS ×7

Generative AI ×6

Data Engineering ×5

EMR ×5

Glue ×5

Athena ×5

S3 ×5

Bedrock ×5

Airflow ×5

Kafka ×5

Flink ×5

Kinesis ×5

Spark ×4

Apache Spark ×4

LLMs ×4

Python ×4

vector databases ×4

embeddings pipelines ×4

Retrieval-Augmented Generation (RAG) ×4

distributed data processing ×3

Artificial Intelligence ×3

data pipelines ×3

distributed systems ×3

backend infrastructure ×3

Spark internals ×3

query plans ×3

memory management ×3

performance tuning ×3

batch processing ×3

data APIs ×3

orchestration tools ×3

BEHAVIOURAL

curiosity, ambition, collaboration, ownership, continuous learning, mentoring, technical rigor

Role Details

Experience 8–15 yrs

Level Lead

Work Mode flexible hybrid approach

Type FULL TIME

AI-Extracted Insights

Domain Areas

ad-measurement, advertising-performance, ad-tech, measurement, attribution-modeling, specialized-analytics-platforms

How to Apply on Lever

Lever uses a streamlined one-page form — apply in under 5 minutes.
LinkedIn import works well; review parsed data before submitting.
The cover letter field is optional but visible to reviewers — use it to differentiate.
Referral codes from employees can significantly boost visibility of your application.
