Company

healthcare

Associate Data Engineer

Chennai, India; Bangalore, India · FULL TIME

Market Sentiment

HIGH DEMAND

Neural analysis suggests this role is optimal for mid-level candidates.

The Brief

“Associate Data Engineer. Skills: PySpark, SQL, Databricks, Airflow, AWS, data quality engineering, ETL orchestration. Build and operate large-scale healthcare data pipelines across batch workflows, metadata-driven ingestion, and data service publishing. Own end-to-end engineering from source ingestion to conformed data products, with a focus on reliability, data quality, and operational observability.”

What You'll Achieve.

deliver trusted datasets; focus on reliability, data quality, and operational observability; SLA-driven delivery

Industry & Context.

healthcare

Problems you'll solve

troubleshooting; failure handling; pipeline performance optimization

What They're Looking For.

Must Have

Bachelor’s degree in Computer Science, Information Technology, or a related field; 2-6 years of experience; Advanced Python; Advanced PySpark; Advanced SQL (window functions, complex joins, MERGE patterns, optimization); Hands-on Databricks experience in enterprise environments; Hands-on Airflow experience in enterprise environments; Experience with cloud data platforms (AWS); Experience with object storage; Experience with secure secret handling; Strong data quality engineering, monitoring, and troubleshooting in regulated data contexts; Solid understanding of ETL orchestration; Solid understanding of dependency management; Solid understanding of SLA-driven delivery

What You'll Do.

Build and operate large-scale healthcare data pipelines across batch workflows, metadata-driven ingestion, and data service publishing.

Own end-to-end engineering from source ingestion to conformed data products, with strong focus on reliability, data quality, and operational observability.

Design and maintain PySpark/SQL pipelines in Databricks for landing, unified, unstitched, and published data layers.

Build and support Airflow DAGs for scheduling, dependencies, retries, and production operations.

Implement metadata/config-driven frameworks for ingestion, transformation, and rule-based processing.

Develop robust data quality controls, DQ summaries, failure handling, and alerting workflows.

Manage batch/process audit logs, run status tracking, release flags, and operational reporting.

Integrate multi-source data (files, APIs, cloud storage, and relational systems) into governed Delta/Spark tables.

Optimize pipeline performance using partitioning, parallelization, and query tuning.

How You'll Work.

Team & Collaboration

Partner with analytics, business, and platform teams to deliver trusted datasets for sales, claims, activity, patient, and rare disease use cases.

Collaborate on schema evolution, business-rule onboarding, and production support.

Full Job Description

## **Career Category**

Information Systems

## **Job Description**

**Role Summary**

* Build and operate large-scale healthcare data pipelines across batch workflows, metadata-driven ingestion, and data service publishing.
* Own end-to-end engineering from source ingestion to conformed data products, with strong focus on reliability, data quality, and operational observability.
* Partner with analytics, business, and platform teams to deliver trusted datasets for sales, claims, activity, patient, and rare disease use cases.

**Key Responsibilities**

* Design and maintain PySpark/SQL pipelines in Databricks for landing, unified, unstitched, and published data layers.
* Build and support Airflow DAGs for scheduling, dependencies, retries, and production operations.
* Implement metadata/config-driven frameworks for ingestion, transformation, and rule-based processing.
* Develop robust data quality controls, DQ summaries, failure handling, and alerting workflows.
* Manage batch/process audit logs, run status tracking, release flags, and operational reporting.
* Integrate multi-source data (files, APIs, cloud storage, and relational systems) into governed Delta/Spark tables.
* Optimize pipeline performance using partitioning, parallelization, and query tuning.
* Collaborate on schema evolution, business-rule onboarding, and production support.

**Required Skills**

* Bachelor’s degree in Computer Science, Information Technology, or a related field with 2-6 years of experience.
* Advanced Python, PySpark, and SQL (window functions, complex joins, MERGE patterns, optimization).
* Hands-on Databricks and Airflow experience in enterprise environments.
* Experience with cloud data platforms (AWS), object storage, and secure secret handling.
* Strong data quality engineering, monitoring, and troubleshooting in regulated data contexts.
* Solid understanding of ETL orchestration, dependency management, and SLA-driven delivery.

SKILL SIGNAL 46 detected · ranked by frequency

ETL orchestration ×6

data quality engineering ×5

Databricks ×4

Airflow ×4

dependency management ×4

SLA-driven delivery ×4

PySpark ×3

SQL ×3

AWS ×3

Build and operate large-scale healthcare data pipelines ×3

Own end-to-end engineering from source ingestion to conformed data products ×3

Design and maintain PySpark/SQL pipelines ×3

Build and support Airflow DAGs ×3

Implement metadata/config-driven frameworks ×3

Develop robust data quality controls ×3

Manage batch/process audit logs ×3

Integrate multi-source data ×3

Optimize pipeline performance ×3

monitoring ×3

troubleshooting ×3

Delta Lake

Spark

metadata-driven ingestion

data service publishing

reliability

data quality

operational observability

trusted datasets

metadata/config-driven frameworks

rule-based processing

data quality controls

DQ summaries

BEHAVIOURAL

Partner with analytics, business, and platform teams

Collaborate on schema evolution, business-rule onboarding, and production support

Role Details

Seniority mid

Experience 2–6 yrs

Level Mid

Work Mode No

Type FULL TIME

Education Bachelor’s degree in Computer Science, Information Technology, or a related field

AI-Extracted Insights

Domain Areas

healthcare-data-pipelines, sales-use-cases, claims-use-cases, activity-use-cases, patient-use-cases, rare-disease-use-cases, regulated-data-contexts

How to Apply on Workday

Workday has a multi-step form — save your progress after every section.
"Apply With LinkedIn" can fail or lose data; manual entry is more reliable.
Watch for the "Submit for Review" final step — hitting "Save" alone does not submit.
Job requisition numbers are useful when following up with HR by email.
