Company
healthcare
Associate Data Engineer
Neural analysis suggests this role is optimal for mid-level candidates.
“Associate Data Engineer. Skills: PySpark, SQL, Databricks, Airflow, AWS, data quality engineering, ETL orchestration. Build and operate large-scale healthcare data pipelines across batch workflows, metadata-driven ingestion, and data service publishing. Own end-to-end engineering from source ingestion to conformed data products, with focus on reliability, data quality, and operational observability.”
What You'll Achieve.
deliver trusted datasets; focus on reliability, data quality, and operational observability; SLA-driven delivery
Industry & Context.
troubleshooting; failure handling; pipeline performance optimization
What They're Looking For.
Must Have
- Bachelor’s degree in Computer Science, Information Technology, or a related field
- 2-6 years of experience
- Advanced Python
- Advanced PySpark
- Advanced SQL (window functions, complex joins, MERGE patterns, optimization)
- Hands-on Databricks experience in enterprise environments
- Hands-on Airflow experience in enterprise environments
- Experience with cloud data platforms (AWS)
- Experience with object storage
- Experience with secure secret handling
- Strong data quality engineering, monitoring, and troubleshooting in regulated data contexts
- Solid understanding of ETL orchestration
- Solid understanding of dependency management
- Solid understanding of SLA-driven delivery
What You'll Do.
- Build and operate large-scale healthcare data pipelines across batch workflows, metadata-driven ingestion, and data service publishing.
- Own end-to-end engineering from source ingestion to conformed data products, with focus on reliability, data quality, and operational observability.
- Design and maintain PySpark/SQL pipelines in Databricks for landing, unified, unstitched, and published data layers.
- Build and support Airflow DAGs for scheduling, dependencies, retries, and production operations.
- Implement metadata/config-driven frameworks for ingestion, transformation, and rule-based processing.
- Develop robust data quality controls, DQ summaries, failure handling, and alerting workflows.
- Manage batch/process audit logs, run status tracking, release flags, and operational reporting.
- Integrate multi-source data (files, APIs, cloud storage, and relational systems) into governed Delta/Spark tables.
- Optimize pipeline performance using partitioning, parallelization, and query tuning.
How You'll Work.
Team & Collaboration
Partner with analytics, business, and platform teams to deliver trusted datasets for sales, claims, activity, patient, and rare disease use cases. Collaborate on schema evolution, business-rule onboarding, and production support.
Full Job Description
## **Career Category**

Information Systems

## **Job Description**

**Role Summary**

* Build and operate large-scale healthcare data pipelines across batch workflows, metadata-driven ingestion, and data service publishing.
* Own end-to-end engineering from source ingestion to conformed data products, with strong focus on reliability, data quality, and operational observability.
* Partner with analytics, business, and platform teams to deliver trusted datasets for sales, claims, activity, patient, and rare disease use cases.

**Key Responsibilities**

* Design and maintain PySpark/SQL pipelines in Databricks for landing, unified, unstitched, and published data layers.
* Build and support Airflow DAGs for scheduling, dependencies, retries, and production operations.
* Implement metadata/config-driven frameworks for ingestion, transformation, and rule-based processing.
* Develop robust data quality controls, DQ summaries, failure handling, and alerting workflows.
* Manage batch/process audit logs, run status tracking, release flags, and operational reporting.
* Integrate multi-source data (files, APIs, cloud storage, and relational systems) into governed Delta/Spark tables.
* Optimize pipeline performance using partitioning, parallelization, and query tuning.
* Collaborate on schema evolution, business-rule onboarding, and production support.

**Required Skills**

* Bachelor’s degree in Computer Science, Information Technology, or a related field with 2-6 years of experience.
* Advanced Python, PySpark, and SQL (window functions, complex joins, MERGE patterns, optimization).
* Hands-on Databricks and Airflow experience in enterprise environments.
* Experience with cloud data platforms (AWS), object storage, and secure secret handling.
* Strong data quality engineering, monitoring, and troubleshooting in regulated data contexts.
* Solid understanding of ETL orchestration, dependency management, and SLA-driven delivery.
How to Apply on Workday
- Workday has a multi-step form — save your progress after every section.
- "Apply With LinkedIn" can fail or lose data; manual entry is more reliable.
- Watch for the "Submit for Review" final step — hitting "Save" alone does not submit.
- Job requisition numbers are useful when following up with HR by email.