Source Meridian

Healthcare

Data Engineer

$36000–54000k ~AI est. Medellín, Antioquia, Colombia CONTRACT

Market Sentiment

HIGH DEMAND

Neural analysis suggests this role is optimal for entry-level candidates.

The Brief

“Data Engineer at Source Meridian. Skills: data engineering, AWS data stack, Spark pipelines. Build and maintain Spark pipelines.”

Industry & Context.

Healthcare

Problems you'll solve

Solution-oriented approach

What They're Looking For.

Must Have

1-2 years professional experience, Apache Spark experience, AWS data stack experience, Amazon S3 experience, Amazon Athena experience, Airflow experience, Excellent SQL skills, Solid data modeling fundamentals, Advanced English level

Nice to Have

dbt experience, Healthcare data familiarity, Tokenization experience, Identity resolution experience, Privacy-preserving data workflows experience, AWS security concepts knowledge, Spark on AWS experience, Spark-on-containers experience

What You'll Do.

Build Spark pipelines

Maintain Spark pipelines

Process Parquet datasets

Implement tokenization workflows

Convert transit tokens to real tokens

Process healthcare claims datasets

Ensure accurate identity mapping

Ensure data integrity

Orchestrate data pipelines

Develop ETL/ELT processes

Contribute to dbt models

How You'll Work.

Team & Collaboration

Cross-functional stakeholders

Communication Scope

Technical discussions; Clear documentation; Client-facing experience

Full Job Description

We’re looking for a Data Engineer to join Source Meridian.

About Source Meridian

Source Meridian is a software development company that works to solve the industry’s most challenging problems in healthcare practices. We are laser-focused on specific technologies in the healthcare and life science industries: healthcare technology, artificial intelligence, and healthcare interoperability.

About the Role

We're looking for a Data Engineer to help build and operate an AWS-native data platform processing healthcare claims data and tokenized identifiers. You'll design and implement Spark-based pipelines that transform, intersect, and enrich tokenized datasets stored primarily as Parquet on S3, queried via Athena and related AWS services. This environment intentionally avoids managed lakehouse platforms (e.g., no Databricks and no Snowflake); you'll be doing "real" data engineering directly on AWS.

What You’ll Do

Build and maintain Spark pipelines to process large-scale Parquet datasets on S3.

Implement tokenization workflows, including transit token → real token conversion and dataset intersection/join logic.

Process and deliver healthcare claims datasets for matched individuals, ensuring accurate identity mapping and data integrity.

Orchestrate data pipelines using Airflow and/or AWS-native orchestration tools when appropriate.

Develop reliable, testable, and observable ETL/ELT processes (retries, idempotency, monitoring, reprocessing).

Optimize performance and cost across Spark jobs, S3 partitioning/layout, and Athena query patterns.

Contribute to dbt models when applicable (transformations, documentation, data quality checks).

Collaborate with cross-functional stakeholders in a healthcare environment, with a strong focus on privacy and secure data handling.

Required Qualifications

1-2 years of professional experience in Data Engineering.

Strong experience with Apache Spark (PySpark or Scala), including joins, intersections, partitioning, and performance tuning.

Strong

SKILL SIGNAL 22 detected · ranked by frequency

Spark ×7

Tokenization workflows ×3

Data transformation ×3

Performance optimization ×3

Cost optimization ×3

Data quality checks ×3

Data engineering ×2

AWS data stack ×2

AWS ×2

Airflow ×2

dbt ×2

Parquet

Athena

Glue Catalog

Lake Formation

SQL

Data modeling

Identity resolution

Data handling

Data integrity

ETL/ELT processes

BEHAVIOURAL

Empathetic leadership

Expectation management

Strategic mindset

Decision-making skills

Role Details

Experience 1–2 yrs

Level Entry

Work Mode Onsite

Type CONTRACT

Category technology

Salary Band 200k+

AI-Extracted Insights

Domain Areas

healthcare-data

healthcare-interoperability

privacy-preserving-data

How to Apply on Greenhouse

Create a Greenhouse profile before applying — it saves time across multiple applications.
Upload your resume as a PDF; the parser handles it better than Word.
Answer all knockout questions carefully — wrong answers auto-reject before a human sees you.
Enable email notifications to track application status in real time.
