Jalasoft

Technology

Senior Data Engineer - AWS & RAG Pipelines

₹35–60L ~AI est. · Remote · FULL TIME · Remote Friendly

Market Sentiment

HIGH DEMAND

Neural analysis suggests this role is optimal for Senior candidates.

The Brief

“Senior Data Engineer - AWS & RAG Pipelines at Jalasoft. Skills: Data Engineering, AWS, RAG Systems, Generative AI. Design and operate cloud data infrastructure.”

Industry & Context.

Technology

What They're Looking For.

Must Have

7+ years Data Engineering, 7+ years Distributed Systems, 7+ years Data Architecture, 4+ years AWS data lakes, 4+ years AWS storage tiers, 4+ years AWS event streaming, 2+ years RAG systems, 2+ years managing embeddings, 2+ years orchestrating foundational models, AWS Data Lake Architecture, AWS storage tiers, AWS event streaming, Real-Time Observability, Log Analytics, Elasticsearch Optimization, OpenSearch Optimization, Vectorization, Embeddings, Amazon Bedrock, Generative AI Pipelines, Software Engineering, API Ingestion, C# (.NET Core), Java, Python, Node.js

Nice to Have

AWS S3 partitioning strategies, AWS S3 lifecycle policies, AWS S3 columnar formats, AWS Glue Data Catalog, AWS Lake Formation, Query optimization petabyte-scale, Amazon Athena, Redshift Spectrum, OTel collector configuration, High-volume streaming logs, Datadog captures, Raw server events, CDC from PostgreSQL, Debezium, AWS DMS, Amazon OpenSearch clusters, Lexical search, High-dimensional vector search, OpenSearch index lifecycle, OpenSearch sharding strategies, OpenSearch dynamic mappings, Amazon Bedrock APIs, Claude, Titan, Data enrichment, Classification, Semantic parsing, Knowledge Bases for Amazon Bedrock, Automatic chunking, Metadata extraction, Vector index syncs, ETL/ELT pipelines, Unstructured event data, SaaS APIs, Pendo, Hotjar, Google Analytics, MCP server development

What You'll Do.

Design cloud data infrastructure

Operate cloud data infrastructure

Architect production-scale data lakes

Build real-time ingestion pipelines

Build observability pipelines

Own vector search layers

Feed autonomous agents

Full Job Description

We're looking for a Senior Data Engineer to design and operate the cloud data infrastructure powering our AI initiatives. You'll architect production-scale data lakes on AWS, build real-time ingestion and observability pipelines, and own the vector search and embedding layers that feed our RAG systems and autonomous agents.

**Requirements**

**Must-Have**

* Overall Experience: 7+ years in Data Engineering, Distributed Systems, or Data Architecture
* AWS & Infrastructure: 4+ years architecting production-scale data lakes, storage tiers, and event streaming
* AI/LLM Pipelines: 2+ years building RAG systems, managing embeddings, and orchestrating foundational models
* Proficiency in AWS Data Lake Architecture & Storage
* Proficiency in Real-Time Observability & Log Analytics
* Proficiency in Elasticsearch & OpenSearch Optimization, Vectorization, Embeddings
* Proficiency in Amazon Bedrock & Generative AI Pipelines
* Proficiency in Software Engineering & API Ingestion
* Production-level proficiency in one or more of: C# (.NET Core), Java, Python, or Node.js

**Preferred Experience**

* AWS S3 partitioning strategies, lifecycle policies, and columnar formats (Parquet, Iceberg)
* AWS Glue Data Catalog and Lake Formation for multi-tenant, fine-grained access control
* Query optimization over petabyte-scale datasets using Amazon Athena and Redshift Spectrum
* Distributed OTel collector configuration for log, trace, and metrics capture and routing into S3
* High-volume streaming of system logs, Datadog captures, and raw server events into S3
* Real-time CDC from PostgreSQL using Debezium or AWS DMS
* Amazon OpenSearch clusters with simultaneous lexical and high-dimensional vector search
* OpenSearch index lifecycle management, sharding strategies, and dynamic mappings at scale
* Amazon Bedrock foundational model APIs (Claude, Titan) for data enrichment, classification, and semantic parsing
* Knowledge Bases for Amazon Bedrock for automatic chunking, metadata extraction, and vec

SKILL SIGNAL 64 detected · ranked by frequency

RAG Systems ×6

Generative AI ×6

Data Engineering ×5

Amazon Bedrock ×5

Software Engineering ×5

API Ingestion ×5

AWS ×4

Data Architecture ×4

Distributed Systems ×4

Data Lakes ×4

Storage Tiers ×4

Event Streaming ×4

Embeddings ×4

Foundational Models ×4

Observability ×4

Log Analytics ×4

Vectorization ×4

Lake Formation ×4

CDC ×4

ETL ×4

ELT ×4

Elasticsearch Optimization ×3

OpenSearch Optimization ×3

S3 partitioning ×3

S3 lifecycle ×3

Columnar formats ×3

Data Catalog ×3

Query optimization ×3

OTel collector ×3

Streaming logs ×3

Streaming metrics ×3

Vector search ×3

Role Details

Seniority senior

Experience 7–10 yrs

Level Senior

Work Mode Remote

Type FULL TIME

Category software

Salary Band 200k+

AI-Extracted Insights

Domain Areas

data-lake-architecture, event-streaming, real-time-observability, log-analytics, vectorization, embeddings, generative-ai, rag-systems
