Company

Technology

Software Engineer, Platform Engineering (Observability)

₹25–45L (AI estimate) · Bengaluru, India · Full Time · Remote Friendly

The Brief

“Software Engineer, Platform Engineering (Observability). Skills: observability platform, metrics, logging, tracing, alerting. Design and build the observability platform.”

What You'll Achieve.

Improve MTTD; Improve MTTM; Reduce alert noise; Reduce engineering toil

Industry & Context.

Technology

Problems you'll solve

Incident detection; Incident resolution; Anomaly detection; Alert correlation

What They're Looking For.

Must Have

6+ years of experience; Observability tools expertise; Kubernetes experience; Go or Python proficiency; Cloud platforms experience; Infrastructure as Code experience; Understanding of metrics, logs, and traces; Alerting systems design; Understanding of SLIs, SLOs, and error budgets; Experience building developer-facing tools

Nice to Have

AI-driven observability experience; Large-scale microservices experience

What You'll Do.

Design observability platform

Build observability platform

Maintain observability platform

Develop tools and pipelines

Improve signal quality

Enable faster incident detection

Enable faster incident resolution

Build observability systems

Operate observability systems

Ensure high reliability

Ensure high performance

Ensure high efficiency

Design monitoring workflows

Enhance monitoring workflows

Design alerting workflows

Enhance alerting workflows

Design incident response workflows

Enhance incident response workflows

Develop AI-assisted capabilities

Create self-service tooling

Enable engineering teams

Define observability standards

Enforce observability standards

Define best practices

Enforce best practices

Collaborate with SRE teams

Collaborate with platform teams

Collaborate with security teams

Collaborate with product teams

Ensure full system visibility

Automate operational tasks

Reduce engineering toil

How You'll Work.

Team & Collaboration

SRE teams; Platform teams; Security teams; Product engineering teams

Communication Scope

Technical documentation; Engineering discussions

Full Job Description

## Accountabilities

You will be responsible for designing, building, and maintaining a scalable observability platform that spans metrics, logging, tracing, and alerting across large distributed systems. You will develop tools and pipelines that improve signal quality, reduce alert noise, and enable faster incident detection and resolution.

- Build and operate observability systems at scale, ensuring high reliability, performance, and efficiency across production environments
- Design and enhance monitoring, alerting, and incident response workflows to improve MTTD and MTTM
- Develop AI-assisted capabilities for anomaly detection, alert correlation, and automated incident support
- Create self-service tooling that enables engineering teams to instrument and monitor their own services
- Define and enforce observability standards, SLIs/SLOs, and best practices across microservices architectures
- Collaborate with SRE, platform, security, and product engineering teams to ensure full system visibility
- Automate operational tasks and reduce engineering toil through platform improvements and tooling

## Requirements

This role requires strong experience in building and operating large-scale production systems with a focus on observability, reliability, and cloud-native infrastructure. You should be comfortable working across distributed systems and driving technical improvements in complex environments.

- 6+ years of experience in software or platform engineering roles focused on scalable production systems
- Strong hands-on expertise with observability tools such as Datadog, Prometheus, Grafana, or similar
- Experience working with Kubernetes and containerized production workloads
- Proficiency in Go or Python for infrastructure, tooling, or backend development
- Solid experience with cloud platforms such as AWS and/or GCP and Infrastructure as Code (Terraform)
- Strong understanding of metrics, logs, traces, and distributed system instrumentation
- Experience designing alerting systems with a focus on

How to Apply on Lever

Lever uses a streamlined one-page form — apply in under 5 minutes.
LinkedIn import works well; review parsed data before submitting.
The cover letter field is optional but visible to reviewers — use it to differentiate.
Referral codes from employees can significantly boost visibility of your application.
