Company
Technology
Software Engineer, Platform Engineering (Observability)
Neural analysis suggests this role is optimal for Senior candidates.
“Software Engineer, Platform Engineering (Observability). Skills: Observability platform, Metrics, Logging, Tracing, Alerting. Design observability platform. Build observability platform”
What You'll Achieve.
Improve MTTD; Improve MTTM; Reduce alert noise; Reduce engineering toil
Industry & Context.
Incident detection; Incident resolution; Anomaly detection; Alert correlation
What They're Looking For.
Must Have
6+ years of experience; Observability tools expertise; Kubernetes experience; Go or Python proficiency; Cloud platforms experience; Infrastructure as Code experience; Understanding of metrics, logs, and traces; Alerting systems design; Understanding of SLIs, SLOs, and error budgets; Experience building developer-facing tools
Nice to Have
AI-driven observability experience; Large-scale microservices experience
What You'll Do.
Design observability platform
Build observability platform
Maintain observability platform
Develop tools and pipelines
Improve signal quality
Enable faster incident detection
Enable faster incident resolution
Build observability systems
Operate observability systems
Ensure high reliability
Ensure high performance
Ensure high efficiency
Design monitoring workflows
Enhance monitoring workflows
Design alerting workflows
Enhance alerting workflows
Design incident response workflows
Enhance incident response workflows
Develop AI-assisted capabilities
Create self-service tooling
Enable engineering teams
Define observability standards
Enforce observability standards
Define best practices
Enforce best practices
Collaborate with SRE teams
Collaborate with platform teams
Collaborate with security teams
Collaborate with product teams
Ensure full system visibility
Automate operational tasks
Reduce engineering toil
How You'll Work.
Team & Collaboration
SRE teams; Platform teams; Security teams; Product engineering teams
Communication Scope
Technical documentation; Engineering discussions
Full Job Description
## Accountabilities
You will be responsible for designing, building, and maintaining a scalable observability platform that spans metrics, logging, tracing, and alerting across large distributed systems. You will develop tools and pipelines that improve signal quality, reduce alert noise, and enable faster incident detection and resolution.
- Build and operate observability systems at scale, ensuring high reliability, performance, and efficiency across production environments
- Design and enhance monitoring, alerting, and incident response workflows to improve MTTD and MTTM
- Develop AI-assisted capabilities for anomaly detection, alert correlation, and automated incident support
- Create self-service tooling that enables engineering teams to instrument and monitor their own services
- Define and enforce observability standards, SLIs/SLOs, and best practices across microservices architectures
- Collaborate with SRE, platform, security, and product engineering teams to ensure full system visibility
- Automate operational tasks and reduce engineering toil through platform improvements and tooling
## Requirements
This role requires strong experience in building and operating large-scale production systems with a focus on observability, reliability, and cloud-native infrastructure. You should be comfortable working across distributed systems and driving technical improvements in complex environments.
- 6+ years of experience in software or platform engineering roles focused on scalable production systems
- Strong hands-on expertise with observability tools such as Datadog, Prometheus, Grafana, or similar
- Experience working with Kubernetes and containerized production workloads
- Proficiency in Go or Python for infrastructure, tooling, or backend development
- Solid experience with cloud platforms such as AWS and/or GCP and Infrastructure as Code (Terraform)
- Strong understanding of metrics, logs, traces, and distributed system instrumentation
- Experience designing alerting systems with a focus on
How to Apply on Lever
- Lever uses a streamlined one-page form — apply in under 5 minutes.
- LinkedIn import works well; review parsed data before submitting.
- The cover letter field is optional but visible to reviewers — use it to differentiate.
- Referral codes from employees can significantly boost visibility of your application.