CoreWeave
Technology
Senior Engineer, Network Observability
“Senior Engineer, Network Observability at CoreWeave. Skills: Network Observability, Telemetry Solutions, Python, Golang. Design monitoring systems. Develop monitoring systems”
Industry & Context.
Troubleshooting; Anomaly detection
On-call schedule
What They're Looking For.
Must Have
Deep familiarity with Prometheus, Deep familiarity with Grafana, Deep familiarity with Alertmanager, Deep familiarity with gNMI, Deep familiarity with SNMP, Experience as Network Engineer, Experience as SRE, Experience as Software Developer, Experience as Systems Administrator, Proficient with Python, Proficient with Go, Proficient with Bash, Knowledge of Linux systems, Knowledge of IP networking, Hands-on routing experience, Hands-on switching experience, Hands-on network troubleshooting experience, Practical knowledge of Arista EOS, Practical knowledge of NVIDIA Cumulus Linux, Practical knowledge of Nokia SR OS, Practical knowledge of SR Linux, Collaborative, Humble, Ready to help others, Open to learning
Nice to Have
Experience writing custom metric collectors, Experience extending custom metric collectors, Track record building telemetry solutions, Track record operating telemetry solutions, Track record building monitoring solutions, Track record operating monitoring solutions, Passion for automating tasks, Passion for automating processes, Comfortable containerizing solutions in Kubernetes, Familiarity with configuration management tools, Familiarity with templating tools, Machine Learning for Anomaly Detection, Network Certifications
What You'll Do.
Design monitoring systems
Develop monitoring systems
Maintain monitoring systems
Design telemetry systems
Develop telemetry systems
Maintain telemetry systems
Design observability systems
Develop observability systems
Maintain observability systems
Build solutions for insights
Detect issues proactively
Resolve issues quickly
Develop observability platforms
Optimize observability platforms
Maintain observability platforms
Design telemetry solutions
Implement telemetry solutions
Ensure advanced alerting
Ensure anomaly detection
Integrate observability solutions
Participate in design discussions
Participate in architectural decisions
Join on-call schedule
Troubleshoot observability issues
Resolve observability issues
Provide timely support
Guide junior team members
Foster continuous learning
How You'll Work.
Team & Collaboration
Network Engineering teams; Platform teams; Network developers; Site reliability engineers; Security teams; Junior team members
Communication Scope
Technical expertise
Process & Methodology
RFCs
Full Job Description
CoreWeave is The Essential Cloud for AI™. Built for pioneers by pioneers, CoreWeave delivers a platform of technology, tools, and teams that enables innovators to build and scale AI with confidence. Trusted by leading AI labs, startups, and global enterprises, CoreWeave combines superior infrastructure performance with deep technical expertise to accelerate breakthroughs and turn compute into capability. Founded in 2017, CoreWeave became a publicly traded company (Nasdaq: CRWV) in March 2025. Learn more at www.coreweave.com.
We're proud to be a Living Wage accredited Employer.
What You’ll Do
We’re seeking a talented and experienced Senior Engineer for Network Observability to join our Network Observability team. In this role, you will be a key player in designing, developing, and maintaining the monitoring, telemetry, and observability systems that keep CoreWeave’s GPU cloud network operating reliably and at scale. You’ll focus on building solutions that provide real-time insights into network performance, ensuring that issues are detected proactively and resolved quickly.
Your mission? To empower CoreWeave’s network with advanced observability: robust metrics, powerful analytics, and automated alerting, so well-tuned that any anomalies become clear before they ever impact our customers.
Develop, optimize, and maintain network observability platforms. Use your skills in Python and Golang to create and automate collectors, exporters, and dashboards that provide deep visibility into network health and performance.
Collaborate with Network Engineering and Platform teams to ingest and unify logs, metrics, and events from a variety of platforms (Arista EOS, NVIDIA Cumulus Linux, Nokia SR OS, SR Linux, etc.) into a single observability pipeline.
Design and implement scalable telemetry solutions using protocols like gNMI, SNMP, and streaming analytics.
Ensure advanced alerting and anomaly detection with tools such as Prometheus, Grafana, and Alertmanager.
Work closely with