Yes Energy
Electric Power Data and Analytics
Site Reliability Engineer
Neural analysis suggests this role is optimal for Senior candidates.
“Site Reliability Engineer at Yes Energy. Skills: Site Reliability Engineering, Incident Management, Cloud Operations. Respond to pages. Lead incident response”
What You'll Achieve.
operational excellence; incident response; systems availability; monitoring and alerting; release support; reliability improvements across our production services; reduces repeat incidents; prevents similar future alerts; improves overall service reliability; issues are detected quickly; responders have useful context; reliability, scalability, and availability; improve operational readiness; improve production support models; improve reliability practices; growth of a stronger site reliability function
Industry & Context.
solving tough; diagnose production issues; diagnose and resolve availability and performance issues; diagnosing and fixing Jenkins jobs, CI/CD pipelines, deployment failures, environment issues, and release blockers
take ownership of active incidents, respond to pages, senior individual contributor and team-lead role
What They're Looking For.
Must Have
Bachelor's or Master's degree in Computer Science, Information Technology, or a related field, or equivalent practical experience
Minimum of five years of experience supporting mission-critical production infrastructure, SaaS platforms, web applications, or service-oriented systems
Deep hands-on AWS experience, including production operations for compute, networking, IAM, storage, load balancing, monitoring
Proven incident management experience, including responding to pages, leading high-severity incidents, coordinating responders, writing postmortems and RCA, and driving corrective actions
Experience with containers and Kubernetes, monitoring and alerting systems, CI/CD tooling such as Jenkins and Bitbucket, and operational automation or scripting
Linux and Windows systems administration and troubleshooting experience in production environments
Nice to Have
greater depth is strongly valued
What You'll Do.
Lead incident response
Drive root-cause remediation
Reduce repeat incidents
Prevent similar future alerts
Improve overall service reliability
Serve as incident owner
Coordinate cross-functional responders
Make clear decisions under pressure
Restore service quickly
Build and improve monitoring
Build and improve alerting
Build and improve dashboards
Build and improve SLOs
Build and improve runbooks
Build and improve escalation processes
Operate and troubleshoot Linux systems
Operate and troubleshoot Windows systems
Support production web applications
Support Kubernetes workloads
Work with load balancers
Work with forward proxies
Work with reverse proxies
Work with security groups
Work with traffic-routing patterns
Unblock engineering teams
Diagnose and fix Jenkins jobs
Diagnose and fix CI/CD pipelines
Diagnose and fix deployment failures
Diagnose and fix environment issues
Diagnose and fix release blockers
Partner with Engineering teams
Partner with Security teams
Partner with DBA teams
Partner with Product Technology Services teams
Improve operational readiness
Improve production support models
Improve reliability practices
Mentor SRE team members
Mentor Systems team members
Establish practical standards
Lead growth of site reliability function
How You'll Work.
Team & Collaboration
Coordinate response across engineering teams; Coordinate cross-functional responders; Partner with Engineering, Security, DBA, and Product Technology Services teams; work in small teams on well-defined projects; play to the strengths and experience of each person; work along a continuum of roles adjacent to our focus area
Communication Scope
driving clear communication through resolution; provide technical leadership; delegate effectively
Full Job Description
Join the Market Leader in Electric Power Data and Analytics Solutions
The electrical grid is the largest and most complicated machine ever built. Yes Energy’s industry-leading electric power trading analytics software provides real-time visibility into the massive amount of data generated by the North American electrical grid daily. Our unique and innovative view of the data informs real-time trading decisions and mid-to-long-term investment decisions that keep utility prices low, support the energy transition, and keep the grid running. It’s both challenging work and work with a purpose. Be a part of our successful, growing business during international transformation.
Position Summary
We are hiring a Site Reliability Engineer to serve as a senior, hands-on reliability leader across all product lines. This role sits within the Systems Administration team, part of the Product Technology Services (PTS) group, and is focused squarely on operational excellence: incident response, systems availability, monitoring and alerting, release support, and reliability improvements across our production services.
During your working hours, you will be expected to take ownership of active incidents: respond to pages, coordinate response across engineering teams, diagnose production issues, restore service quickly, and drive clear communication through resolution. Incident response and operational readiness are central to the role, not occasional side responsibilities.
This is a senior individual contributor and team-lead role responsible for setting SRE standards, mentoring additional SREs as the function grows, unblocking engineering teams, and improving the systems, pipelines, and practices that keep Yes Energy products reliable at scale.
Position Details
Salary Range: Net 14.000 – 18.000 RON/month
Location: Hybrid (Bucharest, Romania)
Schedule: Full-time; 2-3 days in the office
Reporting to: Manager of Systems Administration
Primary Responsibilities
Respond to pages across all
How to Apply on Greenhouse
- Create a Greenhouse profile before applying — it saves time across multiple applications.
- Upload your resume as a PDF; the parser handles it better than Word.
- Answer all knockout questions carefully — wrong answers auto-reject before a human sees you.
- Enable email notifications to track application status in real time.