Yes Energy

Electric Power Data and Analytics

Site Reliability Engineer

14,000–18,000 RON/month (net) · Bucharest, Romania · FULL TIME · Hybrid (2-3 days in the office)

The Brief

“Site Reliability Engineer at Yes Energy. Skills: Site Reliability Engineering, Incident Management, Cloud Operations. Respond to pages. Lead incident response.”

What You'll Achieve.

operational excellence; incident response; systems availability; monitoring and alerting; release support; reliability improvements across our production services; reduces repeat incidents; prevents similar future alerts; improves overall service reliability; issues are detected quickly; responders have useful context; reliability, scalability, and availability; improve operational readiness; improve production support models; improve reliability practices; growth of a stronger site reliability function

Industry & Context.

Electric Power Data and Analytics

Problems you'll solve

diagnose production issues; diagnose and resolve availability and performance issues; diagnose and fix Jenkins jobs, CI/CD pipelines, deployment failures, environment issues, and release blockers

Eligibility Requirements

take ownership of active incidents, respond to pages, senior individual contributor and team-lead role

What They're Looking For.

Must Have

Bachelor's or Master's degree in Computer Science, Information Technology, or a related field, or equivalent practical experience

Minimum of five years of experience supporting mission-critical production infrastructure, SaaS platforms, web applications, or service-oriented systems

Deep hands-on AWS experience, including production operations for compute, networking, IAM, storage, load balancing, monitoring

Proven incident management experience, including responding to pages, leading high-severity incidents, coordinating responders, writing postmortems and RCA, and driving corrective actions

Experience with containers and Kubernetes, monitoring and alerting systems, CI/CD tooling such as Jenkins and Bitbucket, and operational automation or scripting

Linux and Windows systems administration and troubleshooting experience in production environments

Nice to Have

greater depth is strongly valued

What You'll Do.

Lead incident response

Drive root-cause remediation

Reduce repeat incidents

Prevent similar future alerts

Improve overall service reliability

Serve as incident owner

Coordinate cross-functional responders

Make clear decisions under pressure

Restore service quickly

Build and improve monitoring

Build and improve alerting

Build and improve dashboards

Build and improve SLOs

Build and improve runbooks

Build and improve escalation processes

Operate and troubleshoot Linux systems

Operate and troubleshoot Windows systems

Support production web applications

Support Kubernetes workloads

Work with load balancers

Work with forward proxies

Work with reverse proxies

Work with security groups

Work with traffic-routing patterns

Unblock engineering teams

Diagnose and fix Jenkins jobs

Diagnose and fix CI/CD pipelines

Diagnose and fix deployment failures

Diagnose and fix environment issues

Diagnose and fix release blockers

Partner with Engineering teams

Partner with Security teams

Partner with DBA teams

Partner with Product Technology Services teams

Improve operational readiness

Improve production support models

Improve reliability practices

Mentor SRE team members

Mentor Systems team members

Establish practical standards

Lead growth of site reliability function

How You'll Work.

Team & Collaboration

Coordinate response across engineering teams; Coordinate cross-functional responders; Partner with Engineering, Security, DBA, and Product Technology Services teams; work in small teams on well-defined projects; play to the strengths and experience of each person; work along a continuum of roles adjacent to our focus area

Communication Scope

driving clear communication through resolution; provide technical leadership; delegate effectively

Full Job Description

Join the Market Leader in Electric Power Data and Analytics Solutions

The electrical grid is the largest and most complicated machine ever built. Yes Energy’s industry-leading electric power trading analytics software provides real-time visibility into the massive amount of data generated by the North American electrical grid daily. Our unique and innovative view of the data informs real-time trading decisions and mid-to-long-term investment decisions that keep utility prices low, support the energy transition, and keep the grid running. It’s both challenging work and work with a purpose. Be a part of our successful, growing business during international transformation.

Position Summary

We are hiring a Site Reliability Engineer to serve as a senior, hands-on reliability leader across all product lines. This role sits within the Systems Administration team, part of the Product Technology Services (PTS) group, and is focused squarely on operational excellence: incident response, systems availability, monitoring and alerting, release support, and reliability improvements across our production services.

During your working hours, you will be expected to take ownership of active incidents: respond to pages, coordinate response across engineering teams, diagnose production issues, restore service quickly, and drive clear communication through resolution. Incident response and operational readiness are central to the role, not occasional side responsibilities.

This is a senior individual contributor and team-lead role responsible for setting SRE standards, mentoring additional SREs as the function grows, unblocking engineering teams, and improving the systems, pipelines, and practices that keep Yes Energy products reliable at scale.

Position Details

Salary Range: Net 14.000 – 18.000 RON/month
Location: Hybrid (Bucharest, Romania)
Schedule: Full-time; 2-3 days in the office
Reporting to: Manager of Systems Administration

Primary Responsibilities

Respond to pages across all

SKILL SIGNAL 41 detected · ranked by frequency

Incident Response ×3

Monitoring ×3

Alerting ×3

Dashboards ×3

SLOs ×3

Runbooks ×3

Escalation Processes ×3

Load Balancers ×3

Proxies ×3

DNS ×3

Networking ×3

Firewalls ×3

Security Groups ×3

Traffic Routing ×3

CI/CD Pipelines ×3

Deployment Failures ×3

Operational Readiness ×3

Production Support ×3

Reliability Practices ×3

Scripting ×3

Automation ×3

Site Reliability Engineering ×2

Incident Management ×2

Cloud Operations ×2

AWS ×2

Azure ×2

OCI ×2

Kubernetes ×2

Jenkins ×2

Bitbucket ×2

Terraform ×2

CloudFormation ×2

Role Details

Experience 5–10 yrs

Level Senior

Work Mode Hybrid

Type FULL TIME

Education Bachelor's or Master's degree in Computer Science, Informati

Category 4300-tf-eng

Salary Band <30k

AI-Extracted Insights

Domain Areas

electric-power-data; analytics-solutions; electrical-grid; energy-transition

How to Apply on Greenhouse

Create a Greenhouse profile before applying — it saves time across multiple applications.
Upload your resume as a PDF; the parser handles it better than Word.
Answer all knockout questions carefully — wrong answers auto-reject before a human sees you.
Enable email notifications to track application status in real time.
