Nebius

cloud infrastructure

Customer Support Engineer

Remote - Europe Remote Friendly

Market Sentiment

HIGH DEMAND

Neural analysis suggests this role is optimal for Senior candidates.

The Brief

“Customer Support Engineer at Nebius. Skills: Customer Support Engineering, Cloud Infrastructure, Linux, Kubernetes, AI/ML Workloads, Troubleshooting, Debugging. Investigate and resolve complex technical issues in customer environments. Troubleshoot across Linux, Kubernetes, cloud infrastructure, networking, storage, and GPU-related workloads”

What You'll Achieve.

Help make support more scalable through better automation, observability, and process improvements; Reduce repeated operational pain, not just close cases

Industry & Context.

cloud infrastructure

Problems you'll solve

Handle difficult technical issues; Investigate and resolve complex technical issues; Troubleshoot across Linux, Kubernetes, cloud infrastructure, networking, storage, and GPU-related workloads; Reproduce issues, narrow down root causes; Ability to work independently and stay effective when the path to resolution is not obvious; Enjoys debugging messy infrastructure problems; Thinks beyond the immediate ticket

Eligibility Requirements

Weekend rotation; Incident response; Authorized to work in the country in which they apply; Provide proof of employment eligibility as a condition of hire

What They're Looking For.

Must Have

Linux troubleshooting skills; Kubernetes and container experience; Solid understanding of cloud infrastructure in AWS, GCP, Azure, OpenStack, or similar environments; Good networking fundamentals; Ability to write scripts or small tools in Python, Bash, Go, or similar; Experience working on production issues that require structured debugging and cross-team collaboration; Ability to work independently and stay effective when the path to resolution is not obvious; Clear written communication, especially when explaining technical issues to customers and internal teams

Nice to Have

Experience with GPU-based infrastructure; Familiarity with AI/ML or LLM-related workloads; Understanding of inference and training pipelines; Experience improving observability, tooling, or operational workflows; History of building useful internal tools or automating repetitive work; Personal or open-source projects that show real technical depth

What You'll Do.

Investigate and resolve complex technical issues in customer environments

Troubleshoot across Linux, Kubernetes, cloud infrastructure, networking, storage, and GPU-related workloads

Support customers running containerized systems, inference workloads, training jobs, or other distributed platforms

Act as a senior escalation point for production incidents

Reproduce issues, narrow down root causes, and work with engineering on long-term fixes

Build or improve internal scripts, troubleshooting tools, and operational documentation

Help make support more scalable through better automation, observability, and process improvements

Communicate clearly with customers during active investigations and incidents

Take part in weekend coverage and urgent issue response

How You'll Work.

Team & Collaboration

Work closely with engineering on production issues; Experience working on production issues that require structured debugging and cross-team collaboration; Works well with engineering

Communication Scope

Communicate clearly with customers during active investigations and incidents; Clear written communication, especially when explaining technical issues to customers and internal teams

Full Job Description

About Nebius

Nebius is leading a new era in cloud infrastructure for the global AI economy. We are building a full-stack AI cloud platform that supports developers and enterprises from data and model training through to production deployment, without the cost and complexity of building large in-house AI/ML infrastructure.

Built by engineers, for engineers. From large-scale GPU orchestration to inference optimization, we own the hard problems across compute, storage, networking and applied AI.

Listed on Nasdaq (NBIS) and headquartered in Amsterdam, we have a global footprint with R&D hubs across Europe, the UK, North America and Israel. Our team of 1,500+ includes hundreds of engineers with deep expertise across hardware, software and AI R&D.

The role

We're looking for a senior support engineer who can handle difficult technical issues in modern cloud environments. This is not a traditional support role. The work is hands-on and technical: debugging Linux and Kubernetes issues, investigating problems in cloud infrastructure, and helping customers running AI workloads, distributed systems, and GPU-based environments.

You'll work closely with engineering on production issues, help improve internal tools and troubleshooting workflows, and act as an escalation point when problems are unclear or high impact. The role includes weekend rotation and incident response.

What you'll do

- Investigate and resolve complex technical issues in customer environments
- Troubleshoot across Linux, Kubernetes, cloud infrastructure, networking, storage, and GPU-related workloads
- Support customers running containerized systems, inference workloads, training jobs, or other distributed platforms
- Act as a senior escalation point for production incidents
- Reproduce issues, narrow down root causes, and work with engineering on long-term fixes
- Build or improve internal scripts, troubleshooting tools, and operational documentation
- Help make support more scalable through better automati

SKILL SIGNAL 51 detected · ranked by frequency

Linux ×4

Kubernetes ×4

Cloud Infrastructure ×3

Debugging Linux ×3

Debugging Kubernetes ×3

Troubleshooting cloud infrastructure ×3

Troubleshooting networking ×3

Troubleshooting storage ×3

Troubleshooting GPU-related workloads ×3

Containerization ×3

Inference ×3

Training ×3

Scripting in Python, Bash, Go ×3

Cloud infrastructure management (AWS, GCP, Azure, OpenStack) ×3

AI/ML workload support ×3

LLM workload support ×3

Customer Support Engineering ×2

AI/ML Workloads ×2

Troubleshooting ×2

Debugging ×2

Python ×2

Bash ×2

Go ×2

AWS ×2

GCP ×2

Azure ×2

OpenStack ×2

networking

storage

GPU-related workloads

containerized systems

inference workloads

BEHAVIOURAL

Ability to work independently; Stay effective when the path to resolution is not obvious; Clear written communication; Takes ownership without waiting to be told exactly what to do; Thinks beyond the immediate ticket; Works well with engineering; Looks for ways to reduce repeated operational pain, not just close cases

Role Details

Experience 5–10 yrs

Level Senior

Work Mode Remote

Category customer-support

AI-Extracted Insights

Domain Areas

ai-economy; full-stack-ai-cloud-platform; large-scale-gpu-orchestration; inference-optimization; compute; storage; networking; applied-ai
