Nebius
cloud infrastructure
Customer Support Engineer
Neural analysis suggests this role is optimal for Senior candidates.
“Customer Support Engineer at Nebius. Skills: Customer Support Engineering, Cloud Infrastructure, Linux, Kubernetes, AI/ML Workloads, Troubleshooting, Debugging. Investigate and resolve complex technical issues in customer environments. Troubleshoot across Linux, Kubernetes, cloud infrastructure, networking, storage, and GPU-related workloads”
What You'll Achieve.
Help make support more scalable through better automation, observability, and process improvements; Reduce repeated operational pain, not just close cases
Industry & Context.
Handle difficult technical issues; Investigate and resolve complex technical issues; Troubleshoot across Linux, Kubernetes, cloud infrastructure, networking, storage, and GPU-related workloads; Reproduce issues, narrow down root causes; Ability to work independently and stay effective when the path to resolution is not obvious; Enjoys debugging messy infrastructure problems; Thinks beyond the immediate ticket
Weekend rotation, Incident response, Authorized to work in the country in which they apply, Provide proof of employment eligibility as a condition of hire
What They're Looking For.
Must Have
Linux troubleshooting skills, Kubernetes and container experience, Solid understanding of cloud infrastructure in AWS, GCP, Azure, OpenStack, or similar environments, Good networking fundamentals, Ability to write scripts or small tools in Python, Bash, Go, or similar, Experience working on production issues that require structured debugging and cross-team collaboration, Ability to work independently and stay effective when the path to resolution is not obvious, Clear written communication, especially when explaining technical issues to customers and internal teams
Nice to Have
Experience with GPU-based infrastructure, Familiarity with AI/ML or LLM-related workloads, Understanding of inference and training pipelines, Experience improving observability, tooling, or operational workflows, History of building useful internal tools or automating repetitive work, Personal or open-source projects that show real technical depth
What You'll Do.
Investigate and resolve complex technical issues in customer environments
Troubleshoot across Linux, Kubernetes, cloud infrastructure, networking, storage, and GPU-related workloads
Support customers running containerized systems, inference workloads, training jobs, or other distributed platforms
Act as a senior escalation point for production incidents
Reproduce issues, narrow down root causes, and work with engineering on long-term fixes
Build or improve internal scripts, troubleshooting tools, and operational documentation
Help make support more scalable through better automation, observability, and process improvements
Communicate clearly with customers during active investigations and incidents
Take part in weekend coverage and urgent issue response
How You'll Work.
Team & Collaboration
Work closely with engineering on production issues; Experience working on production issues that require structured debugging and cross-team collaboration; Works well with engineering
Communication Scope
Communicate clearly with customers during active investigations and incidents; Clear written communication, especially when explaining technical issues to customers and internal teams
Full Job Description
About Nebius: Nebius is leading a new era in cloud infrastructure for the global AI economy. We are building a full-stack AI cloud platform that supports developers and enterprises from data and model training through to production deployment, without the cost and complexity of building large in-house AI/ML infrastructure. Built by engineers, for engineers. From large-scale GPU orchestration to inference optimization, we own the hard problems across compute, storage, networking and applied AI. Listed on Nasdaq (NBIS) and headquartered in Amsterdam, we have a global footprint with R&D hubs across Europe, the UK, North America and Israel. Our team of 1,500+ includes hundreds of engineers with deep expertise across hardware, software and AI R&D.
The role
We're looking for a senior support engineer who can handle difficult technical issues in modern cloud environments. This is not a traditional support role. The work is hands-on and technical: debugging Linux and Kubernetes issues, investigating problems in cloud infrastructure, and helping customers running AI workloads, distributed systems, and GPU-based environments. You'll work closely with engineering on production issues, help improve internal tools and troubleshooting workflows, and act as an escalation point when problems are unclear or high impact. The role includes weekend rotation and incident response.
What you'll do
- Investigate and resolve complex technical issues in customer environments
- Troubleshoot across Linux, Kubernetes, cloud infrastructure, networking, storage, and GPU-related workloads
- Support customers running containerized systems, inference workloads, training jobs, or other distributed platforms
- Act as a senior escalation point for production incidents
- Reproduce issues, narrow down root causes, and work with engineering on long-term fixes
- Build or improve internal scripts, troubleshooting tools, and operational documentation
- Help make support more scalable through better automati