Nscale
Technology
Infrastructure Operations Engineer
“Infrastructure Operations Engineer at Nscale. Skills: Infrastructure Operations, Data centre, AI Cloud, DevOps. Handle day-to-day tickets. Handle alerts”
What You'll Achieve.
Achieve superior results; Reduce complexity; Manage costs; Drive innovation; Ensure efficiency; Ensure reliability; Ensure scalability; Get value from services; Optimize processes
Industry & Context.
Problem solving; Troubleshooting; Root cause analysis
Support duty rotation, On-call rotation, Out-of-hours work, Availability to travel, Attendance of training courses
What They're Looking For.
Must Have
Comfortable problem solving, Making decisions on complex topics, Grasp technical concepts quickly, Analytical skills, Extremely organized, Diligence, Curious, Quick to learn, Platform and DC fundamentals, Linux fundamentals, Comfortable with CLI, Troubleshoot common issues, Networking basics, IP addressing, Subnets, VLANs, Routing at high level, DNS, Firewalls, Kubernetes core concepts, Basic troubleshooting, Follow runbooks, GPU awareness, Basic diagnostics, Observability foundations, Use dashboards and alerts, Gather evidence, Scripting and automation basics, Reading and writing simple Bash, Writing simple Python snippets, Using Git for version control, Cloud and virtualization basics, Familiarity with hypervisor flows, Familiarity with cloud troubleshooting flows
Nice to Have
Cluster-level administration experience, Advanced networking topics, BGP, VXLAN, Hands-on Kubernetes administration, Operators, Storage add-ons, Networking add-ons, Deeper GPU/HPC concepts, RDMA/InfiniBand, Performant distributed workload basics, Job schedulers, Used NCCL for performance troubleshooting, Infrastructure as Code, Config management tools, GitOps, CI/CD participation, Contributing to pipelines, Modernizing scripts, Experience with access and security tooling, Teleport, Vault, Progress toward relevant certifications
What You'll Do.
Handle day-to-day tickets
Escalate early and appropriately
Collaborate with Engineering
Keep parties informed
Follow established runbooks
Resolve common issues
Contribute incremental fixes
Keep tickets up to date
Communicate with customers
Learn platform fundamentals
Help customers get value
Participate in monitoring
Enable efficient handover
Deliver assigned tasks
Seek help when needed
Document validated steps
Contribute to training materials
Take part in incident reviews
Track preventative follow-ups
Identify automation opportunities
Collaborate with cross-functional teams
Participate in on-call
Participate in out-of-hours work
Travel to Nscale locations
Travel to Customer locations
Assist with deployments
Assist with troubleshooting
Assist with operational tasks
Attend supplier training
How You'll Work.
Team & Collaboration
Cross-functional teams; Senior stakeholders; Onsite operations staff
Communication Scope
Customer communications
Process & Methodology
Project work
Full Job Description
About Nscale
Nscale is the GPU cloud engineered for AI. We provide cost-effective, high-performance infrastructure for AI start-ups and large enterprise customers. Nscale enables AI-focused companies to achieve superior results by reducing the complexity of AI development. Our GPU cloud bolsters technical capabilities and directly supports strategic business outcomes, including cost management, rapid innovation, and environmental responsibility.
We thrive on a culture of relentless innovation, ownership, and accountability, where every team member takes pride in their work and drives it with excellence and urgency. As an Nscaler, you’ll build trust through openness and transparency, where everyone is inspired to do their best work. If you join our team, you’ll be contributing to building the technology that powers the future.
About the Role:
We’re looking for an Engineer who has good people, leadership and technical skills: a technical expert responsible for ensuring the efficiency, reliability, and scalability of data centre infrastructure.
You’re comfortable problem solving and making decisions on complex topics with high levels of ambiguity in a results-driven environment.
You’re comfortable influencing without authority and exceptional at building relationships with senior stakeholders across the business to get things done.
You have the understanding and skillset to grasp technical concepts and problems quickly.
You have strong analytical skills.
You’re a doer who is extremely organized and diligent.
You’re a self-starter, curious, and quick to learn, knowing what questions to ask to get up to speed quickly.
What You’ll Be Doing:
Join the Support duty rotation and handle day‑to‑day tickets and alerts, escalating early and appropriately.
Collaborate with Engineering, with guidance, when incidents or changes require it.
Accurately record, update, manage and resolve tickets using the ticketing system, whilst keeping all parties informed of each ticket’s progression.
Follow