Nscale
Technology
Principal Deployment Engineer
Neural analysis suggests this role is optimal for Senior candidates.
“Principal Deployment Engineer at Nscale. Skills: GPU Infrastructure, Cluster Bringup, High-speed Networking, Automation. Execute GPU node bringup. Execute rack bringup”
What You'll Achieve.
Bring clusters online quickly; Bring clusters online correctly; Meet performance baselines; Exceed performance baselines; Make deployment processes faster; Make deployment processes reliable; Build foundation for growth
Industry & Context.
Troubleshoot hardware issues; Troubleshoot firmware issues; Troubleshoot fabric issues; Identify reliability issues
Travel Required, Onsite work
What They're Looking For.
Must Have
7-8+ years of infrastructure engineering, 7-8+ years of hardware deployment, 7-8+ years of data center operations, Hands-on GPU server deployment, High-speed networking experience, Linux systems knowledge, Experience troubleshooting distributed systems performance
Nice to Have
AI/ML infrastructure experience, HPC environments experience, NCCL familiarity, CUDA familiarity, RDMA familiarity, Automation experience, High-density power environments, High-density cooling environments
What You'll Do.
Execute GPU node bringup
Validate BIOS/BMC/firmware
Perform rack integration
Validate power cabling
Bring up network fabrics
Validate network fabrics
Configure network connectivity
Validate network connectivity
Run cluster burn-in testing
Run cluster stress testing
Validate GPU-to-GPU performance
Validate node-to-node performance
Troubleshoot hardware issues
Troubleshoot firmware issues
Troubleshoot fabric issues
Contribute to automation
Improve deployment playbooks
Improve documentation
Identify reliability issues
Drive corrective actions
Turn ad hoc deployments into repeatable systems
Work with networking teams
Work with systems software teams
Work with data center teams
Coordinate with hardware vendors
Resolve bringup issues
Support capacity expansion
How You'll Work.
Team & Collaboration
Networking teams; Systems software teams; Data center teams; Hardware vendors
Full Job Description
Principal Deployment Engineer – GPU Infrastructure Bringup
Location: United States (Travel Required)
Team: Infrastructure
Reports to: Head of Infrastructure

About Us
We are building next-generation AI infrastructure from the ground up. Our mission is to deliver highly performant, reliable, and scalable GPU clusters purpose-built for large-scale AI training and inference. As a startup, we operate with urgency, ownership, and a bias toward action. We are assembling the foundational infrastructure that will power frontier AI workloads, and we’re looking for engineers who want to build it from zero to scale.

The Role
We are hiring a Principal Deployment Engineer to lead hands-on bringup of GPU clusters across our data center environments. You will own the execution of node, rack, and network deployment, ensuring clusters are validated, performant, and production-ready. This role is deeply technical and execution-focused. You will be in the details (cabling racks, validating firmware, tuning fabrics, debugging performance) and helping us build repeatable processes as we scale.

What You’ll Do

Cluster Deployment & Bringup
- Execute end-to-end bringup of GPU nodes and racks from installation to production readiness.
- Validate BIOS/BMC/firmware configurations and GPU health.
- Perform rack-level integration including power, cabling, and airflow validation.
- Bring up and validate high-speed network fabrics (InfiniBand, RoCE, 100–400G Ethernet).

Network & Performance Validation
- Configure and validate leaf/spine network connectivity.
- Run cluster-wide burn-in and stress testing.
- Validate GPU-to-GPU and node-to-node performance (NCCL, RDMA, GPUDirect).
- Troubleshoot hardware, firmware, and fabric-level issues.

Automation & Process
- Contribute to automation for provisioning and cluster validation.
- Improve deployment playbooks and documentation.
- Identify reliability issues early and drive corrective actions.
- Help turn ad hoc deployments into repeatable systems.

Cross-Functional Collaboratio
How to Apply on Greenhouse
- Create a Greenhouse profile before applying — it saves time across multiple applications.
- Upload your resume as a PDF; the parser handles it better than Word.
- Answer all knockout questions carefully — wrong answers auto-reject before a human sees you.
- Enable email notifications to track application status in real time.