Nvidia
telecommunications
Senior Multi‑GPU Signal Processing and System Architecture Engineer
Senior Multi‑GPU Signal Processing and System Architecture Engineer at Nvidia. Skills: GPU kernel design, CUDA, real-time system design, multi-device GPU systems, OFDM signal processing, 5G NR.
Industry & Context.
assess design and implementation trade‑offs between physical fidelity, latency, and system scalability
What They're Looking For.
Must Have
- PhD in high‑performance computing, computer architecture, signal processing, or wireless communications (or equivalent experience).
- 12+ years of proven experience.
- Proficiency in CUDA kernel design with attention to memory hierarchy, register pressure, and HBM bandwidth planning, with a track record of writing production‑quality GPU code that meets hard real‑time deadlines.
- Demonstrated ability to build and reason about data flows across multi‑device GPU systems (NVLink, NIC/RDMA) with explicit bandwidth and latency accounting.
- Working knowledge of OFDM signal processing and the 5G NR physical layer, sufficient to implement and validate a channel‑emulation pipeline.
- Impactful publications involving GPU‑accelerated numerical workloads or real‑time system design.
Nice to Have
- Experience with GPU‑accelerated RAN platforms, L1/L2 software stacks, or channel emulators.
- Knowledge of high‑bandwidth GPU interconnects (NVLink, NVSwitch) and their scaling properties.
- Familiarity with massive MIMO beamformer design and MU‑MIMO precoding.
What You'll Do.
- Own the design and implementation of the real‑time signal‑processing subsystem that converts physics‑based channel descriptions into received signals for large numbers of emulated devices, across systems of potentially thousands of interconnected GPUs.
- Design and implement GPU kernels that apply time‑varying, multi‑antenna channels to OFDM signals under hard real‑time deadlines.
- Architect the inter‑cell data‑flow layer, ensuring that the information each cell needs to model interference from its neighbours is compressed, transported, and consumed within the available NVLink and NIC budgets at scale.
- Work with the propagation engine and RAN stack teams to orchestrate the end‑to‑end simulation pipeline, ensuring that propagation updates, channel application, and stack execution remain synchronised across hundreds or thousands of GPUs.
- Assess design and implementation trade‑offs between physical fidelity, latency, and system scalability.
How You'll Work.
Team & Collaboration
Work with the propagation engine and RAN stack teams.
Full Job Description
We are seeking a self‑motivated senior engineer for the Aerial Omniverse Digital Twin team. This hire will own the design and implementation of the real‑time signal‑processing subsystem that converts physics‑based channel descriptions into received signals for large numbers of emulated devices, across systems of potentially thousands of interconnected GPUs. This position offers the opportunity to work on foundational technology for 5G and 6G network simulation, using NVIDIA's world‑class compute and interconnect platforms!
What you'll be doing:
As a member of NVIDIA's Aerial team, you will design and implement GPU kernels that apply time‑varying, multi‑antenna channels to OFDM signals under hard real‑time deadlines. You will architect the inter‑cell data‑flow layer, ensuring that the information each cell needs to model interference from its neighbours is compressed, transported, and consumed within the available NVLink and NIC budgets at scale. You will work with the propagation engine and RAN stack teams to orchestrate the end‑to‑end simulation pipeline, ensuring that propagation updates, channel application, and stack execution remain synchronised across hundreds or thousands of GPUs. You will assess design and implementation trade‑offs between physical fidelity, latency, and system scalability.
What we need to see:
- PhD in high‑performance computing, computer architecture, signal processing, or wireless communications (or equivalent experience).
- 12+ years of proven experience.
- Proficiency in CUDA kernel design with attention to memory hierarchy, register pressure, and HBM bandwidth planning, with a track record of writing production‑quality GPU code that meets hard real‑time deadlines.
- Demonstrated ability to build and reason about data flows across multi‑device GPU systems (NVLink, NIC/RDMA) with explicit bandwidth and latency accounting.
- Working knowledge of OFDM signal processing and the 5G NR physical layer, sufficient to implement and val
How to Apply on Workday
- Workday has a multi-step form — save your progress after every section.
- "Apply With LinkedIn" can fail or lose data; manual entry is more reliable.
- Watch for the "Submit for Review" final step — hitting "Save" alone does not submit.
- Job requisition numbers are useful when following up with HR by email.