Nvidia

telecommunications

Senior Multi‑GPU Signal Processing and System Architecture Engineer

$200–397k Santa Clara, California, United States FULL TIME Remote Friendly
Market Sentiment
HIGH DEMAND

Neural analysis suggests this role is optimal for Senior candidates.

The Brief

“Senior Multi‑GPU Signal Processing and System Architecture Engineer at Nvidia. Skills: GPU kernel design, CUDA, real-time system design, multi-device GPU systems, OFDM signal processing, 5G NR. Own the design and implementation of the real‑time signal‑processing subsystem that converts physics‑based channel descriptions into received signals for large numbers of emulated devices, across systems of potentially thousands of interconnected GPUs. Design and implement GPU kernels that apply time‑varying, mul”

Industry & Context.

telecommunications
Problems you'll solve

Assess design and implementation trade‑offs between physical fidelity, latency, and system scalability

What They're Looking For.

Must Have

  • PhD in high‑performance computing, computer architecture, signal processing, or wireless communications (or equivalent experience)
  • 12+ years of proven experience
  • Proficiency in CUDA kernel design with attention to memory hierarchy, register pressure, and HBM bandwidth planning, with a track record of writing production‑quality GPU code that meets hard real‑time deadlines
  • Demonstrated ability to build and reason about data flows across multi‑device GPU systems (NVLink, NIC/RDMA) with explicit bandwidth and latency accounting
  • Working knowledge of OFDM signal processing and the 5G NR physical layer, sufficient to implement and validate a channel‑emulation pipeline
  • Impactful publications involving GPU‑accelerated numerical workloads or real‑time system design

Nice to Have

  • Experience with GPU‑accelerated RAN platforms, L1/L2 software stacks, or channel emulators
  • Knowledge of high‑bandwidth GPU interconnects (NVLink, NVSwitch) and their scaling properties
  • Familiarity with massive MIMO beamformer design and MU‑MIMO precoding

What You'll Do.

  • Own the design and implementation of the real‑time signal‑processing subsystem that converts physics‑based channel descriptions into received signals for large numbers of emulated devices, across systems of potentially thousands of interconnected GPUs.
  • Design and implement GPU kernels that apply time‑varying, multi‑antenna channels to OFDM signals under hard real‑time deadlines.
  • Architect the inter‑cell data‑flow layer, ensuring that the information each cell needs to model interference from its neighbours is compressed, transported, and consumed within the available NVLink and NIC budgets at scale.
  • Work with the propagation engine and RAN stack teams to orchestrate the end‑to‑end simulation pipeline, ensuring that propagation updates, channel application, and stack execution remain synchronised across hundreds or thousands of GPUs.
  • Assess design and implementation trade‑offs between physical fidelity, latency, and system scalability.

How You'll Work.

Team & Collaboration

Work with the propagation engine and RAN stack teams

Full Job Description

We are seeking a self‑motivated senior engineer for the Aerial Omniverse Digital Twin team. This hire will own the design and implementation of the real‑time signal‑processing subsystem that converts physics‑based channel descriptions into received signals for large numbers of emulated devices, across systems of potentially thousands of interconnected GPUs. This position offers the opportunity to work on foundational technology for 5G and 6G network simulation, using NVIDIA's world‑class compute and interconnect platforms!

What you'll be doing:

As a member of NVIDIA's Aerial team, you will design and implement GPU kernels that apply time‑varying, multi‑antenna channels to OFDM signals under hard real‑time deadlines. You will architect the inter‑cell data‑flow layer, ensuring that the information each cell needs to model interference from its neighbours is compressed, transported, and consumed within the available NVLink and NIC budgets at scale. You will work with the propagation engine and RAN stack teams to orchestrate the end‑to‑end simulation pipeline, ensuring that propagation updates, channel application, and stack execution remain synchronised across hundreds or thousands of GPUs. You will assess design and implementation trade‑offs between physical fidelity, latency, and system scalability.

What we need to see:

  • PhD in high‑performance computing, computer architecture, signal processing, or wireless communications (or equivalent experience).
  • 12+ years of proven experience.
  • Proficiency in CUDA kernel design with attention to memory hierarchy, register pressure, and HBM bandwidth planning, with a track record of writing production‑quality GPU code that meets hard real‑time deadlines.
  • Demonstrated ability to build and reason about data flows across multi‑device GPU systems (NVLink, NIC/RDMA) with explicit bandwidth and latency accounting.
  • Working knowledge of OFDM signal processing and the 5G NR physical layer, sufficient to implement and val


How to Apply on Workday

  • Workday has a multi-step form — save your progress after every section.
  • "Apply With LinkedIn" can fail or lose data; manual entry is more reliable.
  • Watch for the "Submit for Review" final step — hitting "Save" alone does not submit.
  • Job requisition numbers are useful when following up with HR by email.
