MatX
AI
Runtime Engineer
Neural analysis suggests this role is optimal for mid-level candidates.
“Runtime Engineer at MatX. Skills: systems programming, Python interop, API/ABI contracts, accelerator programming. Build host-side interface library. Own and extend executable format”
What You'll Achieve.
measurable performance targets
Industry & Context.
U.S. export controls
What They're Looking For.
Must Have
systems programming language, memory management, allocator design, FFI/ABI work, Python interop layers, API or ABI contracts, accelerator programming model, ML-systems literate
Nice to Have
LLM inference internals, Rust at depth, Custom allocator design, ML framework integration, Profiler or tracing infrastructure, Driver-adjacent or kernel-bypass work, new-silicon bring-up
What You'll Do.
Build host-side interface library
Own and extend executable format
Design custom-kernel ABI
Build Python bindings
Build LLM inference serving stack
Bring up interconnect topology
Design chip profilers
Hit measurable performance targets
How You'll Work.
Team & Collaboration
contracts that bind teams together
Full Job Description
What MatX is Building
MatX is building custom silicon for large-language-model inference and training, with HW/SW co-design across ISA, RTL, simulator, compiler, and kernels so each layer benefits from the others. The runtime owns the host-side stack and the contracts that bind those teams together.
What You'll Do Here
- Build the host-side interface library (device memory management, DMA, streams and events, sync primitives) that every compiler-emitted program runs on top of
- Own and extend the executable format: the compiler→runtime contract, its versioning, the weight and quantization layouts that let compiler and runtime evolve independently
- Design the custom-kernel ABI (calling convention, sync semantics, lifecycle) and the host-side marshaling layer (DLPack, the buffer protocol, numpy) that gets Python tensors to the device
- Build Python bindings via PyO3, with a C-ABI shim as the alternative integration path for downstream consumers
- Build the LLM inference serving stack (paged KV cache, continuous batching, request scheduling, token streaming) and the cluster orchestration primitives underneath it
- Bring up interconnect topology from the host and own the failure-detection and clean-teardown path for stop-restructure-resume recovery across racks
- Design what the chip exposes to host-side profilers and debuggers (perf counters, traces, and the Python surfaces ML engineers actually use) and hit measurable performance targets on runtime overhead and serving throughput
Who You Are
- Strong experience in a systems programming language (Rust, C, C++, or Go), including memory management, allocator design, and FFI/ABI work
- Have built Python interop layers in production (PyO3, ctypes, pybind11, or equivalent C-ABI bridging)
- Have designed and maintained API or ABI contracts between teams (versioning, evolution, breaking-change discipline), not just consumed someone else's
- Hands-on with at least one accelerator programming model (CUDA, ROCm, oneAPI Level Zero, TPU, or
How to Apply on Greenhouse
- Create a Greenhouse profile before applying — it saves time across multiple applications.
- Upload your resume as a PDF; the parser handles it better than Word.
- Answer all knockout questions carefully — wrong answers auto-reject before a human sees you.
- Enable email notifications to track application status in real time.