Mindbeam
AI
Machine Learning Engineer - Kernels
Neural analysis suggests this role is optimal for mid-level candidates.
“Machine Learning Engineer - Kernels at Mindbeam. Skills: custom GPU/accelerator kernel development, low-level optimization, performance tuning. Design and implement custom GPU/accelerator kernels to maximize performance. Profile, benchmark, and optimize critical ML workloads.”
What You'll Achieve.
Push the boundaries of performance by developing custom kernels and low-level optimizations for next-generation AI workloads
Industry & Context.
The challenge of squeezing out every ounce of compute efficiency
What They're Looking For.
Must Have
2+ years of experience in GPU programming, parallel computing, or systems-level optimization
Strong coding skills in C++, CUDA, or similar languages
Familiarity with ML frameworks and their low-level backends
Experience optimizing workloads for distributed and heterogeneous compute environments
Comfort with profiling tools and performance diagnostics
What You'll Do.
Design and implement custom GPU/accelerator kernels to maximize performance
Profile, benchmark, and optimize critical ML workloads
Collaborate with researchers to translate algorithmic advances into efficient, production-ready code
Stay current with hardware advancements (CUDA, ROCm, TPU) to inform kernel design
Document and share best practices for low-level optimization
How You'll Work.
Team & Collaboration
Collaborate with researchers to translate algorithmic advances into efficient, production-ready code; thrive in a collaborative environment
Full Job Description
About Mindbeam
We are building the next-generation AI infrastructure for open source and enterprise. Our work is deeply research-oriented, and we are passionate about developing ground-breaking innovations that take state-of-the-art AI applications to the next level.

Mission
Push the boundaries of performance by developing custom kernels and low-level optimizations for next-generation AI workloads.

Role Expectations
• Design and implement custom GPU/accelerator kernels to maximize performance.
• Profile, benchmark, and optimize critical ML workloads.
• Collaborate with researchers to translate algorithmic advances into efficient, production-ready code.
• Stay current with hardware advancements (CUDA, ROCm, TPU) to inform kernel design.
• Document and share best practices for low-level optimization.

Background
• Bachelor’s, Master’s, or PhD in Computer Science, Electrical Engineering, or a related field, or equivalent experience.
• 2+ years of experience in GPU programming, parallel computing, or systems-level optimization.
• Strong coding skills in C++, CUDA, or similar languages.
• Familiarity with ML frameworks and their low-level backends.
• Experience optimizing workloads for distributed and heterogeneous compute environments.
• Comfort with profiling tools and performance diagnostics.

About You
You are detail-oriented, performance-obsessed, and excited by the challenge of squeezing out every ounce of compute efficiency. You enjoy working at the intersection of algorithms and hardware, and you thrive in a collaborative environment where bold ideas are encouraged.
How to Apply on Ashby
- Ashby is a fast, modern applicant tracking system (ATS); most applications take under 3 minutes.
- The resume parser is strong; verify parsed experience dates and job titles.
- Custom screening questions are often scored algorithmically — answer completely.
- Location field affects geo-based screening; use your actual metro area.