Mindbeam

AI

Machine Learning Engineer - Kernels

$150–190k United States FULL TIME Remote Friendly
Market Sentiment
HIGH DEMAND

Neural analysis suggests this role is optimal for mid-level candidates.

The Brief

“Machine Learning Engineer - Kernels at Mindbeam. Skills: custom GPU/accelerator kernels development, low-level optimization, performance tuning. Design and implement custom GPU/accelerator kernels to maximize performance. Profile, benchmark, and optimize critical ML workloads”

What You'll Achieve.

Push the boundaries of performance by developing custom kernels and low-level optimizations for next-generation AI workloads

Industry & Context.

AI
Problems you'll solve

The challenge of squeezing out every ounce of compute efficiency

What They're Looking For.

Must Have

  • 2+ years of experience in GPU programming, parallel computing, or systems-level optimization
  • Coding skills in C++, CUDA, or similar languages
  • Familiarity with ML frameworks and their low-level backends
  • Experience optimizing workloads for distributed and heterogeneous compute environments
  • Comfort with profiling tools and performance diagnostics

What You'll Do.

Design and implement custom GPU/accelerator kernels to maximize performance

Profile, benchmark, and optimize critical ML workloads

Collaborate with researchers to translate algorithmic advances into efficient, production-ready code

Stay current with hardware advancements (CUDA, ROCm, TPU) to inform kernel design

Document and share best practices for low-level optimization

How You'll Work.

Team & Collaboration

Collaborate with researchers to translate algorithmic advances into efficient, production-ready code; thrive in a collaborative environment

Full Job Description

About Mindbeam

We are building the next-generation AI infrastructure for open source and enterprise. Our work is deeply research-oriented, and we are passionate about developing ground-breaking innovations to take state-of-the-art AI applications to the next level.

Mission

Push the boundaries of performance by developing custom kernels and low-level optimizations for next-generation AI workloads.

Role Expectations

  • Design and implement custom GPU/accelerator kernels to maximize performance.
  • Profile, benchmark, and optimize critical ML workloads.
  • Collaborate with researchers to translate algorithmic advances into efficient, production-ready code.
  • Stay current with hardware advancements (CUDA, ROCm, TPU) to inform kernel design.
  • Document and share best practices for low-level optimization.

Background

  • Bachelor’s, Master’s, or PhD in Computer Science, Electrical Engineering, or a related field, or equivalent experience.
  • 2+ years of experience in GPU programming, parallel computing, or systems-level optimization.
  • Strong coding skills in C++, CUDA, or similar languages.
  • Familiarity with ML frameworks and their low-level backends.
  • Experience optimizing workloads for distributed and heterogeneous compute environments.
  • Comfort with profiling tools and performance diagnostics.

About You

You are detail-oriented, performance-obsessed, and excited by the challenge of squeezing out every ounce of compute efficiency. You enjoy working at the intersection of algorithms and hardware, and you thrive in a collaborative environment where bold ideas are encouraged.

How to Apply on Ashby

  • Ashby is a fast modern ATS — most applications take under 3 minutes.
  • The resume parser is strong; verify parsed experience dates and job titles.
  • Custom screening questions are often scored algorithmically — answer completely.
  • Location field affects geo-based screening; use your actual metro area.
