Company
AI
Sr. AI Inference Systems Engineer
“Sr. AI Inference Systems Engineer. Skills: AI Inference Optimization, Heterogeneous Computing, Large Model Inference, KV Cache, Router Architecture, Hardware Accelerator Tuning, Distributed Systems, CUDA, Triton. Lead the optimization of the full inference pipeline for Large Models (LLM, Multimodal). Conduct in-depth research into the underlying inference logic of various hardware accelerators”
What You'll Achieve.
Maximize throughput; Minimize latency
Industry & Context.
Resolve long-tail issues such as communication latency and load imbalance in distributed inference; Overcome key technical bottlenecks in inference design
What They're Looking For.
Must Have
Master’s or Ph.D. in Computer Science, Electronic Engineering, AI, or a related field; Significant professional experience in AI inference optimization or heterogeneous computing; Proficient in at least one AI accelerator architecture; Mastery of core inference optimization techniques, including multi-level KV Cache management, Quantization, and Intelligent Routing; Expert in parallel computing and distributed systems; Deep understanding of low-level programming models (e.g., CUDA, Triton) and inference engine architectures; Familiar with mainstream deep learning frameworks (e.g., PyTorch, TensorFlow)
Nice to Have
Experience in optimizing ultra-large-scale models; Experience in tuning ultra-large-scale inference clusters; High-level publications or core patents in AI inference or related fields
What You'll Do.
Lead the optimization of the full inference pipeline for Large Models (LLM, Multimodal)
Conduct in-depth research into the underlying inference logic of various hardware accelerators
Design and implement high-performance inference, optimizing scheduling and memory management
Track global advancements in inference technology
Drive the productization of emerging technologies within production environments
Lead efforts to overcome key technical bottlenecks in inference design
Develop standardized optimization schemes
How You'll Work.
Team & Collaboration
Cross-team collaboration skills; Collaborative operator optimization
Process & Methodology
Proven track record of leading complex inference projects to fruition
How to Apply on Workday
- Workday has a multi-step form — save your progress after every section.
- "Apply With LinkedIn" can fail or lose data; manual entry is more reliable.
- Watch for the "Submit for Review" final step — hitting "Save" alone does not submit.
- Job requisition numbers are useful when following up with HR by email.