One Thing
Technology
Principal Engineer, Machine Learning, SMAI
“Principal Engineer, Machine Learning, SMAI at One Thing. Skills: Machine Learning, GenAI, Agentic AI, Large Language Models. Architect model training jobs. Execute model training jobs”
What You'll Achieve.
Deliver industry-winning ML solutions; Deliver custom GenAI solutions; Deliver Agentic AI solutions; Power Micron's dominance; Drive value from manufacturing processes; Drive insight from manufacturing systems
Industry & Context.
Analytical thinking; Problem solving
What They're Looking For.
Must Have
Technical Degree required, 9+ years building scalable ETL pipelines, 9+ years of experience with big data processing, 9+ years developing applications and data sources, Deep understanding of GPU architecture, Experience managing GPU resources, Hands-on experience with DDP, Hands-on experience with FSDP, Hands-on experience with model parallelism techniques, Proficiency in fine-tuning LLMs using PEFT, Optimizing inference engines, Experience developing GenAI applications, Experience developing AI Agents, Proficiency with Large Language Models, Experience building end-to-end ML systems, Familiarity with machine learning frameworks, Software development skills, Scripting and programming skills in Python or Java, Experience with CI/CD tools, Outstanding analytical thinking, Outstanding interpersonal skills, Outstanding oral communication skills, Outstanding written communication skills, Ability to prioritize, Ability to meet critical project timelines
Nice to Have
Computer Science or Statistics background highly desired, Experience with HPC job schedulers, Experience orchestrating GPU workloads on Kubernetes, Knowledge of lower-level optimization, Experience with Multi-Agent Systems, Experience orchestrating collaboration between agents, Deep knowledge of math, Deep knowledge of probability, Deep knowledge of statistics, Deep knowledge of algorithms, Demonstrated ability to study data science prototypes, Demonstrated ability to transform data science prototypes, Knowledge of computer vision, Knowledge of signal processing
What You'll Do.
Architect model training jobs
Execute model training jobs
Execute fine-tuning jobs
Optimize training throughput
Optimize memory efficiency
Automate manufacturing workflows
Implement Agentic frameworks
Orchestrate LLM interactions
Profile GPU performance
Debug GPU performance bottlenecks
Maximize hardware utilization
Maintain data pipelines
Build solution pipelines
Maintain solution pipelines
Feed machine learning models
Feed GenAI applications
Design data structures
Optimize data structures
Enable AI/ML solutions
Enable Agentic solutions
Create CI/CD pipelines
Maintain CI/CD pipelines
How You'll Work.
Team & Collaboration
Collaborate with Data Scientists; Collaborate with Data Engineers; Collaborate with expert users
Communication Scope
Oral communication; Written communication
Process & Methodology
CI/CD, Agile
Full Job Description
**Our vision is to transform how the world uses information to enrich life for all.**

Join an inclusive team passionate about one thing: using their expertise in the relentless pursuit of innovation for customers and partners. The solutions we build help make everything from virtual reality experiences to breakthroughs in neural networks possible. We do it all while committing to integrity, sustainability, and giving back to our communities. Because doing so can fuel the very innovation we are pursuing.

The Smart Manufacturing and AI team at Micron Technology is looking for an ambitious Machine Learning Engineer (Principal Engineer). Our mission is to deliver industry-winning machine learning, custom GenAI, and Agentic AI solutions to power Micron’s dominance in the highly competitive memory solutions market. Qualified applicants will have experience in a variety of data and cloud technologies and have extensive practice modeling data, querying, and deploying scalable data pipelines to execute machine learning models and AI agents. You will collaborate with Data Scientists, Data Engineers, and expert users to build and deploy scalable AI/ML solutions that drive value and insight from Micron’s manufacturing processes and systems.

**Responsibilities include, but are not limited to:**

* Architect and execute large-scale custom model training and fine-tuning jobs (SFT, RLHF) on multi-node, multi-GPU clusters.
* Optimize training throughput and memory efficiency using distributed training strategies (FSDP, DeepSpeed, Megatron-LM) and mixed-precision techniques (FP16/BF16).
* Design and develop autonomous AI Agents capable of multi-step reasoning, planning, and tool execution to automate complex manufacturing workflows.
* Implement Agentic frameworks (e.g., LangChain, LangGraph, CrewAI) to orchestrate LLM interactions with internal APIs, databases, and software tools.
* Profile and debug GPU performance bottlenecks using tools like Nsight Systems or PyTorch Profiler to maximi
How to Apply on Workday
- Workday has a multi-step form — save your progress after every section.
- "Apply With LinkedIn" can fail or lose data; manual entry is more reliable.
- Watch for the "Submit for Review" final step — hitting "Save" alone does not submit.
- Job requisition numbers are useful when following up with HR by email.