Amazon Development Center U.S., Inc.
Technology
Senior Software Development Engineer, EC2 Trainium AI Infra
“Senior Software Development Engineer, EC2 Trainium AI Infra at Amazon Development Center U.S., Inc. Skills: AI infrastructure, cloud computing. Lead technical strategy; design infrastructure services.”
Industry & Context.
Investigate performance challenges; Develop solutions
What They're Looking For.
Must Have
5+ years software development, 5+ years programming, 5+ years architecture design, Experience as mentor, Experience as tech lead, Experience leading engineering team
Nice to Have
5+ years full SDLC, Coding standards experience, Code reviews experience, Source control management experience, Build processes experience, Testing experience, Operations experience
What You'll Do.
Lead technical strategy
Design infrastructure services
Build infrastructure services
Operate infrastructure services
Provision AWS Trainium servers
Ensure availability of servers
Architect large-scale systems
Manage AI/ML infrastructure
Develop innovative technologies
Power infrastructure for AI workloads
Lead technical projects
Establish EC2 as the pioneer in cloud computing for AI/ML workloads
Influence architecture of provisioning systems
Improve provisioning systems
Operate provisioning systems efficiently
Investigate performance challenges
Develop solutions for challenges
Publish best practices
Manage end-to-end provisioning workflows
Manage host ingestion
How You'll Work.
Team & Collaboration
Cross-functional collaboration; Collaborate with capacity management; Collaborate with hardware engineering; Collaborate with datacenter teams
Communication Scope
Publish best practices
Full Job Description
The Software Development Engineer will lead the team in the technical strategy, design, build, and operation of infrastructure services, including the provisioning and availability of AWS Trainium-based AI servers. This role requires expertise in architecting large-scale systems, building microservices, and collaborating cross-functionally with teams such as capacity management, hardware engineering, and datacenter teams to manage AI/ML infrastructure.
Key job responsibilities
- Design and develop innovative technologies that power the infrastructure supporting AI workloads on UltraServers.
- Lead technical projects establishing EC2 as the pioneer in cloud computing for AI/ML workloads across diverse applications, including LLMs, multimodal systems, and emerging model architectures.
- Collaborate with various teams to influence the architecture of provisioning systems and improve them to operate efficiently at scale.
- Build customer relationships by investigating complex performance challenges, developing solutions, and publishing actionable best practices through multiple channels.
About the team
The EC2 UltraServer Provisioning team is a high-performing engineering organization responsible for delivering AWS Trainium-based UltraServer infrastructure at scale. We manage end-to-end provisioning workflows from host ingestion through testing, repair, and recovery.
Basic Qualifications:
- 5+ years of non-internship professional software development experience
- 5+ years of programming experience with at least one software programming language
- 5+ years of experience leading design or architecture (design patterns, reliability and scaling) of new and existing systems
- Experience as a mentor, tech lead or leading an engineering team
Preferred Qualifications:
- 5+ years of full software development life cycle experience, including coding standards, code reviews, source control management, build processes, testing, and operations
- Bachelor's degree in computer science or