Amazon Development Center U.S., Inc.

Technology

Software Development Engineer, EC2 Instance Networking

$165–224k · Santa Clara, California, United States · Full Time
Market Sentiment
HIGH DEMAND

Neural analysis suggests this role is optimal for Mid+ candidates.

The Brief

“Software Development Engineer, EC2 Instance Networking at Amazon Development Center U.S., Inc. Skills: networking software, RDMA, RoCE, AI infrastructure. Design and develop networking software solutions.”

Industry & Context.

Technology

Problems you'll solve

Debugging; Problem-solving skills

What They're Looking For.

Must Have

3+ years software development experience, 2+ years system design experience, C/C++ programming skills, RDMA technologies experience, Linux networking experience, Kernel development experience, Distributed systems experience, HPC clusters knowledge, Parallel programming knowledge

Nice to Have

3+ years full SDLC experience, SmartNIC programming experience, AI training infrastructure knowledge, Multi-rack cluster networking knowledge, Performance optimization experience, Benchmarking experience, System-level debugging experience, AI accelerator architectures knowledge, Scale-out communication patterns knowledge, Cloud infrastructure integration experience, Virtualization technologies experience, Problem-solving skills, Complex distributed systems experience, Algorithm design proficiency, Data structures proficiency, Linux operating system knowledge, Experience developing complex software systems, Professional software engineering practices knowledge, SDLC best practices knowledge, Ability to scope requirements, Ability to launch projects, Experience communicating with users, Experience communicating with technical teams, Experience communicating with management, Experience mentoring junior engineers, Experience driving engineering excellence

What You'll Do.

Design networking software solutions

Develop networking software solutions

Integrate SmartNIC acceleration hardware

Optimize collective communication patterns

Develop performance monitoring tools

Develop metrics collection tools

Develop benchmarking tools

Create automated testing frameworks

Create stress testing tools

Debug system-level issues

Collaborate on architecture decisions

Participate in design reviews

Participate in code reviews

Create technical documentation

How You'll Work.

Team & Collaboration

Cross-functional teams; Technical teams; Management

Communication Scope

Technical documentation; User communication; Team communication; Management communication

Process & Methodology

Scoping requirements; Project launch

Full Job Description

Join our team building the scale-out networking backbone that powers the world's largest AI training clusters. We're developing high-performance RDMA and RoCE solutions that enable distributed training of trillion-parameter models across thousands of compute nodes on AWS infrastructure. Our team is responsible for creating the networking software that connects massive AI accelerator clusters, focusing on SmartNIC integration, collective communication optimization, and ultra-high-bandwidth inter-rack connectivity. You'll be working at the intersection of cloud infrastructure and state-of-the-art AI hardware to solve some of the most challenging networking problems in distributed computing.

Key job responsibilities

* Design and develop high-performance networking software solutions utilizing RDMA and RoCE technologies for large-scale AI clusters
* Integrate SmartNIC acceleration hardware with EC2 control plane systems and APIs
* Implement and optimize collective communication patterns for distributed AI training workloads
* Develop comprehensive performance monitoring, metrics collection, and benchmarking tools for high-bandwidth cluster interconnects
* Create automated testing frameworks and stress testing tools for multi-rack distributed systems
* Debug complex system-level issues across hardware acceleration, kernel networking, and distributed applications
* Collaborate on architecture decisions for next-generation scale-out AI infrastructure
* Participate in design reviews, code reviews, and technical documentation

About the team

Utility Computing (UC)

AWS Utility Computing (UC) provides product innovations, from foundational services such as Amazon’s Simple Storage Service (S3) and Amazon Elastic Compute Cloud (EC2), to consistently released new product innovations that continue to set AWS’s services and features apart in the industry. As a member of the UC organization, you’ll support the development and management of Compute, Database, Storage, Internet of Thi
