Software Development Engineer AI/ML, Inference Serving, AWS Neuron

4 weeks ago


City of Brisbane, Australia Amazon Full time

AWS Neuron is the software stack powering AWS Inferentia and Trainium machine learning accelerators, designed to deliver high-performance, low-cost inference at scale. The Neuron Serving team develops infrastructure to serve modern machine learning models—including large language models (LLMs) and multimodal workloads—reliably and efficiently on AWS silicon. We are seeking a Software Development Engineer to lead and architect our next-generation model serving infrastructure, with a particular focus on large-scale generative AI applications.

Key job responsibilities

- Architect and lead the design of distributed ML serving systems optimized for generative AI workloads

- Drive technical excellence in performance optimization and system reliability across the Neuron ecosystem

- Design and implement scalable solutions for both offline and online inference workloads

- Lead integration efforts with frameworks such as vLLM, SGLang, Torch XLA, TensorRT, and Triton

- Develop and optimize system components for tensor/data parallelism and disaggregated serving

- Implement and optimize custom PyTorch operators and NKI kernels

- Mentor team members and provide technical leadership across multiple work streams

- Drive architectural decisions that impact the entire Neuron serving stack

- Collaborate with customers, product owners, and engineering teams to define technical strategy

- Author technical documentation, design proposals, and architectural guidelines

A day in the life

You'll lead critical technical initiatives while mentoring team members. You'll collaborate with cross‑functional teams of applied scientists, system engineers, and product managers to architect and deliver state‑of‑the‑art inference capabilities. Your day might involve:

- Leading design reviews and architectural discussions

- Rapidly prototyping software to show customer value

- Debugging complex performance issues across the stack

- Mentoring junior engineers on system design and optimization

- Collaborating with research teams on new ML serving capabilities

- Driving technical decisions that shape the future of Neuron's inference stack

About the team

The Neuron Serving team is at the forefront of scalable and resilient AI infrastructure at AWS. We focus on developing model‑agnostic inference innovations, including disaggregated serving, distributed KV cache management, CPU offloading, and container‑native solutions. Our team is dedicated to upstreaming Neuron SDK contributions to the open‑source community, enhancing performance and scalability for AI workloads. We're committed to pushing the boundaries of what's possible in large‑scale ML serving.

Recent shares:

https://awsdocs-neuron.readthedocs-hosted.com/en/latest/libraries/nxd-inference/developer_guides/disaggregated-inference.html

Basic Qualifications

- 5+ years of programming experience in a modern language such as Java, C++, or C#, including object‑oriented design

- 5+ years of experience leading the design or architecture (design patterns, reliability, and scaling) of new and existing systems

- 5+ years of experience across the full software development life cycle, including coding standards, code reviews, source control management, build processes, testing, and operations

- 5+ years of non‑internship professional software development experience

- Experience as a mentor or tech lead, or leading an engineering team

Preferred Qualifications

- Master's degree in computer science or equivalent

- Deep expertise in ML frameworks and libraries such as JAX, PyTorch, vLLM, SGLang, Dynamo, Torch XLA, and TensorRT

Amazon is an equal opportunity employer and does not discriminate on the basis of protected veteran status, disability, or other legally protected status.

Los Angeles County applicants: Job duties for this position include: work safely and cooperatively with other employees, supervisors, and staff; adhere to standards of excellence despite stressful conditions; communicate effectively and respectfully with employees, supervisors, and staff to ensure exceptional customer service; and follow all federal, state, and local laws and Company policies. Criminal history may have a direct, adverse, and negative relationship with some of the material job duties of this position. These include the duties and responsibilities listed above, as well as the abilities to adhere to company policies, exercise sound judgment, effectively manage stress and work safely and respectfully with others, exhibit trustworthiness and professionalism, and safeguard business operations and the Company’s reputation. Pursuant to the Los Angeles County Fair Chance Ordinance, we will consider for employment qualified applicants with arrest and conviction records.

Our inclusive culture empowers Amazonians to deliver the best results for our customers. If you have a disability and need a workplace accommodation or adjustment during the application and hiring process, including support for the interview or onboarding process, please visit https://amazon.jobs/content/en/how-we-hire/accommodations for more information. If the country/region you’re applying in isn’t listed, please contact your Recruiting Partner.

Our compensation reflects the cost of labor across several US geographic markets. The base pay for this position ranges from $151,300/year in our lowest geographic market up to $261,500/year in our highest geographic market. Pay is based on a number of factors including market location and may vary depending on job‑related knowledge, skills, and experience. Amazon is a total compensation company. Dependent on the position offered, equity, sign‑on payments, and other forms of compensation may be provided as part of a total compensation package, in addition to a full range of medical, financial, and/or other benefits. For more information, please visit https://www.aboutamazon.com/workplace/employee-benefits. This position will remain posted until filled. Applicants should apply via our internal or external career site.


