Reliability Engineer

1 month ago


Sydney, New South Wales, Australia Xero Full time

Xero is a leading cloud-based accounting software company that empowers small businesses and their advisors to grow and thrive. Our purpose is to make life better for people in small business, their advisors, and communities around the world. We support our people to do the best work of their lives so that they can help small businesses succeed through better tools, information, and connections.

In our Site Reliability Engineering (SRE) team, we drive and influence Xero to provide the most reliable experience for our customers. We are a global team based across New Zealand, Australia, and the USA. Our team combines software and systems engineering to enable engineers across Xero to build and support products that are observable, stable, performant, tolerant to failure, and operate as intended in the face of varying conditions.

We strive to maximise the impact of post-incident learning across the organisation to improve the reliability and robustness of the Xero platform, while providing enablement and training across observability, reliability engineering, incident management, and service ownership.

We also enable engineers across Xero by developing, supporting, and integrating a collection of proprietary and off-the-shelf tooling for incident management and response, incident analysis and learning, monitoring and observability, and resource ownership. We surface data and metrics, and provide detailed insights across operational health, production operations, and developer productivity.

About the Roles

We are currently seeking Software Engineers within our Reliability Tooling and Engineering Health teams in Site Reliability Engineering (SRE). Our teams develop and integrate a collection of tools that enable teams at Xero to easily visualise and manage operations and incidents, to support reliability, operational excellence, continuous delivery, and engineering productivity at Xero. In these roles, you will have the opportunity to leverage your technical experience to drive and contribute to team deliverables and also broader SRE and Xero initiatives.

As a member of our Reliability Tooling and Engineering Health teams, you will help enable and empower Xero engineering teams to improve their engineering practices through a combination of the following:

    • Contribute to the delivery of projects aligned with team goals, solving ambiguous problems with innovative solutions.
    • Design and maintain robust software components, understanding when to refactor or maintain existing systems.
    • Make data-driven decisions, balancing various perspectives to achieve well-rounded solutions.
    • Advocate for continuous improvement of systems and processes within the team, and across the organisation.
    • Establish reliable processes for feature rollouts, monitor success metrics, and ensure system health and quality.
    • Gain exposure to on-call duties, including incident management and response, troubleshooting, conducting post-incident reviews, and learning from incidents.

In order to be successful in this role, you will have:

    • Experience using software engineering to solve operational and reliability challenges and deliver technical initiatives.
    • Proficiency in one or more object-oriented programming languages (e.g. C#, JavaScript, Java, Python) or experience with infrastructure-as-code (e.g. Terraform, CloudFormation).
    • Experience working with cloud providers such as AWS, Azure, or GCP, alongside experience with logging and monitoring tooling such as Sumo Logic and New Relic.
    • Experience designing, developing, and operating internal developer tooling in a complex distributed systems environment.
    • Strong experience working in a DevOps environment, preferably with exposure to more mature CI/CD practices and capabilities.
    • The ability to work in a cross-functional, collaborative environment and identify technical dependencies to ensure project success.

Why Xero?

We offer a range of benefits, including generous paid leave, dedicated paid leave to care for your physical and mental wellbeing, health insurance, life insurance, and income protection. We also provide wellbeing and sports programmes, employee resource groups, 26 weeks of paid parental leave for primary caregivers, an Employee Share Plan, beautiful offices, flexible working, career development, and many other benefits that reflect our human value.



  • Sydney, New South Wales, Australia Macquarie Full time

    At Macquarie, a global financial services group operating in 34 markets, we're seeking an experienced Senior Site Reliability Engineer to join our Engineering Enablers team. We're committed to providing the most reliable products and service in the financial industry. As part of our team, you'll contribute to the delivery of software reliability and help...


  • Sydney, New South Wales, Australia Palantir Technologies Full time

    Job Description: Palantir Technologies is seeking a skilled Reliability Systems Engineer to join our Database Operations team. As a key member of this team, you will be responsible for designing, implementing, and maintaining scalable systems to support our products. Your expertise in troubleshooting complex issues and identifying areas for improvement will...


  • Sydney, New South Wales, Australia Cisco Full time

    Job Summary: We are seeking a highly motivated and detail-oriented Site Reliability Engineer Co-Op to join our Meraki SRE team. As a key member of our global engineering team, you will work on projects spanning our server hardware, operating systems, and tools for code deployment and service monitoring. Key Responsibilities: Collaborate with our SRE team to...


  • Sydney, New South Wales, Australia Commonwealth Bank of Australia Full time

    Senior Site Reliability Engineer Role at Commonwealth Bank of Australia: We are seeking an experienced Senior Site Reliability Engineer to join our team in Sydney, Melbourne, or Perth. About the Role: This is a key position that will play a crucial part in ensuring the reliability and performance of our systems, which support millions of customers. You will be...


  • Sydney, New South Wales, Australia SafetyCulture Full time

    SafetyCulture is a fast-growing company valued at AU$2.7Bn that helps businesses improve every day. We're constantly evolving our platform, expanding into sensors and scalable architecture, but we believe there's more to be done. As a Software Engineer in the Site Reliability Engineering team, you'll help design, build, and run resilient systems. You'll live by...


  • Sydney, New South Wales, Australia Macquarie Full time

    About the Role: We are seeking a skilled Senior Site Reliability Engineer to join our Engineering Enablers team. As a key member of our team, you will play a vital role in delivering software reliability and contributing to the development of our products. At Macquarie, we strive to create a diverse and inclusive work environment where everyone feels valued and...


  • Sydney, New South Wales, Australia ServiceNow Full time

    About ServiceNow: ServiceNow is a global market leader in innovative AI-enhanced technology, serving over 8,100 customers, including 85% of the Fortune 500. Our intelligent cloud-based platform connects people, systems, and processes to empower organizations to work smarter, faster, and better. Job Description: We're seeking a skilled Site Reliability Engineer to...


  • Sydney, New South Wales, Australia ServiceNow Full time

    About the Role: We're seeking a skilled Site Reliability Engineer to join our team at ServiceNow. As a key member of our infrastructure team, you'll play a critical role in ensuring the reliability and performance of our global market-leading platform. Key Responsibilities: Assist in resolving complex infrastructure issues and develop sustainable...


  • Sydney, New South Wales, Australia Microsoft Full time

    Job Summary: Kubernetes orchestrates applications on an infrastructure. The reliability of that infrastructure is of utmost importance to customers to ensure the availability of their applications. At Microsoft, we are working on improving the infrastructure reliability and stability so that our customers can experience the highest uptime possible in the...


  • Sydney, New South Wales, Australia Palantir Technologies Full time

    Job Title: Site Reliability Engineer. We are seeking a highly skilled Site Reliability Engineer to join our Database Operations team at Palantir Technologies. As a Site Reliability Engineer, you will play a critical role in scaling, maintaining, operating, and modernizing the databases behind our products. Key Responsibilities: Build expertise on pre-existing...


  • Sydney, New South Wales, Australia Talenza Full time

    Role Overview: As a Senior Site Reliability Engineer at Talenza, you will play a pivotal role in enhancing the observability of our client's infrastructure and applications. You will work closely with engineering and operations teams to implement and refine monitoring, logging, and alerting solutions, ensuring seamless service delivery and optimal system...


  • Sydney, New South Wales, Australia Evolution Mining Full time

    Job Summary: We are seeking a highly skilled Reliability Engineering Specialist to join our team at Evolution Mining. In this role, you will provide technical support to the Maintenance Department, focusing on analysis of data, failure analysis, and root cause investigation. The successful candidate will identify opportunities for business de-bottlenecking,...


  • Sydney, New South Wales, Australia Palantir Technologies Full time

    Transforming Data into Action: At Palantir Technologies, we're seeking a skilled Site Reliability Engineer to join our Database Operations team. As a Site Reliability Engineer, you'll play a critical role in scaling, maintaining, operating, and modernizing the databases behind our products. Our team strives to automate processes wherever possible, using the...


  • Sydney, New South Wales, Australia Macquarie Full time

    Software Reliability Role at Macquarie: We are looking for an experienced individual to contribute to the delivery of software reliability within our Engineering Enablers team. The goal is to provide the most reliable products and services for the financial industry. About the Role: As a key member of our team, you will be responsible for the application of...


  • Sydney, New South Wales, Australia VGW Full time

    Site Reliability Engineer Supervisor: VGW is an interactive entertainment company that harnesses technology and creativity to deliver world-class, free-to-play games. We are seeking an experienced Site Reliability Engineer Supervisor to join our Engineering team in Sydney. This role will focus on ensuring the reliability of our systems as we bring new games to...


  • Sydney, New South Wales, Australia Lanson Partners Full time

    Lanson Partners is seeking a Senior Site Reliability Engineer to join our team in Sydney. Your Role: As a Senior Site Reliability Engineer, you will work closely with software engineering teams and stakeholders to ensure the health and performance of our infrastructure and software. Key Responsibilities: Apply observability principles to infrastructure,...


  • Sydney, New South Wales, Australia Talenza Full time

    Sydney-based Senior Site Reliability Engineer Role: Are you a seasoned Site Reliability Engineer with a strong background in software engineering and a passion for ensuring the reliability, scalability, and performance of critical systems? Role Overview: As a Senior Site Reliability Engineer, you'll play a pivotal role in enhancing the observability of our...


  • Sydney, New South Wales, Australia SafetyCulture Full time

    At SafetyCulture, we're driven by a mission to empower businesses to continuously improve. We've recently achieved a valuation of AU$2.7Bn, and we're investing in creating a better workplace for all. We're growing rapidly and looking for talented individuals who value collaboration, growth, and learning to join our team. The Role: As a Software Engineer in the...


  • Sydney, New South Wales, Australia Microsoft Full time

    About the Role: We are seeking a highly skilled Reliability Engineer to join our Azure Customer Experience (CXP) Customer Reliability Engineering (CRE) Team. As a key member of our team, you will be responsible for improving customer experience on Azure by analyzing signals from various sources and driving root cause analyses (RCAs) and service improvements...


  • Sydney, New South Wales, Australia VGW Full time

    About the Role: We are seeking an experienced Site Reliability Engineer Supervisor to join our Engineering team in Sydney. As a Site Reliability Engineer Supervisor, you will be responsible for ensuring the reliability of our systems as we bring new games to life. Key Responsibilities: Ensure the reliability of our systems. Work with the Engineering team to...