
Lead Site Reliability Engineer
5 days ago
At Xero, we're here to help you supercharge your business. We do this by automating routine tasks, surfacing actionable insights and connecting businesses with the right data, advisors and apps. When that happens, we're not only making life better for small business, we'll be building a stronger economy that can change the world.
About the team
Xero's Incident and Problem Management team is part of the Site Reliability Engineering (SRE) organization and is responsible for the build, delivery and ongoing maintenance of robust processes and tooling around incident management.
The team is responsible for driving enduring reliability at Xero through robust, consistent and fast response to high severity incidents. They are responsible for building a world-class process and ensuring that that process matures as the demands of the business grow.
About the roles
We're looking for a Lead Engineer to join Xero's Incident and Problem Management team. This position requires an experienced SRE professional with a strong technical background, deep experience in SRE, a passion for building and delivering robust processes, and extensive experience of leading technical response to high severity cloud issues.
You will drive best practice across the business and contribute to the ongoing transformation of the Xero SRE culture. As an expert communicator, you will lead technical discussions to identify and track the actions that arise during incidents.
Across our SRE function, we're looking for people who are keen to dive deep into the causes of incidents and proactively examine the potential causes of future incidents, working with engineering teams to remove the risk of those failure scenarios. Ultimately, you will build playbooks and automation to ensure quick and effective responses. You will also provide ongoing training across the business to ensure the process is well understood and adhered to.
This role will form the backbone of a new team, providing a Technical Duty Officer (TDO) function within the business. TDOs are incident commanders who use SRE skillsets to drive fast mitigation and enduring resolution of impactful events.
What you'll do
- Own the incident management process, ensuring it drives enduring reliability across all products and services within Xero.
- Provide expert leadership during critical outages, coordinating multiple teams to ensure streamlined decision-making and quick resolution.
- Lead and advocate for the transformation to a world-leading SRE organization, promoting SRE principles within the Engineering Department.
- Promote a customer-focused approach by addressing and mitigating global customer environment issues, and fostering a culture of continuous learning and technical excellence within the SRE team.
- Develop and implement scalable process frameworks and observability strategies to ensure rapid problem diagnosis, response, and service reliability.
- Collaborate with product teams to thoroughly analyze failures and integrate insights to improve service reliability, scalability, and operational efficiency.
What you'll bring
- Previous career experience as a Site Reliability Engineer, in an Operations or Engineering environment
- Strong hands-on coding experience (preferably Python) and knowledge of software engineering best practice
- Networking knowledge and the ability to troubleshoot TCP/IP, SSL/TLS, DNSSEC, IPsec, and BGP issues
- Strong communication (oral & written) skills including the ability to translate technical issues/concepts into agreed actions
Why Xero?
Xero offers very generous paid leave to use however you'd like (plus statutory holidays), dedicated paid leave to care for your physical and mental wellbeing, and an Employee Assistance Program to access mental health care for you and your family, along with health insurance, life insurance, and income protection.
We offer wellbeing and sports programmes, employee resource groups, 26 weeks of paid parental leave for primary caregivers, an Employee Share Plan, beautiful offices, flexible working, career development, and many other benefits that reflect our human value.
You'll do the best work of your life at Xero
Seniority level
Not Applicable
Employment type
Full-time
Job function
Engineering and Information Technology
Industries
Software Development
-
Lead Site Reliability Engineer
5 days ago
Melbourne, Victoria, Australia Xero Full time Lead Site Reliability Engineer (Technical Duty Officer) At Xero, we're here to help you supercharge your business. We do this by automating routine tasks, surfacing actionable insights and connecting businesses with the right data, advisors and apps. When that happens, we're not only making life better for small business, we'll be building a stronger economy...
-
Site Reliability Engineers
3 weeks ago
Melbourne, Victoria, Australia Xero Full time Site Reliability Engineers (Observability) At Xero, we're here to help you supercharge your...
-
Site Reliability Engineer
2 days ago
Melbourne, Victoria, Australia Infosys Full time About Infosys Infosys is a global leader in next-generation digital services and consulting. We enable clients in more than 56 countries to navigate their digital transformation. With over four decades of experience in managing the systems and workings of global enterprises, we expertly steer our clients through their digital journey. We do it by enabling the...
-
Site Reliability Engineer
5 days ago
Melbourne, Victoria, Australia Infosys Full time About Infosys Infosys is a global leader in next-generation digital services and consulting. We enable clients in more than 56 countries to navigate their digital transformation. With over four decades of experience in managing the systems and workings of global enterprises, we expertly steer our clients through their digital journey. We do it by enabling the...
-
Site Reliability Lead
2 days ago
Melbourne, Victoria, Australia beBeeEngineer Full time $108,571 - $119,893 About Us: We enable clients to navigate their digital transformation. With over four decades of experience in managing the systems and workings of global enterprises, we expertly steer our clients through their digital journey. We do it by enabling the enterprise with an AI-powered core that helps prioritize the execution of change. We also empower the...
-
Site Reliability Engineer
1 day ago
Melbourne, Victoria, Australia Bupaoptical Full time About the Role We are seeking a Site Reliability Engineer (SRE) to own the stability, observability, and reliability of our non-production and production environments that support our mobile app delivery and customers. This role is responsible for ensuring development, integration, pre-production, and production environments remain healthy, available, and...
-
Site Reliability
13 hours ago
Melbourne, Victoria, Australia Canonical Full time Site Reliability / Gitops Engineer Canonical is a leading provider of open source software and operating systems to the global enterprise and technology markets. Our platform, Ubuntu, is very widely...
-
Senior Site Reliability Engineer, APAC
4 weeks ago
Melbourne, Victoria, Australia Ditto Full time About Ditto: Ditto is redefining how data moves at the edge. Our mission is to make it seamless for developers to build resilient, real-time applications, regardless of...