System Reliability Specialist

6 days ago


Melbourne, Victoria, Australia beBeeReliability Full time $102,921 - $145,562
About the Role

We are seeking a highly skilled individual to take responsibility for the stability, observability, and reliability of the non-production and production environments that support our mobile app delivery and customer engagement.

This position is responsible for keeping development, integration, pre-production, and production environments healthy, available, and performant, so that engineering teams can work efficiently without disruption and customers can access our services 24/7.

Key Responsibilities
  • Monitor and maintain uptime, stability, and reliability of non-production and production environments.
  • Establish and configure smoke tests across APIs and critical services to proactively detect downstream failures.
  • Design, implement, and maintain dashboards and alerting systems that provide real-time visibility into environment health and dependencies.
  • Investigate and troubleshoot incidents impacting our environments, escalating where necessary.
  • Partner with development and DevOps teams to understand CI/CD pipelines and ensure environment availability aligns with build and deployment processes.
  • Define metrics and service level objectives (SLOs) for non-production environments.
  • Implement automation for environment validation, monitoring, and recovery.
  • Document environment architecture, monitoring configurations, and incident runbooks.
  • Advocate for reliability best practices and contribute to a culture of proactive environment management by integrating innovative tools into our process.
Requirements
  • Relevant tertiary qualification at degree level in an IT discipline.
  • Strong experience in site reliability engineering, DevOps, or systems engineering roles.
  • Proven ability to build monitoring dashboards and alerting.
  • Hands-on experience with API testing and automation frameworks for smoke/health checks.
  • Understanding of CI/CD pipelines and build orchestration tools.
  • Knowledge of cloud infrastructure and containerized environments.
  • Familiarity with observability practices.
  • Strong troubleshooting and problem-solving skills across complex dependencies.
  • Familiarity with using predictive analytics to optimize infrastructure allocation based on traffic patterns.


  • Melbourne, Victoria, Australia beBeeSystem Full time $108,571 - $119,893

    Reliable System Specialist
    We seek a skilled and dedicated Reliable System Specialist to join our team. The primary goal of this role is to ensure that production systems are reliable, performant, and scalable. This will be achieved by identifying areas for improvement and implementing solutions to reduce downtime, prevent failures, and maintain system...


  • Melbourne, Victoria, Australia beBeeReliability Full time $96,836 - $105,068

    Job Title: System Reliability Specialist
    This is a pivotal role that entails ensuring the reliability and efficiency of complex systems. We are looking for a skilled professional who can define, implement, and maintain service level objectives (SLOs), service level indicators (SLIs), and service level agreements (SLAs) to guarantee high-quality services.
    Main...


  • Melbourne, Victoria, Australia beBeeMechanical Full time $79,999 - $124,999

    Job Title: Mechanical Equipment Specialist
    About the Role
    As a mechanical equipment specialist, you will be responsible for ensuring that mission-critical mechanical systems are properly commissioned, tested, and maintained.
    Key Responsibilities:
    Developing and implementing test scripts for mechanical equipment and systems.
    Attending vendor meetings and factory...


  • Melbourne, Victoria, Australia beBeeReliability Full time $108,571 - $119,893

    Reliable Systems Expertise Wanted
    We are seeking a seasoned expert in system reliability to join our team. As a Site Reliability Engineer, you will be responsible for ensuring the stability, scalability, and performance of our systems.
    Key Responsibilities:
    Design and Implement Reliable Systems
    Develop Service Level Objectives (SLOs), Service Level...


  • Melbourne, Victoria, Australia beBeeReliability Full time $108,571 - $119,893

    About reliability engineering
    Reliability engineering is a discipline that focuses on ensuring the reliability, scalability, and performance of complex systems.
    Role Overview
    We are seeking a System Reliability Engineer to join our team in Melbourne, Australia. This role will focus on defining Service Level Objectives (SLOs), monitoring system performance,...


  • Melbourne, Victoria, Australia beBeeEngineer Full time $108,571 - $119,893

    Site Reliability Engineer
    The primary focus is ensuring system reliability, scalability and performance.
    Key Responsibilities:
    Define SLOs, SLIs and SLAs for reliability.
    Monitor system performance and reduce toil.
    Capacity planning and scaling.
    Automate reliability improvements.
    Ensure production systems are reliable, performant and scalable.
    Required Skills and...


  • Melbourne, Victoria, Australia Bebeereliability Full time

    Job Overview
    We are seeking a skilled professional to take on the role of ensuring the reliability, scalability and performance of our systems.
    Key Responsibilities
    Define Service Level Objectives (SLOs), Service Level Indicators (SLIs) and Service Level Agreements (SLAs) for reliability.
    Monitor system performance and capacity plan to ensure optimal resource...


  • Melbourne, Victoria, Australia beBeeSystem Full time $108,893

    Reliable Systems Engineer
    We are seeking a skilled and experienced System Reliability Engineer to join our team. The successful candidate will be responsible for ensuring the reliability, scalability, and performance of our production systems.
    Key Responsibilities:


  • Melbourne, Victoria, Australia beBeeAzure Full time $108,571 - $119,893

    Job Title: Infrastructure Reliability Specialist
    About this Role:
    This role is responsible for ensuring the reliability, scalability, and performance of production systems.
    Key Responsibilities:
    Service Level Objectives (SLOs), Service Level Indicators (SLIs), and Service Level Agreements (SLAs): Define SLOs, SLIs, and SLAs to ensure system reliability. Monitor...


  • Melbourne, Victoria, Australia Bae Systems Full time

    Systems Reliability Engineer
    - All Systems GO Be part of the next generation of Defence technology
    - Bring your leadership and expertise to exciting Defence projects
    - Enjoy flexibility, engaging work and a culture that embraces diversity and open-mindedness
    At BAE Systems we are all systems go as we continue to drive innovation and seek passionate and talented...