Site Reliability Engineer

1 week ago


Sydney, New South Wales, Australia TikTok Full time

Responsibilities
About the Team:

The Site Reliability Engineering (SRE) group within the Applied Machine Learning (AML) team combines systems engineering and the art of machine learning to develop and run a massively distributed AI/ML recommendation system serving the United States and the rest of the world.

On the SRE team, you'll have the opportunity to sharpen your expertise in coding, performance analysis, and large-scale systems operation. Join us and you'll have the chance to shape the future of AML systems and make a real, tangible impact on TikTok users.

Responsibilities:

  • Design, build, and maintain highly available, scalable, and fault-tolerant systems.

  • Monitor and analyze system performance, identifying and resolving issues before they cause user impact.

  • Develop and maintain automated monitoring, alerting, and incident response systems.

  • Collaborate closely with software engineering teams to ensure that applications are designed with reliability, scalability, and performance in mind.

  • Implement and maintain security best practices and ensure compliance with regulatory requirements.

  • Participate in on-call rotations and respond to issues and incidents within and outside of normal business hours.

  • Conduct root cause analysis of incidents, hold post-mortem reviews with stakeholders, and implement preventative measures to minimize the risk of similar incidents occurring in the future.

Qualifications
Minimum Qualifications

  • Expertise in analyzing and troubleshooting Linux-based distributed systems.

  • Bachelor's/Master's degree in Computer Science, Computer Engineering, or equivalent years of experience in an SRE or software engineering role.

  • Experience programming with at least one commonly used language (C, C++, Python, Go).

  • Strong understanding of data structures and algorithms.

  • Competent knowledge of relational database systems.

Preferred Qualifications

  • Ability to design and maintain large-scale systems.

  • Strong understanding of code optimization and routine task automation.

  • Proficiency in at least one machine learning framework (TensorFlow, PyTorch, MXNet, or PaddlePaddle).

About USDS
TikTok is the leading destination for short-form mobile video. Our mission is to inspire creativity and bring joy. U.S. Data Security ("USDS") is a subsidiary of TikTok in the U.S. This new, security-first division was created to bring heightened focus and governance to our data protection policies and content assurance protocols to keep U.S. users safe. Our focus is on providing oversight and protection of the TikTok platform and U.S. user data, so millions of Americans can continue turning to TikTok to learn something new, earn a living, express themselves creatively, or be entertained. The teams within USDS that deliver on this commitment daily span across Trust & Safety, Security & Privacy, Engineering, User & Product Ops, Corporate Functions and more.

On-site presence across teams allows the company to operate with greater speed, alignment, and agility, especially in areas like real-time decision-making, team development, and integrated execution. As such, the company is shifting from a hybrid work model to a fully in-person schedule up to 5 days a week.

Why Join Us
Inspiring creativity is at the core of TikTok's mission. Our innovative product is built to help people authentically express themselves, discover and connect, and our global, diverse teams make that possible. Together, we create value for our communities, inspire creativity and bring joy: a mission we work towards every day.

We strive to do great things with great people. We lead with curiosity, humility, and a desire to make an impact in a rapidly growing tech company. Every challenge is an opportunity to learn and innovate as one team. We're resilient and embrace challenges as they come. By constantly iterating and fostering an "Always Day 1" mindset, we achieve meaningful breakthroughs for ourselves, our company, and our users. When we create and grow together, the possibilities are limitless. Join us.

Diversity & Inclusion

TikTok is committed to creating an inclusive space where employees are valued for their skills, experiences, and unique perspectives. Our platform connects people from across the globe and so does our workplace. At TikTok, our mission is to inspire creativity and bring joy. To achieve that goal, we are committed to celebrating our diverse voices and to creating an environment that reflects the many communities we reach. We are passionate about this and hope you are too.

Acknowledgment of Country
In the spirit of reconciliation, TikTok acknowledges the Traditional Custodians of Country throughout Australia and their connections to land, sea and community. We pay our respect to Elders past and present and extend that respect to all Aboriginal and Torres Strait Islander peoples today.


