Senior Site Reliability Engineer
10 hours ago
Join the Food Tech Revolution at EatClub
About Us
EatClub is a fast-growing tech company with big global ambitions, co-founded by legendary chef Marco Pierre White and industry leaders. We're on a mission to revolutionise the hospitality industry, helping restaurants boost profitability through smart, dynamic pricing.
We power thousands of venues across Australia and have recently launched in the UK as we expand our international footprint. Our platform connects over 2 million customers with thousands of top restaurants, offering real-time deals that can save diners up to 50% off their bill.
Now is an exciting time to join our team. Are you ready to be part of something different?
Read About Us In The AFR
Read More About Us In Broadsheet
Why You'll Love Working Here
- Everyone contributes: We encourage every team member to bring their ideas to the table and help shape what we build next.
- Create big impact: Your code will directly affect how thousands of venues and millions of users connect over food and drink.
- Startup speed, real ownership: You'll work in a fast-paced, agile environment where your ideas matter and shipping code is the focus.
- Remote + flexible: We care about outcomes, not clock-watching. Work from wherever suits you best.
- Inclusive + diverse: We embrace differences and believe the best teams are built on diversity in background, thinking, and experience.
- Surround yourself with exceptional talent: We seek out top talent. Our people are passionate about their craft, and love inspiring those around them to be their best.
- Supportive, fun-loving team: We work hard together, cheer each other on, and celebrate our wins as a team.
The Role
We're looking for a proactive, product-focused Senior Site Reliability Engineer who's excited to take on multiple responsibilities and to build and support the mission-critical systems that bring joy to users and value to restaurants.
You'll join a team where you can have real influence, wide scope, and room to grow in a fun and rewarding environment. Responsibilities include collaborating with cross-functional teams, designing and building advanced web frontend features, improving application performance, and continuously improving application and code quality.
Key Responsibilities
- Reliability Engineering – Support capacity planning, define and manage service-level objectives (SLOs) and error budgets, and lead incident response efforts.
- Platform Support – Take responsibility for the availability and performance of the EatClub AWS platform. Build automation to prevent recurrence of issues and ensure all non-exceptional service conditions are handled automatically.
- Cross-functional Collaboration – Partner with software engineers, product managers, and other stakeholders to embed reliability, scalability, and resilience into the software delivery lifecycle.
- Incident Management – Coordinate and communicate during critical production events, supporting response efforts and ensuring rapid resolution.
- CI/CD and Testing Support – Work closely with engineering teams to support and improve CI/CD pipelines and automated testing frameworks.
- Disaster Readiness – Build systems that are designed and tested for fault tolerance, redundancy, and recovery.
- Observability – Build monitoring, logging, and alerting systems based on best practices to improve visibility into system health and performance.
- Automation Leadership – Promote and implement automation across operational processes to reduce toil and increase efficiency.
- Security Posture – Contribute to improving the security of infrastructure and processes through proactive hardening and secure practices.
What You'll Bring
- Proactive, improvement-focused mindset with a passion for building reliable systems
- A can-do mindset, ready to roll up your sleeves
- Proven experience in DevOps, Site Reliability Engineering, platform operations, or a similar discipline.
- AWS Cloud infrastructure expertise – Experience building infrastructure in AWS, including services such as EC2, S3, IAM, and CloudWatch, ideally across multiple geographies.
- Infrastructure as Code (IaC) – Strong proficiency with tools like Terraform or AWS CDK.
- CI/CD pipelines – Experience building and maintaining robust deployment pipelines using GitHub Actions.
- Observability – Experience designing and managing logging, monitoring, and alerting stacks (e.g., Prometheus, Grafana, Datadog, ELK, OpenTelemetry).
- Scripting & Automation – Proficient in Python or Bash for automating operational tasks and building internal tools.
- Security & compliance awareness – Familiarity with secure infrastructure practices, secrets management, vulnerability scanning, and audits. Understanding of compliance standards (SOC2, ISO 27001, etc.).
- Skilled in IP networking concepts (DNS, load balancing, etc.).
- Linux systems administration – Experience with Linux, system internals, and performance tuning.
Bonus Points If You...
- Have experience in Database Administration
- Are experienced in Containerisation & Orchestration – Expertise in Docker and Kubernetes (EKS/GKE/AKS), including deployment, monitoring, and troubleshooting.
- Have experience with Chaos engineering / fault injection – Experience building resilient systems and running game days or incident simulations.
Qualifications
- Degree in Computer Science or a related discipline
- A minimum of 5 years of post-degree commercial experience in DevOps and AWS, in high-scale, high-availability environments
- Full working rights in Australia
Hungry Yet?
If you're looking for a role where you can do your best work, make a visible impact, grow your career, and work with great humans, then we'd love to hear from you. Apply now, and let's build something extraordinary - one dish, one booking, one feature at a time.
-
Senior Site Reliability Engineer
6 days ago
Sydney, New South Wales, Australia tekFinder Full time $220,000 per year
Senior Site Reliability Engineer - Software Engineering Background - Java, Golang - circa $220k package + bonus. We have multiple roles within a global platform team of 50, working to build tools, influence design decisions, and drive best practices for reliability. These roles differ slightly from a typical SRE, where they serve as first-line responders or...
-
Senior Site Reliability Engineer
11 hours ago
Sydney, New South Wales, Australia Aurec Full time $150,000 - $170,000 per year
Senior Site Reliability Engineer - Operations. Location: Onsite / Hybrid - Macquarie Park. Salary: $150-170K + Super + Bonus. A leading organisation in the digital communications and services sector is seeking an Operations Lead to join a high-performing Rapid Response Team. This team works at the front line of major incidents and high-pressure production...
-
Site Reliability Engineer
1 week ago
Sydney, New South Wales, Australia AI Hustler Full time $120,000 - $180,000 per year
Site Reliability Engineer (SRE) | Daily Rate Contract | Visa Sponsorship Available. Location: Sydney (Hybrid or Remote). Type: Contract (Daily Rate). Experience Level: Mid to Senior (5+ years). Stack: Kubernetes, Terraform, CI/CD, Observability, Cloud (AWS/GCP/Azure). Our client is looking for an experienced Site Reliability Engineer to join a high-scale platform team...
-
Senior Site Reliability Engineer
3 days ago
Sydney, New South Wales, Australia IAG Full time $120,000 - $180,000 per year
Create impact as a Senior Site Reliability Support Engineer with Splunk & New Relic expertise. NRMA Insurance has been helping Australians with their general insurance and actively supporting communities for 100 years. Part of Insurance Australia Group (known as IAG), we're proud to be one of Australia's most iconic brands. Your Role: We are currently looking for...
-
Staff Site Reliability Engineer
2 weeks ago
Sydney, New South Wales, Australia Commonwealth Bank Full time $120,000 - $180,000 per year
You are passionate about SRE and systems engineering. We are undergoing one of Australia's largest digital transformations. Together we can reimagine banking for millions of customers. Do work that matters: We're accelerating our digital strategy with an ambition to provide customers with one of the best digital experiences of any company globally. Site Reliability...
-
Site Reliability Engineer
2 weeks ago
Sydney, New South Wales, Australia Luminance Full time $120,000 - $180,000 per year
The Role: Luminance's Site Reliability team combines strong problem solving, infrastructure tooling and wider DevOps practices to provide a service of Luminance's unique software applications. The team plays a crucial role in incident response and issue resolution, swiftly addressing and resolving service interruptions to maintain the highest level of customer...
-
Site Reliability Engineer
2 weeks ago
Sydney, New South Wales, Australia Luminance Full time $80,000 - $140,000 per year
The Role: Luminance's Site Reliability team combines strong problem solving, infrastructure tooling and wider DevOps practices to provide a service of Luminance's unique software applications. The team plays a crucial role in incident response and issue resolution, swiftly addressing and resolving service interruptions to maintain the highest level of...
-
Site Reliability Engineer
2 weeks ago
Sydney, New South Wales, Australia Luminance Full time $120,000 - $180,000 per year
The Role: Luminance's Site Reliability team combines strong problem solving, infrastructure tooling and wider DevOps practices to provide a service of Luminance's unique software applications. The team plays a crucial role in incident response and issue resolution, swiftly addressing and resolving service interruptions to maintain the highest level of...
-
Site Reliability Engineer
13 hours ago
Sydney, New South Wales, Australia Cover Genius Full time $120,000 - $180,000 per year
About The Company: Cover Genius is a Series E Insurtech that protects the global customers of the world's largest digital companies including Booking Holdings, owner of Priceline, Kayak and , Intuit, Hopper, Skyscanner, Ryanair, Turkish Airlines, Descartes ShipRush, Zip and SeatGeek. We're also available at Amazon, Flipkart, eBay, Wayfair and SE Asia's largest...