Senior Site Reliability Engineer

7 hours ago


Australia Aerospike Full time $120,000 - $180,000 per year

Aerospike is the real-time database for mission-critical use cases and workloads, including machine learning, generative, and agentic AI. Aerospike powers millions of transactions per second with millisecond latency, at a fraction of the total cost of ownership compared to other databases.

Global leaders, including Adobe, Airtel, Barclays, Criteo, DBS Bank, Experian, Grab, HDFC Bank, PayPal, Sony Interactive Entertainment, The Trade Desk, and Wayfair, rely on Aerospike for customer 360, fraud detection, real-time bidding, profile stores, recommendation engines, and other use cases. 

At Aerospike, we dream big and deliver even bigger. Our mission is to unleash the power of the world's real-time data with a database built for infinite scale, speed, and sustainability.

If you're ready to shape the future of data, join us.

Senior Site Reliability Engineer

As a Senior Site Reliability Engineer (SRE) for Aerospike, you will be instrumental in designing, building, and optimizing a scalable, highly resilient cloud platform. You will focus on improving reliability, performance, and automation to ensure seamless delivery and operation of our cloud platform services. Your responsibilities will include developing robust infrastructure, implementing intelligent monitoring systems, and driving continuous improvement initiatives that enhance system efficiency, scalability, and overall platform stability.

Key Responsibilities

  • Designing, deploying, and optimizing large-scale Aerospike cloud platform infrastructure and services across multiple environments
  • Leading the development and enhancement of automation and infrastructure-as-code solutions to improve operational efficiency
  • Building and maintaining monitoring, alerting, and observability implementations to proactively detect and resolve system issues
  • Leading incident response activities, conducting post-mortems, and driving continuous improvement initiatives
  • Designing and enforcing security best practices for cloud infrastructure and access control
  • Collaborating with development teams to ensure reliable service delivery and alignment with SRE best practices
  • Participating in on-call rotation, responding to critical incidents and minimizing downtime through proactive mitigation strategies
  • Establishing documentation standards, runbooks, and system configurations for team knowledge sharing
  • Leading capacity planning and performance optimization efforts
  • Mentoring junior engineers and sharing knowledge to build team capabilities

Required Experience

  • 6+ years of experience in Site Reliability Engineering (SRE), DevOps, or related fields, with a focus on building scalable, resilient, and automated cloud-based systems
  • Hands-on experience designing, deploying, and optimizing production-grade, business-critical systems in cloud environments
  • Expertise with at least one major public cloud provider (AWS, Google Cloud, or Azure), including cloud-native services and architectures
  • Strong proficiency in infrastructure-as-code (IaC) tools such as Terraform to enable automated and reproducible infrastructure
  • Experience in CI/CD pipeline design and implementation, enabling seamless, automated software delivery and infrastructure updates
  • Deep understanding of Linux/Unix systems, networking fundamentals, and distributed system architectures
  • Proficiency in scripting and software development using Python, Bash, or Go to build automation, tooling, and infrastructure enhancements
  • Experience with containerization and orchestration technologies such as Docker and Kubernetes for efficient service deployment and scaling
  • Hands-on experience with monitoring, logging, and observability tools (e.g., Prometheus, Grafana, Datadog, Elasticsearch, Kibana) to drive data-driven system improvements
  • Strong problem-solving skills with an engineering-first mindset for improving system reliability, scalability, and performance
  • Experience implementing security best practices for cloud infrastructure, access control, and data protection
  • Excellent English communication skills (verbal and written) to collaborate effectively across teams and document key processes

Preferred Skills and Qualifications

  • Hands-on experience managing and optimizing database deployments and services in production environments, ensuring high availability and performance
  • Familiarity with Aerospike or other distributed NoSQL databases
  • Advanced understanding of security practices and implementation in cloud environments
  • Relevant industry certifications, such as AWS Certified DevOps Engineer, AWS Certified Solutions Architect, Google Professional Cloud DevOps Engineer, or equivalent
  • Kubernetes certifications such as Certified Kubernetes Administrator (CKA), Certified Kubernetes Application Developer (CKAD), or Certified Kubernetes Security Specialist (CKS)
  • Proficiency with configuration management tools (Ansible, Terraform, or similar) in complex environments
  • Experience leading collaborative development practices and advanced version control workflows

Aerospike is an Equal Opportunity Employer. We are committed to providing an environment free from discrimination on the basis of race, religion, color, sex, gender identity, sexual orientation, age, non-disqualifying physical or mental disability, national origin, veteran status, or any other basis covered by appropriate law.



  • Australia Aussie Broadband Full time $80,000 - $120,000 per year

    Aussie Broadband's (ABB) purpose is to change the game. As our Site Reliability Engineer, you'll support this by ensuring the availability, reliability and performance of our systems and infrastructure. At Aussie Broadband we believe difference is something to celebrate. Being advocates for Inclusion and Diversity means our team can bring their whole...


  • Greenway, ACT, Australia ActewAGL Distribution Full time $175,000 - $200,000 per year

    Senior Asset Reliability Engineer. Ongoing, Full-Time opportunity. Salary starting at $174,862 plus 16% superannuation. Location: Greenway, Canberra, ACT (Free parking onsite). About the role: As the Senior Asset Reliability Engineer, you will lead the change in enhancing the long-term reliability, safety, and performance of Evoenergy's electricity network. This role...


  • Edinburgh, South Australia Swordfish Computing Full time $80,000 - $120,000 per year

    At Swordfish, we specialise in delivering transformative innovation to our Defence clients through integrated teams that combine the engineering disciplines with deep defence domain knowledge, specialist skills in the applied sciences, mathematics and digital technologies. We are passionate about applying quality engineering and embracing emerging...


  • Remote, Barcelona - Australia Flight Centre Careers Full time $120,000 - $180,000 per year

    Kia Ora, Hola, สวัสดี, Guten Tag. WhereTo is a business travel startup from San Francisco that evolved into an agile development and design studio within the Flight Centre family. We build travel solutions used by some of the largest companies on the planet - we have just one goal: making business travel better for everybody. WhereTo provides an...


  • Australia Oracle Full time $120,000 - $180,000 per year

    Description: We are a world class team of high caliber security tool services Site Reliability Engineers. We are an inclusive and diverse team with a full spectrum of experience distributed globally. We have the resources of a large enterprise and the energy of a start-up, working on a critical greenfield software assurance project collaboratively with our...


  • Remote, Australia GitLab Full time $120,000 - $180,000 per year

    GitLab is an open-core software company that develops the most comprehensive AI-powered DevSecOps Platform, used by more than 100,000 organizations. Our mission is to enable everyone to contribute to and co-create the software that powers our world. When everyone can contribute, consumers become contributors, significantly accelerating human progress. Our...


  • Australia ShiftCare Full time $120,000 - $180,000 per year

    Description. About ShiftCare: ShiftCare's innovative software is a market leader that helps disability support providers, in-home aged carers and allied health professionals worldwide streamline the way they work by creating efficiencies in rostering, client management and billing, enabling businesses to grow. About The Team You'll Be Joining: The...


  • Australia - Remote Replicated Full time $170,000 - $210,000 per year

    Replicated is a Commercial Software Distribution Platform. Replicated helps software vendors distribute their applications into self-hosted environments like VPC, on-prem, air gap, and more. With a suite of tools ranging from installation, to testing, to licensing and support, Replicated is the best way to operationalize and scale the distribution of...


  • Toowoomba, Queensland, Australia New Hope Group Full time $80,000 - $120,000 per year

    Permanent full-time position | Monday to Friday Roster. DIDO residential role 45 mins from Toowoomba. Drive equipment reliability and performance. The New Acland Coal Mine (NAC), a thin seam open cut mining operation, is owned and managed by New Hope Group. Based in Acland, a short drive north-west of Toowoomba, NAC has played a key role in the Darling Downs region...


  • Remote, Australia; Remote, Canada; Remote, New Zealand GitLab Full time $80,000 - $120,000 per year

    GitLab is an open-core software company that develops the most comprehensive AI-powered DevSecOps Platform, used by more than 100,000 organizations. Our mission is to enable everyone to contribute to and co-create the software that powers our world. When everyone can contribute, consumers become contributors, significantly accelerating human progress. Our...