Senior Customer Reliability Engineer
1 day ago
We're looking for curious and conscientious builders, innovators, and collaborators. At Replicated, we believe true impact comes from caring deeply about your work, thinking in terms of long-term relationships, and approaching every challenge with curiosity and a willingness to revisit first principles. If you thrive on continuous learning, seek purpose in solving real (and sometimes complex) problems, and care about building sustainable, thoughtful solutions, you'll find your home here.
Replicated is a Commercial Software Distribution Platform that tackles some of the hardest challenges in modern enterprise software. We help vendors deliver Kubernetes applications into highly controlled customer environments — from VPCs and on-prem data centers to fully air-gapped networks. Our platform includes Kubernetes-native installers, automated release pipelines, container registries and proxies for restricted networks, license management, telemetry, and integrated support tooling. Engineers at Replicated work on distributed systems, networking, and developer experience — ensuring that even the most complex applications can be deployed and operated reliably in security-sensitive environments.
We work with fast-growing enterprise software vendors such as KNIME, Puppet, Smartbear, BigID, and Swimlane, helping them bring their products into some of the world's most security-conscious enterprises.
We are fully remote and plan to stay that way. We do have a small office in Austin, TX where our local teammates enjoy spending time working together. We're open to any state in the US.
In addition, for some roles, we're open to candidates in Canada, the UK, Australia, and New Zealand (we will specify on postings for these).
The Customer Reliability Engineering (CRE) team is a group of dedicated global engineers focused on helping our vendors successfully deliver and support Kubernetes applications in customer-managed environments. As a CRE, you'll be on the front lines, working directly with customers to solve complex technical challenges related to application deployment, management, and troubleshooting. You'll gain deep expertise in Kubernetes, the Replicated product suite, and the intricacies of customer-managed deployments, including scenarios where cluster installation is required. This role prioritizes exceptional support and customer success, collaborating closely with Sales and Product Engineers.
This role is perfect for you if you are passionate about problem-solving, enjoy helping people, and thrive on diving deep into technical challenges. You'll leverage your operational knowledge to build best practices and contribute to tooling that empowers both our internal teams and our vendors. This is an excellent opportunity to extend a strong foundation in Kubernetes, Linux, and the broader cloud-native ecosystem, while learning from experienced engineers on a successful, growing team.
What you'll be doing:
Provide expert support to customers, resolving issues related to Kubernetes, Linux, and Replicated products. This includes troubleshooting failures, identifying root causes, and implementing solutions. Every day will present new and unique challenges.
Enable Customer Success: Work proactively with customers to ensure they are successfully deploying, managing, and scaling their applications using Replicated. This includes providing guidance, best practices, training, and assisting with onboarding new applications.
Collaborate with Engineering: Proactively work closely with CREs and product engineers to share customer feedback, identify product improvements, and contribute to the overall Replicated product roadmap. While this role doesn't require implementing code changes on day one, you'll be a key contributor in identifying areas for improvement, and the team regularly makes code contributions to enhance our products and tools. As you grow within the team, you'll have opportunities to develop your coding skills and contribute directly to these improvements.
Continuous Learning: Invest in your personal and professional growth. Replicated is committed to supporting your development through courses, certifications, and other learning opportunities.
Preferably 3 or more years of professional experience in the following areas:
Experience with Linux system administration. You have the knowledge and ability to troubleshoot complex system and network issues, at an advanced level, as well as clearly explain the findings to customers.
Experience with Kubernetes and Helm. You have the knowledge and ability to diagnose complex issues with Kubernetes on bare metal, develop and troubleshoot advanced Helm charts, and guide customers in designing scalable deployment strategies.
Exceptional technical and non-technical communication and interpersonal skills. You must be able to clearly explain complex technical concepts to both technical and non-technical audiences in English.
Strong problem-solving skills, the ability to think critically, and act quickly under pressure.
A customer-centric mindset and a genuine desire to help others succeed.
Experience working remotely with teams across various time zones.
Experience with CNCF tools
Familiarity with Go and the ability to debug Go programs
Customer facing experience
Note: This role does include some on-call support coverage. While we do our best to optimize for timezones and working hours, our global team is expanding to ensure we are available for our customers when they need us.
Preferred Remote Location: Australia or New Zealand
Please note: Applicants in New Zealand and Australia must have the legal right to work in their country.
Your Growth Journey at Replicated
In your first 30 days:
Immerse Yourself: Dedicate yourself to learning about Replicated - the company, the global CRE team, our products, and our customers (vendors).
Hands-on Training: Complete comprehensive hands-on training with the Replicated platform, working through a structured onboarding checklist.
Team Connections: Meet with team members across Replicated, including senior CREs, product engineers, and other departments, to build relationships and understand different perspectives.
Onboarding Improvement: As you go through the onboarding process, actively identify areas for improvement and suggest changes to make it even better for future CREs.
Active Support Participation: Begin working on real support cases from the queue, with direct oversight and guidance from senior CREs. This hands-on approach will accelerate your learning and understanding of customer issues and troubleshooting techniques.
In your first 60 days:
Deeper Support Immersion: Continue working on support cases, increasing the complexity and variety of issues you handle. Focus on understanding the "why" behind customer problems and the solutions implemented.
Process Improvement: Proactively suggest improvements to the support process, both technical (e.g., tooling, diagnostics) and procedural (e.g., communication workflows, escalation paths).
Product Knowledge Expansion: Deepen your understanding of how Replicated's products are developed, how different services interact, and how they are used in customer-managed environments.
Vendor Interaction: Begin to participate in some supervised customer interactions, gradually taking on more responsibility under the guidance of senior CREs.
Documentation Review: Review existing support documentation and training materials, identifying areas for updates or improvements.
In your first 90 days:
Independent Support: Take on full responsibility for handling support issues from the queue, working independently to diagnose, resolve, and prevent recurrence.
On-Call Rotation: Join the on-call rotation, providing 24/7 support coverage (primarily weekends due to the global team) for specific Replicated products. Remember, you're never alone - the team is always available to support you.
Customer Success Engagement: Begin actively participating in proactive customer success activities, such as assisting with onboarding new applications or providing best-practice guidance.
Feedback Loop: Become a key contributor to the feedback loop between customers and engineering, sharing insights and identifying areas for product improvement.
Continued Learning: Continue to invest in your personal and professional growth, leveraging Replicated's resources (like the curiosity budget) to expand your skills in Kubernetes, Linux, and other relevant technologies. Begin exploring opportunities to develop your Go coding skills.
At Replicated, we value our teammates as individuals who are stronger together. We offer a robust pay and benefits package that rewards employees for their contributions to our success, supports their well-being, and helps all of us create a great remote work environment.
For team members outside of the US, our salary ranges are at localized rates for the countries we support. This is dependent on several factors, including level, qualifications, and experience. We also offer stock options, as well as a unique home office allowance & a professional development budget. An overview is on our careers page here:
In Australia (local currency) the salary range for this role is as follows:
Software Engineer II: $170,000 - $210,000
Senior Software Engineer I: $187,000 - $235,000
In New Zealand (local currency) the salary range for this role is as follows:
Software Engineer II: $140,250 - $175,000
Senior Software Engineer I: $148,750 - $187,000
We invest in our team and love candidates who are eager to learn and grow. We have a fantastic team of highly collaborative individuals who enjoy learning, growing, and mentoring others.
OUR CORE VALUES
Care Deeply: Care deeply about the work that you do. Because of that you are constantly learning and willing to go out on a limb, challenge assumptions, go back to first principles, etc.
Longterm: Treat every interaction as part of a 30-year relationship; you'll see everyone down the road again as customers, partners, coworkers, etc.
Curious: We're always learning and we approach everyone and every problem with curiosity. When needed we challenge assumptions, and go back to first principles.
BENEFITS
We offer strong benefits to help you stay healthy and productive. For the US, our benefits are listed below:
Health/Dental/Vision
Life/AD&D
LTD/STD
FSA
401K
Stock options
Partner perk programs
Generous time off; we expect you to take a minimum of 3 weeks per year
Laptop and the accessories you need to get set up
Generous home office set up allowance or co-working space allowance - up to $10,000 per year
Curiosity Budget to help you keep learning and growing
Replicated is an Equal Opportunity Employer. We celebrate diversity and are committed to creating an inclusive environment for all employees. We encourage applicants of all backgrounds and we work to make sure that all team members have an equal opportunity to succeed.
Please note at this time we are unable to provide sponsorship to individuals in the United States.
We do not accept unsolicited assistance from any headhunters, recruitment firms or any other third party for any of our job openings. Any unsolicited resumes sent from anyone other than the candidate, in any format, to any person at Replicated, will be considered Replicated property. Replicated will NOT pay a fee for any placement resulting from the receipt of an unsolicited resume.