Site Reliability Engineer

5 months ago


Sydney, Australia Palantir Technologies Full time
A World-Changing Company

Palantir builds the world’s leading software for data-driven decisions and operations. By bringing the right data to the people who need it, our platforms empower our partners to develop lifesaving drugs, forecast supply chain disruptions, locate missing children, and more.

The Role

We’re looking for a Site Reliability Engineer who can help our Database Operations team scale, maintain, operate, and modernize the databases behind Palantir’s products. Site Reliability Engineers combine engineering experience and an innate drive to improve existing systems and processes with the creativity to develop novel solutions to evolving challenges. Our team strives to automate processes wherever possible, using whichever tools are best for the job.

We strongly believe in engineering teams being responsible for operating their services in production. In this role, you’ll work closely with engineers to design sensible, scalable systems and to diagnose, resolve, and prevent production issues.

Site Reliability Engineering exposes you to a broad range of products and business use cases, and builds operational and systems skills that are useful across the industry and in most engineering roles. Most of the work has implications for the entire fleet of Palantir environments, and therefore there is ample opportunity to improve the performance, stability, and costs at a very large scale.

Technologies We Use

Database Operations provides the majority of the support for Cassandra, Elasticsearch, and Kafka, along with their orchestrating services to ensure they operate as intended within Kubernetes, across a variety of clouds and on-premise, with varying degrees of access.

Core Responsibilities

  • Build expertise on pre-existing systems, including their edge cases, failure modes, and life cycles, and on how to improve the long-term reliability and scalability of Palantir’s services.
  • Modify core services and infrastructure to improve stability and performance.
  • Participate in operations, including on-call rotations during business hours and occasional weekends.
  • Troubleshoot and debug availability and latency of Palantir’s databases and their clients.
  • Modernize the fleet by migrating infrastructure, upgrading major versions, and right-sizing to optimize cost and performance.

What We Value

  • Confidence in troubleshooting complex issues independently using observability tools and stack traces.
  • Ability to identify and remove toil.
  • Comfortable with and curious about large-scale production systems and technologies, for example load balancing, monitoring, distributed systems, or configuration management.
  • Ability to work with a high level of autonomy and responsibility in a rapidly changing environment with dynamic objectives and iteration with users.
  • Demonstrated ability to develop improvements to services.

What We Require

  • Engineering background in Computer Science, Mathematics, Software Engineering, Physics, or a similar field.
  • Familiarity with storage and data processing systems, cloud infrastructure, and other technical tools.
  • Familiarity with monitoring systems, using tools like Prometheus, and with writing health checks.
  • Strong written and verbal communication skills and ability to iterate quickly with teammates, incorporating feedback and holding a high bar for quality.

Life at Palantir

We want every Palantirian to achieve their best outcomes. That’s why we celebrate individuals’ strengths, skills, and interests, from your first interview to your long-term growth, rather than rely on traditional career ladders. Paying attention to the needs of our community enables us to optimize our opportunities to grow and helps ensure many pathways to success at Palantir. Promoting health and well-being across all areas of Palantirians’ lives is just one of the ways we’re investing in our community. Learn more at and note that our offerings may vary by region.

In keeping with Palantir’s values and culture, we believe employees are “better together” and in-person work affords the opportunity for more creative outcomes. Therefore, we encourage employees to work from our offices to foster connectivity and innovation. Many teams do offer hybrid options (WFH a day or two a week), allowing our employees to strike the right trade-off for their personal productivity. Based on business need, there are a few roles that allow for “Remote” work on an exceptional basis. If you are applying for one of these roles, you must work from the city and/or country in which you are employed. If the posting is specified as Onsite, you are required to work from an office.

Palantir is committed to promoting a culture of diversity, equity, and inclusion. We believe that all Palantirians share the responsibility of upholding our commitment to these values and encourage candidates from a wide range of backgrounds, perspectives, and lived experiences to join us in solving the world’s hardest problems.

Palantir is committed to making the job application process accessible to everyone. If you are living with a disability (visible or not visible) and need to request a reasonable accommodation for any part of the application or hiring process, please and let us know how we can help.

  • Sydney, Australia Microsoft Full time

    Overview Are you interested in working on one of Microsoft's most exciting products? Are you passionate about exceeding customer expectations and advancing Microsoft's cloud-first strategy? If so, the Azure Customer Experience (CXP) Customer Reliability Engineering (CRE) Team is the place for you! Azure CXP CRE is a top-level pillar of Azure Engineering that...


  • Sydney, Australia Microsoft Full time

    Overview Are you interested in working on one of Microsoft's most exciting products? Are you passionate about exceeding customer expectations and advancing Microsoft's cloud-first strategy? If so, the Azure Customer Experience (CXP) Customer Reliability Engineering (CRE) Team is the place for you! Azure CXP CRE is a top-level pillar of Azure...


  • Sydney, New South Wales, Australia VGW Full time

    Site Reliability Engineer Supervisor VGW is an interactive entertainment company that harnesses technology and creativity to deliver world-class, free-to-play games. We are seeking an experienced Site Reliability Engineer Supervisor to join our Engineering team in Sydney. This role will focus on ensuring the reliability of our systems as we bring new games to...


  • Sydney, New South Wales, Australia Immutable Full time

    About The Role We are seeking a highly skilled Senior Site Reliability Engineer to join our team at Immutable. As a key member of our SRE team, you will play a crucial role in shaping our infrastructure, observability, and tooling patterns. You will be responsible for developing and releasing infrastructure as code, creating and maintaining multiple Kubernetes...


  • Sydney, Australia Lanson Partners Full time

    Build a culture of reliability and develop robust reliability patterns. Software Engineering background. Sydney-based, 2 days working from home. As a Senior Site Reliability Engineer, you will be working closely with software engineering teams and stakeholders to ensure the health and performance of the infrastructure and software. You will play a key role in...


  • Sydney, Australia Talent International Full time

    A growing FinTech provider is seeking a Site Reliability Engineer to join their team on a permanent basis. Working in a small, close-knit team based in their office in North Sydney, you will be responsible for the support, maintenance and administration of their cloud platform (AWS) as well as application monitoring (Prometheus), ELK / Elastic Stack and...


  • Sydney, Australia Firesoft People Full time

    Senior Site Reliability Engineer Join a leading electronic trading firm in a pivotal role as a Site Reliability Engineer (SRE). At our firm, we are passionate about market-making and arbitrage opportunities on a global scale. Technology drives our success, fuelling our unified trading platform and enabling precise micro-decisions. With our agility and...


  • Sydney, New South Wales, Australia Firesoft People Full time

    Job Description Firesoft People is seeking a Senior Site Reliability Engineer with strong AWS skills to join our FinTech lending specialist team on a full-time basis. This position offers an exceptional salary package of up to $170,000 per year, plus an annual $2,000 allowance for training and certifications. About the Role We are looking for an experienced SRE...


  • Sydney, New South Wales, Australia Google Full time

    About Google At Google, we empower and support our employees to succeed by fostering a culture of diversity, equity, and inclusion. We believe that when everyone contributes, we can build better technology for everyone. We welcome Indigenous applicants and are committed to reconciliation through our technology, platforms, and people. Check out our...


  • Sydney, Australia Google Full time

    At Google, we have a vision of empowerment and equitable opportunity for all Aboriginal and Torres Strait Islander peoples and commit to building reconciliation through Google’s technology, platforms and people and we welcome Indigenous applicants. Please see our Reconciliation Action Plan for more information. At Google, we have a vision...


  • Sydney, New South Wales, Australia IT Operations & Services Full time

    About the Role We are seeking a highly skilled Site Reliability Engineering Manager to join our IT Operations & Services team at The Star Entertainment Group. Job Description The successful candidate will be responsible for leading the delivery of property-based IT initiatives, including infrastructure relocations and gaming floor moves. You will work closely...


  • Sydney, Australia VGW Full time

    Site Reliability Engineer Supervisor VGW is an interactive entertainment company, harnessing technology and creativity to deliver world-class, free-to-play games.  We have an exciting opportunity to join our Engineering team in Sydney and are currently looking for an Engineering Supervisor to join the team. You'll focus on ensuring the reliability of our...


  • Sydney, Australia Zip Co Full time

    Senior Site Reliability Engineer. Experience working in SRE/DevOps with Dynatrace and Kubernetes. Work on high-impact SRE projects where you’ll own and drive initiatives end to end. Hybrid, flexible working with two team connect days in the office per week. Write your story with Zip. Join Zip’s Technology function, responsible for building and...


  • Sydney, Australia Freelancer.com Full time

    Site Reliability Engineer Sydney, Australia Description About the Role: You will join a small team of versatile infrastructure engineers who are responsible for designing, building, and operating the mission-critical cloud platform powering , , and a number of other businesses within the enterprise. You will work with highly scalable FL/OSS services (Linux,...


  • Sydney, New South Wales, Australia Dimensional Fund Advisors Full time

    About Dimensional Fund Advisors We are a forward-thinking organization leveraging cutting-edge technology to engineer scalable and innovative solutions that improve our clients' financial lives. Job Description We are seeking a seasoned Senior Site Reliability Engineer to join our team, responsible for managing our global investment data technology systems and...


  • Sydney, New South Wales, Australia Citadel Securities Full time

    SRE Role Overview Citadel Securities is looking for a skilled Site Reliability Engineer to join our team. As a crucial member of our SRE team, you will be responsible for ensuring the reliability, availability, and performance of our financial systems. Your primary goal will be to design, implement, and maintain scalable, efficient, and highly available...


  • Sydney, New South Wales, Australia Firesoft People Full time

    Firesoft People is a leading electronic trading firm that relies heavily on technology to drive its success. As a Senior Site Reliability Engineer, you will be the face of technology, working directly with our trading desks to ensure seamless operations. The role requires a talented individual who can contribute across our entire technology platform,...


  • Sydney, Australia Atlassian Full time

    Working at Atlassian Atlassians can choose where they work – whether in an office, from home, or a combination of the two. That way, Atlassians have more control over supporting their family, personal goals, and other priorities. We can hire people in any country where we have a legal entity. Interviews and onboarding are conducted virtually, a part of...


  • Sydney, New South Wales, Australia Macquarie Full time

    At Macquarie, a global financial services group operating in 34 markets, we're seeking an experienced Senior Site Reliability Engineer to join our Engineering Enablers team. We're committed to providing the most reliable products and service in the financial industry. As part of our team, you'll contribute to the delivery of software reliability and help...


  • Sydney, Australia Firesoft People Full time

    Senior Site Reliability Engineer with strong AWS skills, sought to join a FinTech lending specialist on a full time basis. Up to $170K + Super! Key Points: Exceptional salary package on offer + annual $2K allowance for training/certifications of your choice. Brand new pipeline of project work off the back of a recent partnership with one of the big 4...