
Site Reliability Engineer
5 days ago
Our TechOps Engineers are the frontline team keeping our large fleet of cloud-hosted Apache Kafka, Cassandra, OpenSearch, Cadence, Valkey, ClickHouse and PostgreSQL clusters up and running. Every day you will diagnose and solve challenging and interesting technical problems, providing a service that some of the leading global names in tech rely on to deliver for millions of end users.
This role is for an Australia-based Senior TechOps engineer – primarily focusing on open source Cadence technology – and includes operating, maintaining, upgrading and continuously improving the Managed Service for Cadence (across AWS, Azure and GCP) to deliver a great customer experience.
Job Requirements
- Work with our Managed Service Product development team to establish Cadence operational requirements and support procedures.
- Respond to customer queries and incidents, diagnosing and solving complex technical issues by liaising with customers' engineers. This will include written communication via support tickets and occasional video-call support.
- The role also provides an opportunity to work extensively on Apache Cassandra, Kafka, OpenSearch and PostgreSQL, along with cloud providers such as AWS, GCP and Azure.
- Assist/mentor Level-1 team members to develop their technical capabilities on Cadence.
- Undertake complex cluster operations such as migrations, upgrades and maintenance.
- Provide expert operational support for our nodes running in the cloud (AWS, Azure and GCP) as well as on-premises, using technologies such as Linux (Debian), Docker, and languages including Java, Python and Bash.
- Investigate issues and apply standard maintenance procedures to optimize the performance and stability of production systems.
- Liaise with the Development and Product Management team through all stages of the development cycle to ensure proper release processes/procedures are being followed.
- Develop and continually improve our suite of internal automation tools, applications, and processes.
- Be a proactive, reliable and supportive member of the TechOps team, and participate in a rotating L2 shift roster.
We're looking for smart engineers with exceptional communication skills, a positive attitude, and a passion for IT and learning new things. We expect you to be, or quickly become, proficient in the range of technologies we use.
- You must have at least 3-5 years of working experience in addition to:
- Managing production environments, including performance benchmarking and tuning at the application and kernel level.
- Strong Linux skills with experience in cloud environments are a must, preferably AWS, GCP or Azure. You should be comfortable working from the command line. This is essential: there are no GUIs here.
- Familiarity with installing and maintaining VMs and applications at scale, including upgrades, migrations and life cycle management.
- Ability to debug applications using logs and metrics, and replicate issues in a local environment.
- Preferably experience with Ansible, Prometheus, Terraform, Grafana and Docker.
- Good fundamental computer science / software engineering skills and knowledge, particularly operating system internals, memory management, and networking.
- Ideally, programming skills in languages such as Go, Python, Java, Bash scripting and SQL, and source code control using Git.
- Exceptional ability to communicate clearly and professionally in written and verbal English (essential).
- Follow required processes and procedures.
- Work as part of a team and use your initiative to get things done.
- Passion for all things IT, and especially open source.
- Any customer service experience is favorable.
-
Site Reliability Engineers
4 weeks ago
Canberra, ACT, Australia Xero Full time
Our Purpose: At Xero, we're here to help you supercharge your business. We do this by automating routine tasks, surfacing actionable insights and connecting businesses with the right data, advisors and apps. When that happens, we're not only making life better for small business, we'll be building a stronger economy that can change the world. About the team: In...
-
Engineer II- Site Reliability
1 week ago
Canberra, ACT, Australia CrowdStrike Full time
Engineer II- Site Reliability (Remote, AUS) role at CrowdStrike, Canberra, Australian Capital Territory, Australia...
-
Site Reliability
1 week ago
Canberra, ACT, Australia Canonical Full time
Site Reliability / Gitops Engineer role at Canonical, Canberra, Australian Capital Territory, Australia ...
-
Lead Site Reliability Engineer
4 weeks ago
Canberra, ACT, Australia Xero Full time
Our Purpose: At Xero, we're here to help you supercharge your business. We do this by automating routine tasks, surfacing actionable insights and connecting businesses with the right data, advisors and apps. When that happens, we're not only making life better for small business, we'll be building a stronger economy that can change the world. About the...
-
Site Reliability Engineer
1 week ago
Canberra, ACT, Australia Canonical Full time
Canonical is a leading provider of open source software and operating systems to the global enterprise and technology markets. Our platform, Ubuntu, is widely used in breakthrough enterprise initiatives such as public cloud,...
-
Site Reliability Engineering Lead
1 day ago
Canberra, ACT, Australia beBeeSoftware Full time $147,093 - $164,455
About the Position: We are seeking a senior engineer to lead our site reliability engineering team in implementing DevOps best practices and driving internal projects. The ideal candidate will have experience with software delivery using infrastructure as code, managing DevOps teams, and understanding complex distributed...