Data Ingestion SRE, Data Platform

6 days ago

Sydney, New South Wales, Australia TikTok Full time

Site Reliability Engineering (SRE) combines software and systems engineering to build and run large-scale, massively distributed services and infrastructures.
As a site reliability engineer in the data platform area, you will have the opportunity to manage the services and infrastructures in one of the largest data platforms in the world, which directly supports the TikTok app.
You'll need to ensure the data, services and infrastructures are reliable, fault-tolerant, efficiently scalable and cost-effective.
To enhance collaboration and cross-functional partnerships, among other things, our organization currently follows a hybrid work schedule that requires employees to work in the office 3 days a week, or as directed by their manager/department.
We regularly review our hybrid work model, and the specific requirements may change at any time.
Responsibilities:
- End-to-End Service Lifecycle Management: Participate in and continuously improve the full lifecycle of services, from initial design and development to deployment, ongoing operation, and iterative optimization.
- Ensure Reliability and Scalability: Maintain highly reliable, fault-tolerant, and scalable systems that are both cost-effective and efficient, ensuring data, services, and infrastructure meet business needs.
- Performance Troubleshooting: Diagnose and resolve performance issues, including slow queries, resource contention, and bottlenecks across distributed storage engines and services.
- Cluster Scaling and Data Growth: Plan and implement strategies for scaling clusters effectively to accommodate increasing data volume while optimizing performance and cost-efficiency.
- Documentation and Incident Response: Develop and maintain clear runbooks, Standard Operating Procedures (SOPs), and lead sustainable, blameless incident response practices with post-incident analysis to drive continuous improvement.
- Big Data System Design: Architect and implement robust, scalable, and extensible big data systems that support the core business and products, ensuring seamless data flow and system integration.
- On-Call Rotation: Participate in on-call rotations for production incidents, ensuring critical issues are addressed swiftly, with availability to troubleshoot and resolve problems outside of regular business hours as needed.
- Incident Ownership and Analysis: Take ownership of incidents during on-call hours, coordinate escalations as necessary, and conduct thorough post-incident analyses to identify root causes and implement preventive measures.
Minimum Qualifications:
- Bachelor's degree in Computer Science, a related technical field involving software or systems engineering, or equivalent practical experience.
- Experience writing code in Java, Scala, Go, Python, or a similar language.
Strong scripting skills (e.g., Bash and shell) for automation tasks.
- Experience with algorithms, data structures, complexity analysis, and software design: Solid understanding of how to build scalable and efficient systems.
- Basic SQL (MySQL, PostgreSQL, or similar): Strong understanding of traditional relational databases like MySQL or PostgreSQL.
Ability to write queries, perform joins, use aggregate functions, and optimize basic SQL queries.
- Systems and Infrastructure: Knowledge of Linux/Unix systems, as most infrastructure is based on Linux.
Familiarity with system internals, networking, and resource management (memory, CPU, storage).
- Hands-on experience with observability tools such as Prometheus, Grafana, & OpenTSDB: For monitoring, logging, and real-time performance tracking.
- CI/CD Tools: Familiarity with Continuous Integration/Continuous Deployment pipelines and tools (e.g., Jenkins, GitLab CI).
Preferred Qualifications:
- Hands-on experience with distributed processing frameworks like Apache Spark and Apache Flink: Expertise in using these frameworks for large-scale data processing and stream processing.
- Familiarity with Apache Kafka and Apache Hive integrations: Experience with these tools in the context of data pipelines and stream processing.
- HiveSQL Knowledge: Solid understanding of HiveSQL (Hive Query Language - HQL), including the ability to write, optimize, and debug queries on large datasets stored in Hadoop-based systems (like Hive, HDFS, or similar).
- HiveSQL with Kafka Stream Integration: Knowledge and experience in integrating HiveSQL with Kafka Streams or Kafka Connect for real-time data processing.
- Experience running production-grade services at scale: Understanding of cloud-native technologies, networking, and storage management to support high-availability and large-scale environments.
- Experience developing tools and APIs: To reduce manual intervention in systems administration and improve automation and operational efficiency.
- Expertise in designing, analyzing, and troubleshooting large-scale systems: Experience with Hadoop, Spark, Hive, Presto, Kafka, Flink, or comparable solutions is a strong plus.

Data Ingestion Platform Engineer

6 days ago

Sydney, New South Wales, Australia beBeeDataIngestion Full time $150,000 - $200,000

Job Summary: We are seeking a skilled Data Ingestion SRE to join our team and take ownership of managing the services and infrastructures in one of the world's largest data platforms that directly supports the TikTok app.
Data Ingestion SRE, Data Platform

1 week ago

Sydney, New South Wales, Australia TikTok Full time $104,000 - $130,878 per year

Technology. Data Ingestion SRE, Data Platform - USDS. Location: Sydney. Employment Type: Regular. Job Code: A91491A. Responsibilities: Site Reliability Engineering (SRE) combines software and systems engineering to build and run large-scale, massively distributed services and infrastructures. As a site reliability engineer in the data platform area, you will have the...
Data Ingestion SRE, Data Platform

5 days ago

Sydney, New South Wales, Australia TikTok Full time

Responsibilities: End-to-End Service Lifecycle Management: Participate in and continuously improve the full lifecycle of services, from initial design and development to deployment, ongoing operation, and iterative optimization. Ensure Reliability and Scalability: Maintain highly reliable, fault-tolerant, and scalable systems that are cost-effective and meet...
Data Ingestion SRE, Data Platform

4 days ago

Sydney, New South Wales, Australia TikTok Full time

Responsibilities: End-to-End Service Lifecycle Management: Participate in and continuously improve the full lifecycle of services, from initial design and development to deployment, ongoing operation, and iterative optimization. Ensure Reliability and Scalability: Maintain highly reliable, fault-tolerant, and scalable systems that are cost-effective and meet...
Data Ingestion Expert

2 weeks ago

Sydney, New South Wales, Australia beBeeSolution Full time $180,000 - $220,000

Job Title: Solution Architect - Data Ingestion and Processing. Job Overview: This is a key role for an experienced Solution Architect who can develop and implement data ingestion and processing solutions with expertise in Rust, focusing on massive volumes of sensor data. Key Responsibilities: Ingesting sensor data through batch and real-time offload systems. Designing...
Specialist in Large-Scale Data Ingestion

1 week ago

Sydney, New South Wales, Australia beBeeData Full time $140,000 - $150,000

Data Pipelining Specialist. We are passionate about innovation and problem-solving. Our platform automates payroll compliance and employee entitlement verification to ensure accurate pay for workers. As a Data Engineer, you'll design and optimize data pipelines for ingestion and identify trends in data platforms to inform future...
Data Engineering Platform Lead

1 week ago

Sydney, New South Wales, Australia beBeeEngineering Full time $150,000 - $200,000

Job Description: Optiver is a global market maker with offices worldwide. We are a leading liquidity provider committed to improving the market through competitive pricing, execution and risk management. We are looking for an experienced software engineer and team leader to lead our APAC Data Engineering Platform team. The team develops Optiver's...
Chief Data Service Architect

3 days ago

Sydney, New South Wales, Australia beBeeDataSRE Full time

Job Overview: We are seeking a highly skilled Data Ingestion SRE to join our team. As a key member of our data platform, you will be responsible for the end-to-end service lifecycle management of data services. Lifecycle Management: Participate in and continuously improve the full lifecycle of services, from initial design and development to deployment,...
Expert Data Ingestion Specialist

2 weeks ago

Sydney, New South Wales, Australia beBeeData Full time $90,000 - $150,000

At Yellow Canary, we're passionate about innovation and problem-solving in the complex world of payroll compliance. Our platform automates this process for some of Australia's largest employers, ensuring accurate pay for their workers. About the Role: We are seeking a skilled Data Engineer to design, build, and optimize data pipelines for ingestion. You'll...
Chief Data Service Architect

4 days ago

Sydney, New South Wales, Australia beBeeDataSRE Full time $120,000 - $240,000

Job Overview: We are seeking a highly skilled Data Ingestion SRE to join our team. As a key member of our data platform, you will be responsible for the end-to-end service lifecycle management of data services. Lifecycle Management: Participate in and continuously improve the full lifecycle of services, from initial design and development to deployment,...
