Senior Data Engineer
6 days ago
Overview
Maincode is building sovereign AI models in Australia. We are training foundation models from scratch, designing new reasoning architectures, and deploying them on state-of-the-art GPU clusters. Our models are built on datasets we create ourselves, curated, cleaned, and engineered for performance at scale. This is not buying off-the-shelf corpora or scraping without thought. This is building world-class datasets from the ground up.
As a Senior Data Engineer, you will lead the design and construction of these datasets. You will work hands-on to source, clean, transform, and structure massive amounts of raw data into training-ready form. You will design the architecture that powers data ingestion, validation, and storage for multi-terabyte to petabyte-scale AI training. You will collaborate with AI Researchers and Engineers to ensure every byte is high quality, relevant, and optimised for training cutting-edge large language models and other architectures.
This is a deeply technical role. You will be writing code, building pipelines, defining schemas, and debugging unusual data edge cases at scale. You will think like both a data scientist and a systems engineer, designing for correctness, scalability, and future-proofing. If you want to build the datasets that power sovereign AI from first principles, this is your team.
What You'll Do
- Design and build large-scale data ingestion and curation pipelines for AI training datasets
- Source, filter, and process diverse data types, including text, structured data, code, and multimodal data, from raw form to model-ready format
- Implement robust quality control and validation systems to ensure dataset integrity, relevance, and ethical compliance
- Architect storage and retrieval systems optimised for distributed training at scale
- Build tooling to track dataset lineage, reproducibility, and metadata at all stages of the pipeline
- Work closely with AI Researchers to align datasets with evolving model architectures and training objectives
- Collaborate with DevOps and ML engineers to integrate data systems into large-scale training workflows
- Continuously improve ingestion speed, preprocessing efficiency, and data freshness for iterative training cycles
Who You Are
- Passionate about building world-class datasets for AI training, from raw source to training-ready form
- Experienced in Python and data engineering frameworks such as Apache Spark, Ray, or Dask
- Skilled in working with distributed data storage and processing systems such as S3, HDFS, or cloud object storage
- Strong understanding of data quality, validation, and reproducibility in large-scale ML workflows
- Familiar with ML frameworks like PyTorch or JAX, and how data pipelines interact with them
- Comfortable working with multi-terabyte or larger datasets
- Hands-on and pragmatic; you like solving real data problems with code and automation
- Motivated to help build sovereign AI capability in Australia
Why Maincode
We are a small team building some of the most advanced AI systems in Australia. We create new foundation models from scratch, not just fine-tune existing ones, and we build the datasets they are trained on from the ground up.
We operate our own GPU clusters, run large-scale training, and integrate research and engineering closely to push the frontier of what is possible.
You Will Be Surrounded By People Who
- Care deeply about data quality and architecture, not just volume
- Build systems that scale reliably and repeatably
- Take pride in learning, experimenting, and shipping
- Want to help Australia build independent, world-class AI systems