
Senior Data Engineer
1 week ago
Overview
Maincode is building sovereign AI models in Australia. We are training foundation models from scratch, designing new reasoning architectures, and deploying them on state-of-the-art GPU clusters. Our models are built on datasets we create ourselves, curated, cleaned, and engineered for performance at scale. This is not buying off-the-shelf corpora or scraping without thought. This is building world-class datasets from the ground up.
As a Senior Data Engineer, you will lead the design and construction of these datasets. You will work hands-on to source, clean, transform, and structure massive amounts of raw data into training-ready form. You will design the architecture that powers data ingestion, validation, and storage for multi-terabyte to petabyte-scale AI training. You will collaborate with AI Researchers and Engineers to ensure every byte is high quality, relevant, and optimised for training cutting-edge large language models and other architectures.
This is a deeply technical role. You will be writing code, building pipelines, defining schemas, and debugging unusual data edge cases at scale. You will think like both a data scientist and a systems engineer, designing for correctness, scalability, and future-proofing. If you want to build the datasets that power sovereign AI from first principles, this is your team.
What You'll Do
Design and build large-scale data ingestion and curation pipelines for AI training datasets
Source, filter, and process diverse data types including text, structured data, code, and multimodal data, from raw form to model-ready format
Implement robust quality control and validation systems to ensure dataset integrity, relevance, and ethical compliance
Architect storage and retrieval systems optimised for distributed training at scale
Build tooling to track dataset lineage, reproducibility, and metadata at all stages of the pipeline
Work closely with AI Researchers to align datasets with evolving model architectures and training objectives
Collaborate with DevOps and ML engineers to integrate data systems into large-scale training workflows
Continuously improve ingestion speed, preprocessing efficiency, and data freshness for iterative training cycles
Who You Are
Passionate about building world-class datasets for AI training, from raw source to training-ready form
Experienced in Python and data engineering frameworks such as Apache Spark, Ray, or Dask
Skilled in working with distributed data storage and processing systems such as S3, HDFS, or cloud object storage
Strong understanding of data quality, validation, and reproducibility in large-scale ML workflows
Familiar with ML frameworks like PyTorch or JAX, and how data pipelines interact with them
Comfortable working with multi-terabyte or larger datasets
Hands-on and pragmatic: you like solving real data problems with code and automation
Motivated to help build sovereign AI capability in Australia
Why Maincode
We are a small team building some of the most advanced AI systems in Australia. We create new foundation models from scratch, not just fine-tune existing ones, and we build the datasets they run on from the ground up. We operate our own GPU clusters, run large-scale training, and integrate research and engineering closely to push the frontier of what is possible.
You will be surrounded by people who:
Care deeply about data quality and architecture, not just volume
Build systems that scale reliably and repeatably
Take pride in learning, experimenting, and shipping
Want to help Australia build independent, world-class AI systems
Compensation Range: A$150K - A$180K
Seniority level: Mid-Senior level
Employment type: Full-time
Job function: Information Technology, Software Development