Expert In Reinforcement Learning

4 days ago


Sydney, New South Wales, Australia beBeeReinforcement Full time

About the Role
As a seasoned expert in Reinforcement Learning, you will be responsible for developing and optimizing RL models for large-scale enterprise applications such as customer service, token reporting, compliance, and Web3 domain reasoning.
This role requires a strong theoretical foundation in RL, covering policy optimization, reward modeling, and planning, paired with engineering skills to build scalable production systems. You will take full ownership from research through deployment, driving experimentation with systematic evaluation and benchmarking.
Responsibilities:
• Research and develop cutting-edge RL algorithms, focusing on large model optimization and alignment techniques.

• Design and implement RL training pipelines, including environment simulation, data generation, and reward function design.

• Apply RL methods to enhance the reasoning, planning, and autonomous decision-making capabilities of Large Language Models (LLMs), Vision-Language Models (VLMs), and Agentic AI.

• Collaborate with engineers and researchers to integrate RL solutions into enterprise AI platforms.

• Monitor model performance in production and continuously improve it through iterative training and fine-tuning.
Requirements:
• Master's degree in Computer Science, Applied Mathematics, Machine Learning, or a related field.

• 3+ years of hands-on experience in RL or LLM/VLM/Agentic AI optimization.

• Strong coding skills in Python, with experience in ML frameworks and RL libraries.

• Experience with large-scale distributed training and optimization.

• Self-driven, with an ownership mindset and strong problem-solving skills.

• Excellent communication skills for cross-functional collaboration.
Why Join Us

• Shape the future with our leading blockchain ecosystem

• Collaborate with world-class talent in a user-centric global organization with a flat structure

• Tackle unique, fast-paced projects with autonomy in an innovative environment

• Thrive in a results-driven workplace with opportunities for career growth and continuous learning

• Competitive salary and company benefits

• Work-from-home arrangement (the arrangement may vary depending on the work nature of the business team)
We are committed to being an equal opportunity employer. We believe that having a diverse workforce is fundamental to our success.


