AI Reinforcement Learning Model Optimization Specialist

2 weeks ago


Sydney, New South Wales, Australia beBeeData Full time $120,000 - $180,000

Binance is a leading global blockchain ecosystem behind the world's largest cryptocurrency exchange by trading volume and registered users. We are trusted by over 280 million people in 100+ countries for our industry-leading security, user fund transparency, trading engine speed, deep liquidity, and an unmatched portfolio of digital-asset products.

About the Role

Our team seeks a skilled Data Scientist to develop and optimize Reinforcement Learning (RL) models for enterprise-scale applications. The successful candidate will explore advanced algorithms including PPO, GRPO, DPO, RLHF, RLAIF, and Agentic RL to enhance the capabilities of Large Language Models (LLMs), Visual Language Models (VLMs), and Agentic AI.


Key Responsibilities:
  • Research and develop state-of-the-art RL algorithms focusing on large model optimization and alignment techniques.
  • Design and implement RL training pipelines including environment simulation, data generation, and reward function design.
  • Apply RL methods to enhance LLM/VLM/Agentic AI capabilities in reasoning, planning, and autonomous decision-making.
  • Collaborate with engineers and researchers to integrate RL solutions into enterprise AI platforms.
  • Monitor model performance in production and continuously improve through iterative training and fine-tuning.
Requirements:
  • Master's degree in Computer Science, Applied Mathematics, Machine Learning, or a related field.
  • 3+ years of hands-on experience in RL or LLM/VLM/Agentic AI optimization.
  • Strong coding skills in Python, with experience in ML frameworks and RL libraries.
  • Experience with large-scale distributed training and optimization.
  • Self-driven, with an ownership mindset and strong problem-solving skills.
  • Excellent communication skills for cross-functional collaboration.

Why this Opportunity?


  • Shape the future with the world's leading blockchain ecosystem.
  • Collaborate with world-class talent in a user-centric global organization.
  • Tackle unique, fast-paced projects with autonomy in an innovative environment.
  • Thrive in a results-driven workplace with opportunities for career growth and continuous learning.
  • Competitive salary and company benefits.
  • Work-from-home arrangement (the arrangement may vary depending on the nature of the business team's work).

This opportunity offers a chance to drive experimentation with systematic evaluation and benchmarking.


