Overview

Unity ML-Agents is an open-source toolkit developed by Unity Technologies that bridges the gap between the Unity real-time 3D development platform and machine learning frameworks. First released in 2017, the toolkit provides tools to train intelligent agents using reinforcement learning, imitation learning, and other machine learning methods within Unity environments. It is designed for developers, researchers, and AI practitioners looking to create complex, adaptive behaviors for virtual characters, robots, or autonomous systems.

The core of ML-Agents is the ability to build simulation environments in Unity in which agents act and learn. These environments range from simple physics puzzles to multi-agent combat scenarios and robotic manipulation tasks. The toolkit exposes a Python API: the C# components inside Unity handle simulation and agent interaction, while training runs in a separate Python process. The bundled trainers are built on PyTorch in recent releases (earlier releases used TensorFlow), and the same API can feed custom algorithms. This separation allows rapid iteration on both environment design and learning algorithms.
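
As a rough illustration of that split, the sketch below drives an environment from Python with random actions. It assumes the low-level mlagents_envs package that ships with the toolkit (UnityEnvironment, behavior_specs, get_steps, set_actions, step). The helper name run_random_steps is hypothetical, and the exact API can differ between releases, so treat this as a sketch rather than the canonical training loop.

```python
def run_random_steps(env, n_steps=100):
    """Step an ML-Agents-style environment with random actions.

    `env` is assumed to expose the mlagents_envs low-level API
    (reset, behavior_specs, get_steps, set_actions, step).
    Returns the total reward collected across all agents.
    """
    env.reset()
    # Use the first behavior registered by the Unity scene
    behavior_name = list(env.behavior_specs)[0]
    spec = env.behavior_specs[behavior_name]
    total_reward = 0.0
    for _ in range(n_steps):
        # decision_steps: agents waiting for an action
        # terminal_steps: agents whose episode just ended
        decision_steps, terminal_steps = env.get_steps(behavior_name)
        total_reward += float(sum(decision_steps.reward))
        total_reward += float(sum(terminal_steps.reward))
        # Sample one random action per agent that requested a decision
        action = spec.action_spec.random_action(len(decision_steps))
        env.set_actions(behavior_name, action)
        env.step()  # advance the Unity simulation
    return total_reward


# Usage against a running editor (assumes `pip install mlagents` and that
# you press Play in Unity when prompted):
#
#   from mlagents_envs.environment import UnityEnvironment
#   env = UnityEnvironment(file_name=None)  # None = connect to the editor
#   print(run_random_steps(env, n_steps=200))
#   env.close()
```

Keeping the loop in a function that only depends on the environment's methods makes it easy to exercise with a stub object before a Unity scene exists.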

Unity ML-Agents is well suited to applications that rely on simulated learning, such as developing AI for video games, prototyping robotic control systems in a virtual space, or conducting academic research in reinforcement learning. Because training can run inside the Unity editor, agent behavior and environment dynamics can be watched as they evolve. For example, a game developer might use ML-Agents to train non-player characters (NPCs) to exhibit realistic pathfinding, combat strategies, or even social interactions without manually scripting every possible action. Similarly, robotics engineers can simulate complex tasks like grasping or navigation, reducing the need for extensive physical hardware testing during early development phases. Per the official Unity ML-Agents overview documentation, the toolkit supports vector and visual observations and both discrete and continuous action spaces, which covers a wide range of agent learning problems.

Key features

  • Unity Environment Integration: Seamlessly integrates with the Unity editor, allowing developers to design 3D environments and agents directly within the familiar Unity interface for training and simulation.
  • Python API: Provides a Python interface for running training and for stepping environments from custom code. The bundled trainers use PyTorch in recent releases (earlier releases used TensorFlow), so algorithm development can happen outside Unity.
  • Reinforcement Learning Algorithms: Includes implementations of Proximal Policy Optimization (PPO) and Soft Actor-Critic (SAC), per the PPO training guide.
  • Imitation Learning: Supports imitation learning methods such as Behavioral Cloning, where agents learn from expert demonstrations, reducing the need for extensive reward function design in certain scenarios.
  • Curriculum Learning: Facilitates curriculum learning, allowing complex tasks to be broken down into simpler stages, progressively increasing difficulty to aid agent learning and stability.
  • Multi-Agent Support: Trains multiple agents concurrently in the same environment, in cooperative or adversarial setups, and provides tools for analyzing their interactions.
  • Visual Observation Capabilities: Agents can process visual inputs from Unity cameras, allowing them to learn from raw pixel data, which is crucial for tasks requiring environmental perception.
  • Analytics and Monitoring: Integrates with TensorBoard for monitoring training progress, cumulative rewards, and other metrics, providing insight into the learning process.
  • Open-Source: The entire toolkit is open-source, allowing for community contributions, modifications, and transparency in its development and implementation details.
  • Robotics Simulation: Offers features relevant to robotics, such as physics-based environments and control interfaces, making it suitable for training robotic arms, autonomous vehicles, and other physical systems in simulation.
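
The curriculum learning idea listed above can be sketched in a few lines: keep a window of recent episode rewards and move to the next, harder lesson once their mean clears a threshold. This is a framework-agnostic illustration, not ML-Agents' own implementation; the class name Curriculum, the thresholds, and the window size are all hypothetical.

```python
from collections import deque


class Curriculum:
    """Advance through lessons when the mean recent reward clears a threshold."""

    def __init__(self, thresholds, window=3):
        # thresholds[i] is the mean reward needed to leave lesson i
        self.thresholds = list(thresholds)
        self.window = window
        self.lesson = 0
        self.recent = deque(maxlen=window)

    def record(self, episode_reward):
        """Record one episode's reward and return the current lesson index."""
        self.recent.append(episode_reward)
        on_final_lesson = self.lesson >= len(self.thresholds)
        window_full = len(self.recent) == self.window
        if not on_final_lesson and window_full:
            mean_reward = sum(self.recent) / self.window
            if mean_reward >= self.thresholds[self.lesson]:
                self.lesson += 1
                self.recent.clear()  # measure the new lesson from scratch
        return self.lesson


curriculum = Curriculum(thresholds=[0.5, 0.8], window=3)
for reward in [0.2, 0.6, 0.9, 0.9, 0.9, 0.9]:
    lesson = curriculum.record(reward)
print(lesson)  # 2
```

In ML-Agents itself, curricula are declared in the trainer configuration through environment parameters and completion criteria rather than in hand-written Python, so the sketch is only meant to show the underlying rule.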

Pricing

Unity ML-Agents is an entirely open-source project and is available for use without licensing fees. Its components can be integrated into Unity projects under a permissive license.

Feature         Details
Software cost   Free and open-source
License         Apache License 2.0 (as of 2026-05-28)
Support         Community-driven via GitHub and forums

For detailed licensing information, developers can refer to the official Unity ML-Agents documentation.

Common integrations

  • TensorFlow & PyTorch: The Python API lets deep learning code step Unity environments and collect experience. The bundled trainers use PyTorch in recent releases (earlier releases used TensorFlow).
  • TensorBoard: Integrates with TensorBoard for visualizing training metrics, agent rewards, and other diagnostic information during the learning process.
  • OpenAI Gym: ML-Agents provides its own environment API and also ships a wrapper that exposes a Unity environment through the Gym interface, which keeps it familiar to researchers coming from external RL frameworks.
  • Unity Editor: The toolkit is built to work within the Unity development environment, leveraging its scene creation, physics engine, and rendering capabilities.

Alternatives

  • OpenAI Gym: A toolkit for developing and comparing reinforcement learning algorithms, offering a collection of environments.
  • Ray RLlib: An open-source library for reinforcement learning that supports a wide range of algorithms and integrations with various simulation environments, including a Unity-specific integration guide.
  • Stable Baselines3: A set of state-of-the-art reinforcement learning algorithms implemented in PyTorch, providing a high-quality codebase for research and development.
  • DeepMind Lab: A 3D platform for agent-based AI research, offering a suite of challenging navigation and puzzle-solving tasks.
  • Isaac Gym: NVIDIA's GPU-accelerated simulation environment for robotic learning, focusing on high-throughput simulation for reinforcement learning.

Getting started

To begin with Unity ML-Agents, you first need a Unity project set up. This example demonstrates a basic Unity environment with an agent and a Python script to train it to reach a target.

Step 1: Set up Unity Project

  1. Create a new 3D Unity project.
  2. Open the Package Manager (Window > Package Manager).
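  3. Click the '+' icon, select "Add package by name..." (older editor versions accept the name in the "Add package from git URL..." field), and enter com.unity.ml-agents. Install the package.
  4. Create a floor (for example, a Plane), a target object, and a GameObject for the agent (for example, a Sphere) named "Agent". Add a Rigidbody component to "Agent"; the script below uses it to move.
  5. Create a C# script (e.g., MyAgent.cs) with the code below and attach it to the "Agent" GameObject; a Behavior Parameters component is added alongside it. Assign the target's Transform to the script's Target field.
  6. In Behavior Parameters, set the Behavior Name (e.g., "MyAgent"), the vector observation Space Size to 9, and the number of Continuous Actions to 2. Add a Decision Requester component so the agent requests actions at a regular interval.
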
using Unity.MLAgents;
using Unity.MLAgents.Actuators;
using Unity.MLAgents.Sensors;
using UnityEngine;

public class MyAgent : Agent
{
    public Transform Target;
    public float moveSpeed = 1f;

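    public override void OnEpisodeBegin()
    {
        // If the agent fell off the platform, zero its momentum and put it back
        if (this.transform.localPosition.y < -0.1f)
        {
            Rigidbody body = GetComponent<Rigidbody>();
            body.angularVelocity = Vector3.zero;
            body.velocity = Vector3.zero;
            this.transform.localPosition = new Vector3(0, 0.5f, 0);
        }
        // Move the target to a random spot on the platform
        Target.localPosition = new Vector3(Random.value * 8 - 4, 0.5f, Random.value * 8 - 4);
    }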

    public override void CollectObservations(VectorSensor sensor)
    {
        // Agent and Target positions
        sensor.AddObservation(Target.localPosition);
        sensor.AddObservation(this.transform.localPosition);

        // Agent velocity (3 floats, for 9 observations in total)
        sensor.AddObservation(GetComponent<Rigidbody>().velocity);
    }

    public override void OnActionReceived(ActionBuffers actionBuffers)
    {
        // Move the agent using direct actions
        Vector3 controlSignal = Vector3.zero;
        controlSignal.x = actionBuffers.ContinuousActions[0];
        controlSignal.z = actionBuffers.ContinuousActions[1];
        GetComponent<Rigidbody>().AddForce(controlSignal * moveSpeed);

        // Rewards
        float distanceToTarget = Vector3.Distance(this.transform.localPosition, Target.localPosition);

        // Reached target
        if (distanceToTarget < 1.4f)
        {
            SetReward(1.0f);
            EndEpisode();
        }
        // Fell off platform (else-if so an episode is never ended twice in one step)
        else if (this.transform.localPosition.y < -0.1f)
        {
            SetReward(-1.0f);
            EndEpisode();
        }
    }

    public override void Heuristic(in ActionBuffers actionsOut)
    {
        var continuousActionsOut = actionsOut.ContinuousActions;
        continuousActionsOut[0] = Input.GetAxis("Horizontal");
        continuousActionsOut[1] = Input.GetAxis("Vertical");
    }
}
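
Before wiring the script into a scene, it can help to check the numbers it implies. The plain-Python sketch below roughly mirrors the observation layout and reward rule of the C# agent above. It does not touch Unity, and the function names build_observation and step_outcome are hypothetical. The main takeaway is that the observation vector holds 9 floats (two positions plus one velocity), which is the value the Behavior Parameters "Space Size" field would need to match.

```python
import math


def build_observation(target_pos, agent_pos, agent_velocity):
    """Concatenate observations in the same order as CollectObservations."""
    return [*target_pos, *agent_pos, *agent_velocity]


def step_outcome(agent_pos, target_pos, reach_radius=1.4, fall_height=-0.1):
    """Return (reward, episode_done) using the thresholds from the C# script.

    The target check runs first, so reaching the target takes priority.
    """
    if math.dist(agent_pos, target_pos) < reach_radius:
        return 1.0, True   # reached the target
    if agent_pos[1] < fall_height:
        return -1.0, True  # fell off the platform
    return 0.0, False      # keep going


obs = build_observation((2.0, 0.5, -1.0), (0.0, 0.5, 0.0), (0.0, 0.0, 0.0))
print(len(obs))                                         # 9
print(step_outcome((0.0, 0.5, 0.0), (1.0, 0.5, 0.0)))   # (1.0, True)
print(step_outcome((6.0, -0.5, 0.0), (1.0, 0.5, 0.0)))  # (-1.0, True)
```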

Step 2: Install ML-Agents Python Package

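Make sure you have a Python version supported by your ML-Agents release. Recent releases target Python 3.10, while older ones accepted 3.6 or later, so check the installation guide for the exact range. Then, install the ML-Agents package, ideally inside a virtual environment: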

pip install mlagents
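
To confirm what actually got installed, a small standard-library check like the one below can be run in the same environment. It is only a convenience sketch: the helper name installed_version is hypothetical, and the package names queried (mlagents, mlagents-envs, torch) are assumed to be the distributions that the pip command pulls in.

```python
import sys
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


print("Python", ".".join(map(str, sys.version_info[:3])))
for name in ("mlagents", "mlagents-envs", "torch"):
    print(name, installed_version(name) or "not installed")
```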

Step 3: Train the Agent

Navigate to your Unity project's root directory in the terminal. Run the training command:

mlagents-learn config/ppo/MyAgent.yaml --run-id=MyAgentRun --time-scale=10

You'll need a MyAgent.yaml configuration file in a config/ppo/ directory (relative to where you run the command). A basic configuration might look like this:

behaviors:
  MyAgent:
    trainer_type: ppo
    hyperparameters:
      batch_size: 10
      buffer_size: 100
      learning_rate: 3.0e-4
      epsilon: 0.2
      beta: 5.0e-4
      lambd: 0.99
      num_epoch: 3
    network_settings:
      hidden_units: 128
      num_layers: 2
      normalize: false
    reward_signals:
      extrinsic:
        gamma: 0.99
        strength: 1.0
    max_steps: 500000
    time_horizon: 64
    summary_freq: 10000
    threaded: true
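
Several of these hyperparameters map onto standard PPO quantities. The sketch below is a textbook-style illustration, not code taken from ML-Agents: gamma and lambd control how generalized advantage estimation discounts future rewards, and epsilon bounds how far the policy's probability ratio may move in one update. The function names are hypothetical, and the episode is assumed to have ended, so the value after the last step is taken as 0.

```python
def gae_advantages(rewards, values, gamma=0.99, lambd=0.99):
    """Generalized advantage estimates for one finished episode.

    `values[t]` is the critic's estimate at step t; the value after the
    final step is taken as 0 because the episode has ended.
    """
    advantages = [0.0] * len(rewards)
    running = 0.0
    for t in reversed(range(len(rewards))):
        next_value = values[t + 1] if t + 1 < len(values) else 0.0
        delta = rewards[t] + gamma * next_value - values[t]  # TD error
        running = delta + gamma * lambd * running
        advantages[t] = running
    return advantages


def clipped_surrogate(ratio, advantage, epsilon=0.2):
    """PPO's clipped objective for a single sample (to be maximized)."""
    clipped_ratio = max(1.0 - epsilon, min(1.0 + epsilon, ratio))
    return min(ratio * advantage, clipped_ratio * advantage)


print(gae_advantages([0.0, 0.0, 1.0], [0.0, 0.0, 0.0], gamma=0.5, lambd=1.0))
# [0.25, 0.5, 1.0]
print(clipped_surrogate(ratio=1.5, advantage=2.0))   # 2.4
print(clipped_surrogate(ratio=1.5, advantage=-2.0))  # -3.0
```

The remaining beta entry sets the strength of entropy regularization, which encourages exploration early in training.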

Once mlagents-learn reports that it is listening and prompts you to start the environment, click the "Play" button in the Unity editor to start the simulation. The Python trainer will connect to the Unity environment and begin training the agent. You can monitor training progress with TensorBoard by running tensorboard --logdir results in another terminal and opening the displayed URL.