Training Your Own Agent in Unity ML-Agents
This guide walks you through how to run machine learning training on an agent inside Unity using the ML-Agents Toolkit. You’ll train an agent to learn behavior through reinforcement learning—where it improves by trial and error using reward feedback.
Before starting, make sure you've already set up the Python environment using venv.
Step 1: Activate the Python Environment
You must activate the virtual environment before training. Assuming the environment folder is named venv and sits in your current working directory, run the command for your platform:
Windows PowerShell:
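```powershell
.\venv\Scripts\Activate.ps1
```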
macOS/Linux:
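```bash
source venv/bin/activate
```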
If you see `(venv)` in your terminal prompt, you're ready.
Step 2: Locate the Training Configuration File
Each example environment in ML-Agents has a corresponding `.yaml` config file that defines:
- The training algorithm (PPO, SAC, etc.)
- Learning rate
- Batch size
- Network architecture
- Reward signals
For 3DBall, the config ships with the ML-Agents repository; in recent releases it lives at the path below (older versions may place it elsewhere):
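```
config/ppo/3DBall.yaml
```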
You can open it in any text editor if you're curious, but no changes are needed to begin. Here is a representative excerpt from the 3DBall config; the exact keys and values may differ between ML-Agents releases:
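```yaml
behaviors:
  3DBall:
    trainer_type: ppo
    hyperparameters:
      batch_size: 64
      buffer_size: 12000
      learning_rate: 0.0003
    network_settings:
      normalize: true
      hidden_units: 128
      num_layers: 2
    reward_signals:
      extrinsic:
        gamma: 0.99
        strength: 1.0
    max_steps: 500000
    time_horizon: 1000
    summary_freq: 12000
```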
Step 3: Configure Unity Editor for Training
Before you start training, make sure Unity is configured properly. These settings allow ML-Agents to communicate with the Python trainer and ensure smooth simulation.
1. Open the 3DBall Scene
In Unity:
- Navigate to: `Assets/ML-Agents/Examples/3DBall/Scenes/`
- Double-click `3DBall.unity` to open it
2. Select the Agent
In the Hierarchy, expand one of the 3DBall GameObjects and select its Agent child GameObject. The 3DBall GameObject is the platform-and-ball combination that the agent controls.
3. Configure the Behavior Parameters
With the Agent selected, go to the Inspector and find the Behavior Parameters component.
Set the following:
| Setting | Value |
| --- | --- |
| Behavior Name | 3DBall |
| Vector Observation | 8 (already set) |
| Action Type | Continuous |
| Actions | Size = 2 |
| Behavior Type | Default |
| Model | Select None to leave it blank while training |
4. Add a Decision Requester Script (if missing)
Still on the Agent GameObject, make sure it has a Decision Requester component.
If it's missing:
- Click Add Component
- Search for Decision Requester and add it
- Set:
  - Decision Period = 5 (default)
  - Take Actions Between Decisions = ✅ (checked)
This tells Unity when to ask the Python policy for a new action.
5. Time Scale (Optional)
To make training faster:
- In the top-right corner of the Game window, find the Time Scale setting (click the gear ⚙️ if needed)
- Increase it to something like 10 or 20
This speeds up the physics simulation without affecting training accuracy.
Step 4: Start the Training Process
Now you’ll launch the training script that connects Unity to the ML training backend (using PyTorch).
Run the following in the terminal (this assumes the config path shown in Step 2; adjust it if your file lives elsewhere):
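```bash
mlagents-learn config/ppo/3DBall.yaml --run-id=My3DBallRun
```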
What this command does:
- It loads the `3DBall.yaml` training configuration
- It creates a new training run ID called `My3DBallRun`
- It prepares to receive simulation data from Unity
You’ll see a message similar to this (the exact wording and port vary by ML-Agents version):
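```
[INFO] Listening on port 5004. Start training by pressing the Play button in the Unity Editor.
```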
Step 5: Press Play in Unity to Begin Training
- Return to the Unity Editor
- Ensure the `3DBall` scene is open
- Click the Play ▶️ button at the top of the Unity window
You should see:
- The platforms begin moving
- Ball movement looks clumsy at first (because the agent is untrained)
- Your terminal will begin printing training progress (reward scores, step count, etc.)
The agent is now learning by interacting with the environment and adjusting its neural network weights using reinforcement learning.
Step 6: Monitor the Training Progress
In the terminal, you’ll see output along these lines (the numbers below are illustrative and the exact format depends on your ML-Agents version):
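```
[INFO] 3DBall. Step: 12000. Time Elapsed: 52.1 s. Mean Reward: 0.941. Std of Reward: 0.736. Training.
[INFO] 3DBall. Step: 24000. Time Elapsed: 103.8 s. Mean Reward: 1.712. Std of Reward: 0.421. Training.
```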
- "Mean Reward" tells you how well the agents are doing
- Over time, the value should rise from near zero to ~1.8+ (perfect balance)
Training typically takes 5–15 minutes depending on your computer and settings.
Step 7: Stop Training and Save the Model
You can stop training early by pressing Ctrl + C in the terminal.
The model is saved automatically to the run's results folder, which (with the run ID above and default settings) is:
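```
results/My3DBallRun/
```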
Inside, you’ll find:
- `My3DBallRun.onnx` → the trained neural network model
- Training logs (for TensorBoard)
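To view those logs, you can launch TensorBoard from the activated environment (assuming it was installed alongside ML-Agents) and point it at the results folder:

```bash
tensorboard --logdir results
```

Then open the URL it prints (by default http://localhost:6006) in a browser to watch the reward curves.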
Step 8: Use Your Trained Model in Unity
Now you'll plug in your custom-trained `.onnx` model and test it inside Unity.
How to do it:
- Drag `My3DBallRun.onnx` from `results/` into a folder inside your Unity project (for example, the `Assets/ML-Agents/Examples/3DBall/TFModels/` folder used by the 3DBall example)
- In Unity, select the Ball3DAgent GameObject (in the Hierarchy)
- In the Behavior Parameters component:
  - Under Model, drag in your new `.onnx` file
  - Set Behavior Type to: Inference Only
- Press Play in Unity
You should now see a competent agent balancing the ball!
Optional Step 9: Experiment
Now that you’ve trained your own agent, try these:
- Modify the reward function (in the C# Agent script)
- Add obstacles to the environment
- Change the `.yaml` training parameters (batch size, hidden layers, etc.); a sketch follows this list
- Increase Unity's Time Scale to speed up simulation
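Here is the sketch referenced above: a copy of the 3DBall config with a few hyperparameters changed. The values are hypothetical illustrations, not tuned recommendations, and any keys you omit fall back to the trainer defaults:

```yaml
behaviors:
  3DBall:
    trainer_type: ppo
    hyperparameters:
      batch_size: 128       # larger batches than 3DBall's usual 64
      learning_rate: 0.0001 # slower, more stable updates
    network_settings:
      hidden_units: 256     # wider hidden layers
      num_layers: 3         # one extra layer
```

Save the edited copy under a new name and pass that file to mlagents-learn instead of the original.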
Rerun Training with Different Names
To run another training session, just change the `--run-id`; for example (the new run ID below is arbitrary):
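```bash
mlagents-learn config/ppo/3DBall.yaml --run-id=My3DBallRun2
```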
This keeps your previous model safe and avoids overwriting results.