Training and Simulation
Train and simulate reinforcement learning agents
During training, the agent continuously updates its parameters to learn the optimal policy for a given environment. During simulation, the agent receives observations and a reward from the environment, and returns an action to the environment without updating its parameters.
Reinforcement Learning Toolbox™ provides functions for training agents and validating the training results through simulation. For an introduction to training and simulating agents, see Train Reinforcement Learning Agents.
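The difference between training and simulation can be illustrated with a minimal sketch. The Python code below is not toolbox code: the corridor environment, the tabular Q-learning update, and the names `step`, `train`, and `simulate` are all assumptions made for illustration. It shows only the distinction described above, namely that the training loop modifies the agent's parameters (here, a Q-table) while the simulation loop reads them without changing them.

```python
import random

N_STATES = 5          # hypothetical corridor with cells 0..4; cell 4 is the goal
ACTIONS = (-1, +1)    # move left or move right


def step(state, action):
    """Toy environment: return (next_state, reward, done)."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else -0.01), done


def greedy(q, state):
    """Index of the highest-valued action in the given state."""
    return max(range(len(ACTIONS)), key=lambda a: q[state][a])


def train(q, episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2,
          max_steps=50, seed=0):
    """Training: act with exploration and UPDATE the Q-table in place."""
    rng = random.Random(seed)
    for _ in range(episodes):
        state = 0
        for _ in range(max_steps):
            if rng.random() < epsilon:
                a = rng.randrange(len(ACTIONS))   # explore
            else:
                a = greedy(q, state)              # exploit
            next_state, reward, done = step(state, ACTIONS[a])
            # One-step Q-learning target; no bootstrapping from a terminal state.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
            if done:
                break
    return q


def simulate(q, max_steps=50):
    """Simulation: act greedily and only READ the Q-table."""
    state, total_reward = 0, 0.0
    for _ in range(max_steps):
        state, reward, done = step(state, ACTIONS[greedy(q, state)])
        total_reward += reward
        if done:
            break
    return total_reward


q_table = [[0.0, 0.0] for _ in range(N_STATES)]
train(q_table)                 # parameters change here
print(simulate(q_table))       # parameters are left untouched here
```

Keeping the two loops separate means a trained agent can be evaluated repeatedly without the evaluation itself altering what the agent has learned.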
Apps
| Reinforcement Learning Designer | Design, train, and simulate reinforcement learning agents (Since R2021a) | 
Topics
Training and Simulation Basics
- Train Reinforcement Learning Agents
 Find the optimal policy by training your agent within a specified environment.
- Train Reinforcement Learning Agent in Basic Grid World
 Train Q-learning and SARSA agents to solve a grid world in MATLAB®.
- Train Reinforcement Learning Agent in MDP Environment
 Train a reinforcement learning agent in a generic Markov decision process environment.
Use the Reinforcement Learning Designer App
- Specify Training Options in Reinforcement Learning Designer
 Interactively specify options for training reinforcement learning agents using the Reinforcement Learning Designer app.
- Specify Simulation Options in Reinforcement Learning Designer
 Interactively specify options for simulating reinforcement learning agents using the Reinforcement Learning Designer app.
- Design and Train Agent Using Reinforcement Learning Designer
 Design and train a DQN agent for a cart-pole system using the Reinforcement Learning Designer app.
- Tune Hyperparameters Using Reinforcement Learning Designer
 Search the hyperparameter space using Reinforcement Learning Designer.
Train Agents for Simulink Environment
- Control Water Level in a Tank Using a DDPG Agent
 Train a controller using reinforcement learning with a plant modeled in Simulink® as the training environment.
Use Multiple Processes and GPUs
- Train Agents Using Parallel Computing and GPUs
 Accelerate agent training by running simulations in parallel on multiple cores, GPUs, clusters, or cloud resources.
- Train AC Agent to Balance Discrete Cart-Pole System Using Parallel Computing
 Train an AC agent to control a discrete action space cart-pole system using asynchronous parallel computing.
- Train DQN Agent for Lane Keeping Assist Using Parallel Computing
 Train a DQN agent for an automated driving application using parallel computing.
Training and Simulation Advanced
- Train PPO Agent with Curriculum Learning for a Lane Keeping Application
 Train a PPO agent for a lane keeping assist task by gradually increasing task complexity.
- Train DQN Agent Using Hindsight Experience Replay
 Train a DQN agent in a navigation environment with sparse rewards.
- Train Reinforcement Learning Agent Offline to Control Quanser QUBE Pendulum
 Train a TD3 agent offline to control a Quanser QUBE pendulum.
- Train Biped Robot to Walk Using Evolution Strategy-Reinforcement Learning Agents
 Train a TD3 agent using an evolution strategy.
- Create DQN Agent Using Deep Network Designer and Train Using Image Observations
 Create a reinforcement learning agent using the Deep Network Designer app from the Deep Learning Toolbox™.
Log Training Data and Tune Hyperparameters
- Log Training Data to Disk
 Log a variety of data to disk while training an agent.
- Train Agent or Tune Environment Parameters Using Parameter Sweeping
 Tune a DDPG agent using hyperparameter sweeping.
- Tune Hyperparameters Using Bayesian Optimization
 Tune reinforcement learning hyperparameters using Bayesian optimization.
- Configure Exploration for Reinforcement Learning Agents
 Use visualization to configure exploration in reinforcement learning agents.
Multi-Agent Training
- Train Multiple Agents to Perform Collaborative Task
 Train two continuous action space PPO agents to collaboratively move an object.
- Train Multiple Agents for Area Coverage
 Train three discrete action space PPO agents to explore a grid-world environment in a collaborative-competitive manner.
- Train Multiple Agents for Path Following Control
 Train a DQN and a DDPG agent to collaboratively perform adaptive cruise control and lane keeping assist to follow a path.
Develop Custom Agents and Training Algorithms
- Train Reinforcement Learning Policy Using Custom Training Loop
 Train a reinforcement learning policy using your own custom training loop.
- Create and Train Custom PG Agent
 Create a custom PG agent and train it using the built-in train function.
- Create and Train Custom LQR Agent
 Create a custom agent that solves an LQR problem and train it using the built-in train function.
- Custom PPO Training Loop With Random Network Distillation
 Use a custom training loop to train a custom PPO policy with random network distillation on a pendulum environment with sparse rewards.
- Custom Training Loop with Simulink Action Noise
 Use a custom training loop to train a continuous action space reinforcement learning policy in Simulink when action noise is generated within the model.
Train Model-Based Policy Optimization Agents
- Train MBPO Agent to Balance Continuous Cart-Pole System
 A model-based reinforcement learning agent learns a model of its environment that it can use to generate additional experiences for training.
- Model-Based Reinforcement Learning Using Custom Training Loop
 Create a model-based reinforcement learning agent using a custom training loop.