

Create and configure reinforcement learning agents using common algorithms, such as SARSA, DQN, DDPG, and PPO.

A reinforcement learning agent receives observations and a reward from the environment. Using its policy, the agent selects an action based on the observations and reward, and sends the action to the environment. During training, the agent continuously updates the policy parameters based on the action, observations, and reward. Doing so allows the agent to learn the optimal policy for the given environment and reward signal.
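The loop described above can be sketched in a few lines. The following is a minimal, language-neutral illustration written in plain Python; it does not use the toolbox API. The environment (`ChainEnv`), the helper `epsilon_greedy`, the function `train_sarsa`, and all hyperparameter values are hypothetical and chosen only for illustration. The update shown is the standard tabular SARSA rule, used here as one example of how an agent can adjust its policy from observations, actions, and rewards.

```python
import random


class ChainEnv:
    """Hypothetical 5-state corridor: start in state 0, reward 1 on reaching state 4."""

    n_states = 5
    n_actions = 2  # 0 = move left, 1 = move right

    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):
        move = 1 if action == 1 else -1
        # Clamp the position so the agent cannot leave the corridor.
        self.state = min(max(self.state + move, 0), self.n_states - 1)
        done = self.state == self.n_states - 1
        reward = 1.0 if done else 0.0
        return self.state, reward, done


def epsilon_greedy(q, state, epsilon, rng):
    """Policy: mostly pick the highest-valued action, sometimes explore."""
    if rng.random() < epsilon:
        return rng.randrange(len(q[state]))
    best = max(q[state])
    # Break ties at random so an untrained agent tries both directions.
    return rng.choice([a for a, v in enumerate(q[state]) if v == best])


def train_sarsa(episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    env = ChainEnv()
    # Tabular action-value estimates, one row per state.
    q = [[0.0] * env.n_actions for _ in range(env.n_states)]
    for _ in range(episodes):
        state = env.reset()                      # agent receives an observation
        action = epsilon_greedy(q, state, epsilon, rng)
        done = False
        while not done:
            next_state, reward, done = env.step(action)  # action goes to the environment
            next_action = epsilon_greedy(q, next_state, epsilon, rng)
            # SARSA target: reward plus discounted value of the next state-action pair.
            target = reward + (0.0 if done else gamma * q[next_state][next_action])
            q[state][action] += alpha * (target - q[state][action])
            state, action = next_state, next_action
    return q


if __name__ == "__main__":
    for s, row in enumerate(train_sarsa()):
        print(s, [round(v, 3) for v in row])
```

In this sketch the value of moving right from the state next to the goal approaches 1, because that action always ends the episode with the reward. The values for the earlier states are expected to favor moving right as well, although the exact numbers depend on the seed and hyperparameters.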

Reinforcement Learning Toolbox™ software provides reinforcement learning agents that use several common algorithms, such as SARSA, DQN, DDPG, and PPO. You can also implement other agent algorithms by creating your own custom agents.

For more information, see Reinforcement Learning Agents. For details on defining policy representations, see Create Policies and Value Functions.


Reinforcement Learning Designer - Design, train, and simulate reinforcement learning agents



rlQAgent - Q-learning reinforcement learning agent
rlSARSAAgent - SARSA reinforcement learning agent
rlDQNAgent - Deep Q-network reinforcement learning agent
rlPGAgent - Policy gradient reinforcement learning agent
rlDDPGAgent - Deep deterministic policy gradient reinforcement learning agent
rlTD3Agent - Twin-delayed deep deterministic policy gradient reinforcement learning agent
rlACAgent - Actor-critic reinforcement learning agent
rlPPOAgent - Proximal policy optimization reinforcement learning agent
rlTRPOAgent - Trust region policy optimization reinforcement learning agent
rlSACAgent - Soft actor-critic reinforcement learning agent
rlQAgentOptions - Options for Q-learning agent
rlSARSAAgentOptions - Options for SARSA agent
rlDQNAgentOptions - Options for DQN agent
rlPGAgentOptions - Options for PG agent
rlDDPGAgentOptions - Options for DDPG agent
rlTD3AgentOptions - Options for TD3 agent
rlACAgentOptions - Options for AC agent
rlPPOAgentOptions - Options for PPO agent
rlTRPOAgentOptions - Options for TRPO agent
rlSACAgentOptions - Options for SAC agent
rlAgentInitializationOptions - Options for initializing reinforcement learning agents
rlMBPOAgent - Model-based policy optimization reinforcement learning agent
rlMBPOAgentOptions - Options for MBPO agent
getActor - Get actor from reinforcement learning agent
getCritic - Get critic from reinforcement learning agent
setActor - Set actor of reinforcement learning agent
setCritic - Set critic of reinforcement learning agent
getAction - Obtain action from agent or actor given environment observations
rlReplayMemory - Replay memory experience buffer
append - Append experiences to replay memory buffer
sample - Sample experiences from replay memory buffer
reset - Reset environment, agent, experience buffer, or policy object
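
The replay memory entries in the list above (rlReplayMemory, append, and sample) manage an experience buffer. As a rough, language-neutral illustration of that pattern, the following plain Python sketch stores experiences and draws random batches from them. It is not the toolbox implementation and does not call it: the class name `ReplayBuffer`, its method signatures, the fixed capacity with oldest-first eviction, and uniform sampling without replacement are all assumptions made for this example.

```python
import random
from collections import deque


class ReplayBuffer:
    """Hypothetical fixed-capacity experience buffer (illustration only)."""

    def __init__(self, max_length, seed=None):
        # A deque with maxlen silently drops the oldest item once it is full.
        self._data = deque(maxlen=max_length)
        self._rng = random.Random(seed)

    def __len__(self):
        return len(self._data)

    def append(self, observation, action, reward, next_observation, is_done):
        """Store one experience tuple."""
        self._data.append((observation, action, reward, next_observation, is_done))

    def sample(self, batch_size):
        """Return batch_size distinct experiences chosen uniformly at random."""
        if batch_size > len(self._data):
            raise ValueError("not enough experiences stored")
        return self._rng.sample(list(self._data), batch_size)


if __name__ == "__main__":
    buffer = ReplayBuffer(max_length=3, seed=0)
    for t in range(5):
        buffer.append(t, 0, 0.0, t + 1, False)
    print(len(buffer))        # only the 3 most recent experiences remain
    print(buffer.sample(2))
```

Off-policy agents typically learn from batches drawn this way, so that consecutive, highly correlated experiences are not the only data used for each update.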


Agent Basics

Agent Types

Custom Agents