Agents

Create and configure reinforcement learning agents

A reinforcement learning agent receives observations and a reward from the environment, and returns an action to the environment. During training, the agent continuously updates its parameters to improve its policy for the given environment.
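The observe, act, and update cycle described above can be illustrated with a minimal sketch in Python. This is not the toolbox API: `ChainEnv`, `QAgent`, and `train` are hypothetical names introduced only for this example, the five-state corridor environment is invented, and tabular Q-learning is used simply because it is the smallest of the algorithms mentioned on this page.

```python
import random


class ChainEnv:
    """Hypothetical 5-state corridor: start in state 0, reward 1 on reaching state 4."""
    n_states = 5
    n_actions = 2  # 0 = move left, 1 = move right

    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):
        move = 1 if action == 1 else -1
        # Clamp the position so the agent cannot leave the corridor.
        self.state = min(max(self.state + move, 0), self.n_states - 1)
        done = self.state == self.n_states - 1
        reward = 1.0 if done else 0.0
        return self.state, reward, done


class QAgent:
    """Tabular Q-learning agent; its parameters are the entries of the Q-table."""

    def __init__(self, n_states, n_actions, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
        self.q = [[0.0] * n_actions for _ in range(n_states)]
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.rng = random.Random(seed)

    def act(self, obs, greedy=False):
        row = self.q[obs]
        if greedy:
            return row.index(max(row))
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(row))  # explore
        best = max(row)
        # Break ties at random so an untrained agent still moves around.
        return self.rng.choice([a for a, v in enumerate(row) if v == best])

    def update(self, obs, action, reward, next_obs, done):
        # Q-learning target: reward plus discounted best value of the next state.
        target = reward if done else reward + self.gamma * max(self.q[next_obs])
        self.q[obs][action] += self.alpha * (target - self.q[obs][action])


def train(env, agent, episodes=200, max_steps=1000):
    for _ in range(episodes):
        obs = env.reset()
        for _ in range(max_steps):
            action = agent.act(obs)                       # agent returns an action
            next_obs, reward, done = env.step(action)     # environment responds
            agent.update(obs, action, reward, next_obs, done)  # parameters improve
            obs = next_obs
            if done:
                break
    return agent


if __name__ == "__main__":
    env = ChainEnv()
    agent = train(env, QAgent(env.n_states, env.n_actions))
    print([agent.act(s, greedy=True) for s in range(env.n_states - 1)])
```

In this sketch the policy is implicit in the Q-table: each call to `update` changes the table, and therefore changes which action `act` prefers for a given observation.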

Reinforcement Learning Toolbox™ software provides built-in reinforcement learning agents that use several common algorithms, such as Q-learning, DQN, PG, AC, DDPG, TD3, SAC, and PPO. You can also implement your own custom agents.

For an introduction to agents, see Reinforcement Learning Agents. For an introduction to policies, value functions, actors and critics, see Create Policies and Value Functions.

Apps

Reinforcement Learning Designer: Design, train, and simulate reinforcement learning agents (Since R2021a)

Blocks

RL Agent: Reinforcement learning agent (Since R2019a)

Functions

rlQAgent: Q-learning reinforcement learning agent (Since R2019a)
rlSARSAAgent: SARSA reinforcement learning agent (Since R2019a)
rlDQNAgent: Deep Q-network (DQN) reinforcement learning agent (Since R2019a)
rlPGAgent: Policy gradient (PG) reinforcement learning agent (Since R2019a)
rlACAgent: Actor-critic (AC) reinforcement learning agent (Since R2019a)
rlPPOAgent: Proximal policy optimization (PPO) reinforcement learning agent (Since R2019b)
rlTRPOAgent: Trust region policy optimization (TRPO) reinforcement learning agent (Since R2021b)
rlDDPGAgent: Deep deterministic policy gradient (DDPG) reinforcement learning agent (Since R2019a)
rlTD3Agent: Twin-delayed deep deterministic (TD3) policy gradient reinforcement learning agent (Since R2020a)
rlSACAgent: Soft actor-critic (SAC) reinforcement learning agent (Since R2020b)
rlQAgentOptions: Options for Q-learning agent (Since R2019a)
rlSARSAAgentOptions: Options for SARSA agent (Since R2019a)
rlDQNAgentOptions: Options for DQN agent (Since R2019a)
rlPGAgentOptions: Options for PG agent (Since R2019a)
rlACAgentOptions: Options for AC agent (Since R2019a)
rlPPOAgentOptions: Options for PPO agent (Since R2019b)
rlTRPOAgentOptions: Options for TRPO agent (Since R2021b)
rlDDPGAgentOptions: Options for DDPG agent (Since R2019a)
rlTD3AgentOptions: Options for TD3 agent (Since R2020a)
rlSACAgentOptions: Options for SAC agent (Since R2020b)
rlAgentInitializationOptions: Options for initializing reinforcement learning agents (Since R2020b)
rlConservativeQLearningOptions: Regularizer options object to train DQN and SAC agents (Since R2023a)
rlBehaviorCloningRegularizerOptions: Regularizer options object to train DDPG, TD3 and SAC agents (Since R2023a)
rlMBPOAgent: Model-based policy optimization (MBPO) reinforcement learning agent (Since R2022a)
rlMBPOAgentOptions: Options for MBPO agent (Since R2022a)
getActor: Extract actor from reinforcement learning agent (Since R2019a)
getCritic: Extract critic from reinforcement learning agent (Since R2019a)
setActor: Set actor of reinforcement learning agent (Since R2019a)
setCritic: Set critic of reinforcement learning agent (Since R2019a)
getAction: Obtain action from agent, actor, or policy object given environment observations (Since R2020a)
rlReplayMemory: Replay memory experience buffer (Since R2022a)
rlPrioritizedReplayMemory: Replay memory experience buffer with prioritized sampling (Since R2022b)
rlHindsightReplayMemory: Hindsight replay memory experience buffer (Since R2023a)
rlHindsightPrioritizedReplayMemory: Hindsight replay memory experience buffer with prioritized sampling (Since R2023a)
append: Append experiences to replay memory buffer (Since R2022a)
sample: Sample experiences from replay memory buffer (Since R2022a)
resize: Resize replay memory experience buffer (Since R2022b)
allExperiences: Return all experiences in replay memory buffer (Since R2022b)
validateExperience: Validate experiences for replay memory (Since R2023a)
generateHindsightExperiences: Generate hindsight experiences from hindsight experience replay buffer (Since R2023a)
getActionInfo: Obtain action data specifications from reinforcement learning environment, agent, or experience buffer (Since R2019a)
getObservationInfo: Obtain observation data specifications from reinforcement learning environment, agent, or experience buffer (Since R2019a)
reset: Reset environment, agent, experience buffer, or policy object (Since R2022a)

Topics

Agent Basics

Agent Types

Custom Agents