MATLAB Answers


Reinforcement Learning - Multiple Discrete Actions

I would like to use a DQN agent where I have multiple continuous states (or observations) and two discrete action signals, each with three possible values, for a total of 9 combinations. The following lines illustrate what I mean:
a = [-2,0,2];
b = [-3,0,3];
[A,B] = meshgrid(a,b);
actions = reshape(cat(2,A',B'),[],2);
To create discrete actions, I need to convert the matrix into a cell array and run:
actionInfo = rlFiniteSetSpec(num2cell(actions,2));
actionInfo.Name = 'actions';
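As a sanity check, the enumeration above can be verified directly (a sketch using only the variables already defined in the question):

```matlab
% Enumerate the 9 action pairs and confirm that the cell array
% holds one 1-by-2 vector per combination.
a = [-2,0,2];
b = [-3,0,3];
[A,B] = meshgrid(a,b);
actions = reshape(cat(2,A',B'),[],2);   % 9-by-2: one row per (a,b) pair
assert(isequal(size(actions),[9 2]));
actionCells = num2cell(actions,2);      % 9-by-1 cell of 1-by-2 row vectors
assert(numel(actionCells) == 9);
assert(isequal(size(actionCells{1}),[1 2]));
```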
Additionally, a DQN agent has a critic, which is a deep neural network. I have created the critic as follows:
% Create a DNN for the critic:
hiddenLayerSize = 48;
observationPath = [
imageInputLayer([numObs 1 1],'Normalization','none',...
'Name','observation')
fullyConnectedLayer(hiddenLayerSize,'Name','CriticStateFC1')
reluLayer('Name','CriticReLu1')
fullyConnectedLayer(hiddenLayerSize,'Name','CriticStateFC2')
additionLayer(2,'Name','add')
reluLayer('Name','CriticCommonReLu1')
fullyConnectedLayer(hiddenLayerSize,'Name','CriticCommonFC1')
reluLayer('Name','CriticCommonReLu2')
fullyConnectedLayer(1,'Name','CriticOutput')];
actionPath = [
imageInputLayer([value 1 1],'Normalization','none','Name','action')
fullyConnectedLayer(hiddenLayerSize,'Name','CriticActionFC1')];
% Create the layerGraph:
criticNetwork = layerGraph(observationPath);
criticNetwork = addLayers(criticNetwork,actionPath);
% Connect actionPath to observationPath:
criticNetwork = connectLayers(criticNetwork,'CriticActionFC1','add/in2');
% Specify options for the critic representation:
criticOpts = rlRepresentationOptions('LearnRate',1e-03,...
'GradientThreshold',1,'UseDevice','gpu');
% Create the critic representation using the specified DNN and options:
critic = rlRepresentation(criticNetwork,observationInfo,actionInfo,...
'Observation',{'observation'},'Action',{'action'},criticOpts);
% set the desired options for the agent:
agentOptions = rlDQNAgentOptions(...
'SampleTime',dt,...
'UseDoubleDQN',true,...
'TargetSmoothFactor',1e-3,...
'DiscountFactor',0.99,...
'ExperienceBufferLength',1e7,...
'MiniBatchSize',128);
My problem is the image input layer of the action path, imageInputLayer([value 1 1],'Normalization','none','Name','action'). I have tried values of 1, 2, 9 and 18 for value, but they all result in an error when I run
agent = rlDQNAgent(critic,agentOptions);
This is because actionInfo holds a cell array of 9 elements, each a double vector of dimensions [1,2], whereas the imageInputLayer expects dimensions [value,1,1].
So, how can I set up a DQN agent in MATLAB with two discrete action signals, each with three possible values?
Many thanks in advance for the help!

  2 Comments

Hey,
I am not sure if I should open a new thread for this, but since it is very close to this question, I will try asking here first.
I am trying to use the PG agent with multiple discrete actions, and I have no idea what the last layer of my actor network should look like.
I have [2,62] actions (2 parameters, each with 62 discrete values), and the output layer only accepts a positive integer, not a vector. I have tried 2 for the number of parameters and 124 for the number of possible actions. Both give me the same error:
Error using categorical (line 337)
Could not find unique values in VALUESET using the UNIQUE function.
Error in rl.util.rlLayerRepresentation/buildNetwork (line 719)
categorical(ActionValues, ActionValues);
Error in rl.util.rlLayerRepresentation/setLoss (line 175)
this = buildNetwork(this);
Error in rl.agent.rlPGAgent/setActorRepresentation (line 339)
actor = setLoss(actor,'cte','EntropyLossWeight',opt.EntropyLossWeight);
Error in rl.agent.rlPGAgent (line 47)
this = setActorRepresentation(this,actor,opt);
Error in rlPGAgent (line 21)
Agent = rl.agent.rlPGAgent(varargin{:});
Error in DQN (line 67)
agent = rlPGAgent(actor,baseline,agentOpts);
Caused by:
Error using cell/unique (line 85)
Cell array input must be a cell array of character vectors.
I have attached the file to this Comment.
Sorry, but I have just seen this.
Do you have 62 states and 2 actions? Or 2 states, 62 actions? Or 124 actions?
I would not recommend a large number of actions, as it will cause learning problems.


1 Answer

 Accepted Answer

Hi Enrico,
Try
actionPath = [
imageInputLayer([1 2 1],'Normalization','none','Name','action')
fullyConnectedLayer(hiddenLayerSize,'Name','CriticActionFC1')];
Each action in your code is 1x2, which should be reflected in the dimensions of the actionPath input.
I hope this helps.
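To see why [1 2 1] is the right size, you can check the action specification directly (a sketch; actionInfo is the spec from the question, and Elements is the rlFiniteSetSpec property holding the discrete values):

```matlab
% Each element of the finite action set is a 1-by-2 row vector,
% so the critic's action input layer must be [1 2 1]
% (height, width, channels for an imageInputLayer).
actionElem = actionInfo.Elements{1};
assert(isequal(size(actionElem), [1 2]));
% Hence the action path becomes:
hiddenLayerSize = 48;
actionPath = [
    imageInputLayer([1 2 1],'Normalization','none','Name','action')
    fullyConnectedLayer(hiddenLayerSize,'Name','CriticActionFC1')];
```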

  1 Comment

Hi Emmanouil,
Many thanks for the help!
It works great. I would add that you then need a Reshape block in Simulink, but that is no problem. It is much faster than a mapping with an additional C-coded S-function.
