Error using rl.env.SimulinkEnvWithAgent>localHandleSimoutErrors (line 689) (Reinforcement Learning Toolbox)
I want to create a multi-discrete actor output: delta1 outputs 1 or 0, and delta2 does the same. But I get the following errors:
Error using rl.env.AbstractEnv/simWithPolicy (line 70)
An error occurred while simulating "quarter_car" with the agent "agent".
Error using rl.env.SimulinkEnvWithAgent>localHandleSimoutErrors (line 689)
Invalid input argument type or size such as observation, reward, isdone or loggedSignals.
Error using rl.env.SimulinkEnvWithAgent>localHandleSimoutErrors (line 689)
Unable to evaluate representation.
Error using rl.env.SimulinkEnvWithAgent>localHandleSimoutErrors (line 689)
The logical indices contain a true value outside of the array bounds.
I don't understand whether the error is caused by my code or by the Simulink model, nor how to fix it.

% create observation info
observationInfo = rlNumericSpec([numObs 1],'LowerLimit',-inf*ones(numObs,1),'UpperLimit',inf*ones(numObs,1));
observationInfo.Name = 'observation';
% create action Info
actionInfo = rlFiniteSetSpec({[0;0],[1;1]});
actionInfo.Name = 'actor';
% define environment
env = rlSimulinkEnv(mdl,agentblk,observationInfo,actionInfo);
rng(0)
actorNetwork = [
    imageInputLayer([numObs 1 1],'Normalization','none','Name','observation')
    fullyConnectedLayer(200,'Name','ActorFC1')
    reluLayer('Name','ActorRelu1')
    fullyConnectedLayer(150,'Name','ActorFC2')
    reluLayer('Name','ActorRelu2')
    fullyConnectedLayer(numAct,'Name','ActorFC3') % numAct must equal the number of elements in the finite action set
    tanhLayer('Name','ActorTanh')];
actorOpts = rlRepresentationOptions('LearnRate',1e-3,'GradientThreshold',1);
actor = rlStochasticActorRepresentation(actorNetwork, observationInfo, actionInfo, 'Observation', {'observation'}, actorOpts);
agentOpts = rlPPOAgentOptions(...
'ExperienceHorizon',600,...
'ClipFactor',0.02,...
'EntropyLossWeight',0.01,...
'MiniBatchSize',128,...
'NumEpoch',3,...
'AdvantageEstimateMethod','gae',...
'GAEFactor',0.95,...
'SampleTime',h,...
'DiscountFactor',0.997);
agent = rlPPOAgent(actor,critic,agentOpts);
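As an aside: if the intent is for delta1 and delta2 to be chosen independently, the finite set above should enumerate all four combinations rather than only [0;0] and [1;1]. A minimal sketch, assuming independent binary deltas (this assumption about the intended action space is mine, not stated in the original code):

```matlab
% Enumerate every combination of two binary actions. The actor network's
% final fully connected layer must then output one value per element of
% this set, i.e. numAct = 4.
actionInfo = rlFiniteSetSpec({[0;0],[0;1],[1;0],[1;1]});
actionInfo.Name = 'actor';
```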
Answers (1)
Emmanouil Tzorakoleftherakis on 29 Nov 2020
Edited: Emmanouil Tzorakoleftherakis on 1 Dec 2020
Hello,
Based on the attached files, it seems you are creating a PPO agent but using a Q network for the critic. As documented on this page, the PPO implementation in Reinforcement Learning Toolbox requires a V (state-value) critic. If you change your critic network to be equivalent to, e.g., this example, the errors go away.
Hope that helps
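For reference, a minimal sketch of a V (state-value) critic of the kind PPO expects — the layer sizes and names here are illustrative assumptions, not taken from the attached files:

```matlab
% State-value (V) critic: takes only the observation as input and outputs
% a single scalar value estimate, unlike a Q critic, which also takes the
% action.
criticNetwork = [
    imageInputLayer([numObs 1 1],'Normalization','none','Name','observation')
    fullyConnectedLayer(200,'Name','CriticFC1')
    reluLayer('Name','CriticRelu1')
    fullyConnectedLayer(1,'Name','CriticOut')];
criticOpts = rlRepresentationOptions('LearnRate',1e-3,'GradientThreshold',1);
critic = rlValueRepresentation(criticNetwork, observationInfo, ...
    'Observation', {'observation'}, criticOpts);
```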
4 comments
Hong-Ruei Ciou on 30 Nov 2020
Emmanouil Tzorakoleftherakis on 30 Nov 2020
You don't need to worry about loggedSignals here. I cannot see anything obvious, if you share a reproduction model I can take a look.
Hong-Ruei Ciou on 1 Dec 2020
Emmanouil Tzorakoleftherakis on 1 Dec 2020
Edited my response above.