<DRL_PPO> error: Number of input layers for deep neural network must equal to number of observation specifications.

Hi guys,
I have been using the Reinforcement Learning Toolbox recently, and I am using the PPO algorithm to train an agent in a custom environment. I then encountered an error; my code and debug log are below. I sincerely hope someone can help me, and I would be very grateful.
The debug log is as follows:
Error using rl.internal.validate.mapFunctionObservationInput
Number of input layers for deep neural network must equal to number of observation specifications.
Error in rlValueFunction (line 92)
modelInputMap = rl.internal.validate.mapFunctionObservationInput(model,observationInfo,nameValueArgs.ObservationInputNames);
Error in ppo0320 (line 53)
critic= rlValueFunction(criticdlnet1,obsInfo);
My code is as follows:
slx = 'RLcontrolstrategy0312';
open_system(slx);
agentblk = slx +"/agent";
%obsInfo actInfo(The problem might be here, right?)
numObs=49;
obsInfo=rlNumericSpec([49,1],'LowerLimit',0, 'UpperLimit',1);
numAct=6;
actInfo = rlNumericSpec([6,1], 'LowerLimit',[0 0 0 -1 -1 -1]','UpperLimit',[1 1 1 1 1 1]');
scale = [0.5 0.5 0.5 1 1 1]';
bias = [0.5 0.5 0.5 0 0 0]';
env = rlSimulinkEnv(slx,agentblk,obsInfo,actInfo);
Ts = 0.001;
Tf = 4;
rng(0)
%critic
cnet = [
featureInputLayer(9,"Normalization","none","Name","name1")
fullyConnectedLayer(256,"Name","fc1")
concatenationLayer(1,2,"Name","concat")
tanhLayer("Name","tanh1")
fullyConnectedLayer(256,"Name","fc2")
tanhLayer("Name","tanh2")
fullyConnectedLayer(128,"Name","fc3")
tanhLayer("Name","tanh3")
fullyConnectedLayer(64,"Name","fc4")
tanhLayer("Name","tanh4")
fullyConnectedLayer(64,"Name","fc5")
tanhLayer("Name","tanh5")
fullyConnectedLayer(1,"Name","CriticOutput")];
cnetMC=[
featureInputLayer(40,"Normalization","none","Name","name2")
fullyConnectedLayer(512,"Name","fc11")
tanhLayer("Name","tanh13")
fullyConnectedLayer(128,"Name","fc14")
tanhLayer("Name","tanh14")
fullyConnectedLayer(64,"Name","fc15")];
criticNetwork = layerGraph(cnet);
criticNetwork = addLayers(criticNetwork, cnetMC);
criticNetwork = connectLayers(criticNetwork,"fc15","concat/in2");
criticdlnet = dlnetwork(criticNetwork,'Initialize',false);
criticdlnet1 = initialize(criticdlnet);
%(The problem might be here, right?)
critic= rlValueFunction(criticdlnet1,obsInfo);
%actor
anet = [
featureInputLayer(9,"Normalization","none","Name","name1")
fullyConnectedLayer(256,"Name","fc1")
concatenationLayer(1,2,"Name","concat")
tanhLayer("Name","tanh1")
fullyConnectedLayer(256,"Name","fc2")
tanhLayer("Name","tanh2")
fullyConnectedLayer(128,"Name","fc3")
tanhLayer("Name","tanh3")
fullyConnectedLayer(64,"Name","fc4")
tanhLayer("Name","tanh4")];
anetMC=[
featureInputLayer(40,"Normalization","none","Name","name2")
fullyConnectedLayer(512,"Name","fc11")
tanhLayer("Name","tanh13")
fullyConnectedLayer(128,"Name","fc14")
tanhLayer("Name","tanh14")
fullyConnectedLayer(64,"Name","fc15")];
meanPath = [
fullyConnectedLayer(64,"Name","meanFC")
tanhLayer("Name","tanh5")
fullyConnectedLayer(numAct,"Name","mean")
tanhLayer("Name","tanh6")
scalingLayer(Name="meanPathOut",Scale=scale,Bias=bias)];
stdPath = [
fullyConnectedLayer(64,"Name","stdFC")
tanhLayer("Name","tanh7")
fullyConnectedLayer(numAct,"Name","fc5")
softplusLayer("Name","std")];
actorNetwork = layerGraph(anet);
actorNetwork = addLayers(actorNetwork,anetMC);
actorNetwork = connectLayers(actorNetwork,"fc15","concat/in2");
actorNetwork = addLayers(actorNetwork,meanPath);
actorNetwork = addLayers(actorNetwork,stdPath);
actorNetwork = connectLayers(actorNetwork,"tanh4","meanFC/in");
actorNetwork = connectLayers(actorNetwork,"tanh4","stdFC/in");
actordlnet = dlnetwork(actorNetwork);
%(The problem might be here, right?)
actor = rlContinuousGaussianActor(actordlnet,obsInfo,actInfo, ...
"ActionMeanOutputNames","meanPathOut", ...
"ActionStandardDeviationOutputNames","std");
% Agent hyperparameters
agentOptions=rlPPOAgentOptions("SampleTime",Ts,"DiscountFactor",0.995,"ExperienceHorizon",1024,"MiniBatchSize",512,"ClipFactor",0.2, ...
"EntropyLossWeight",0.01,"NumEpoch",8,"AdvantageEstimateMethod","gae","GAEFactor",0.98, ...
"NormalizedAdvantageMethod","current");
agentOptions.ActorOptimizerOptions=rlOptimizerOptions("LearnRate",0.0001,"GradientThreshold",1, ...
"L2RegularizationFactor",0.0004,"Algorithm","adam");
agentOptions.CriticOptimizerOptions=rlOptimizerOptions("LearnRate",0.0001,"GradientThreshold",1, ...
"L2RegularizationFactor",0.0004,"Algorithm","adam");
%Creating an Agent
agent=rlPPOAgent(actor,critic,agentOptions);
%training
trainOptions=rlTrainingOptions("StopOnError","on", "MaxEpisodes",2000,"MaxStepsPerEpisode",floor(Tf/Ts), ...
"ScoreAveragingWindowLength",10,"StopTrainingCriteria","AverageReward", ...
"StopTrainingValue",100000,"SaveAgentCriteria","None", ...
"SaveAgentDirectory","D:\car\jianmo\zhangxiang\agent","Verbose",false, ...
"Plots","training-progress");
doTraining = true;
if doTraining
trainingStats = train(agent,env,trainOptions);
else
load('agent.mat','agent')
end

Answers (1)

Shivansh on 5 Apr 2024
Hi,
The error message you're encountering, "Number of input layers for deep neural network must equal to number of observation specifications," suggests that there's a mismatch between the number of input layers defined in your neural networks (actor and critic) and the number of observations specified for your environment.
It looks like you have defined the observation space (obsInfo) as a single channel with dimension [49,1], so the environment provides observations as one 49-element vector. However, the critic and actor networks each contain two input layers (featureInputLayer branches "name1" with 9 units and "name2" with 40 units), so rlValueFunction cannot map a single observation specification onto two network inputs.
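You can confirm the mismatch with a quick check (a minimal sketch, assuming the variables from your script are still in the workspace):
criticdlnet1.InputNames   % e.g. {'name1','name2'}  -> two input layers
numel(obsInfo)            % e.g. 1                  -> one observation specification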
You need to make sure that the number of input layers in your networks equals the number of observation channels in obsInfo, and that each input layer's size matches the corresponding channel. Either give each network a single 49-unit input layer (if you process the observations as a whole), or split obsInfo into two specifications that match your 9-unit and 40-unit input branches, as sketched below.
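For illustration, here is a minimal sketch of both options (layer names and sizes beyond those in your script are only examples, and option 2 also assumes the Simulink model outputs two separate observation signals):
% Option 1: a single 49-element observation channel and a single input layer.
obsInfo = rlNumericSpec([49 1],'LowerLimit',0,'UpperLimit',1);
cnet = [
    featureInputLayer(49,"Normalization","none","Name","obsIn")
    fullyConnectedLayer(256,"Name","fc1")
    tanhLayer("Name","tanh1")
    fullyConnectedLayer(1,"Name","CriticOutput")];
criticdlnet = dlnetwork(layerGraph(cnet));
critic = rlValueFunction(criticdlnet,obsInfo,"ObservationInputNames","obsIn");
% Option 2: keep the two input branches ("name1" with 9 units, "name2" with
% 40 units) and define two observation channels that match them.
obsInfo = [rlNumericSpec([9 1],'LowerLimit',0,'UpperLimit',1), ...
           rlNumericSpec([40 1],'LowerLimit',0,'UpperLimit',1)];
critic = rlValueFunction(criticdlnet1,obsInfo, ...
    "ObservationInputNames",["name1","name2"]);
With option 2, the actor needs the same mapping (pass "ObservationInputNames",["name1","name2"] to rlContinuousGaussianActor as well), and the observation port of the RL Agent block in your Simulink model must receive two signals matching the 9-element and 40-element specifications.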
For more information, you can refer to the MATLAB documentation for the Reinforcement Learning Toolbox.
I hope it helps!
