My DDPG agent generates the same output in every simulation.
I trained an rlDDPGAgent for a biped walking robot and simulated it, but it always produces the same walking gait in every simulation.
However, I need different gaits so I can acquire varied sensor data. The code below creates the agent and training options. I used the msra-walking-robot-master code from the MathWorks GitHub repository.
% Create DDPG agent and training options for walking robot example
%
% Copyright 2019 The MathWorks, Inc.
%% DDPG Agent Options
agentOptions = rlDDPGAgentOptions;
agentOptions.SampleTime = Ts;
agentOptions.DiscountFactor = 0.99;
agentOptions.MiniBatchSize = 128;
agentOptions.ExperienceBufferLength = 1e6;
agentOptions.TargetSmoothFactor = 1e-3;
agentOptions.NoiseOptions.MeanAttractionConstant = 5;
agentOptions.NoiseOptions.Variance = 0.4;
agentOptions.NoiseOptions.VarianceDecayRate = 1e-5;
%% Training Options
trainingOptions = rlTrainingOptions;
trainingOptions.MaxEpisodes = 5000;
trainingOptions.MaxStepsPerEpisode = Tf/Ts;
trainingOptions.ScoreAveragingWindowLength = 100;
trainingOptions.StopTrainingCriteria = 'AverageReward';
trainingOptions.StopTrainingValue = 110;
trainingOptions.SaveAgentCriteria = 'EpisodeReward';
trainingOptions.SaveAgentValue = 150;
trainingOptions.Plots = 'training-progress';
trainingOptions.Verbose = true;
if useParallel
    trainingOptions.Parallelization = 'async';
    trainingOptions.ParallelizationOptions.StepsUntilDataIsSent = 32;
end
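For context, the NoiseOptions above configure Ornstein-Uhlenbeck (OU) exploration noise added to the actions during training. Below is a minimal sketch of my reading of the discrete-time OU update those properties drive (variable names and the Ts value are illustrative, not taken from the model):

```matlab
% Sketch of the OU noise update (assumed form, for illustration only):
% x(k+1) = x(k) + MeanAttractionConstant*(Mean - x(k))*Ts
%          + sqrt(Variance)*randn*sqrt(Ts)
mac = 5;            % MeanAttractionConstant: pull back toward the mean
variance = 0.4;     % Variance: magnitude of the random kicks
decay = 1e-5;       % VarianceDecayRate: exploration shrinks over time
Ts = 0.025;         % assumed sample time for this sketch
x = 0; mu = 0;      % noise state and its long-run mean
for k = 1:100
    x = x + mac*(mu - x)*Ts + sqrt(variance)*randn*sqrt(Ts);
    variance = variance*(1 - decay);   % slow variance decay
end
```

The key point is that this noise is only injected while training; it is not what makes a trained agent vary between simulations.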
The code below is the training code:
% Walking Robot -- DDPG Agent Training Script (2D)
% Copyright 2019 The MathWorks, Inc.
warning off parallel:gpu:device:DeviceLibsNeedsRecompiling %don't show the warning
%% SET UP ENVIRONMENT
% Speedup options
useFastRestart = true;
useGPU = false;
useParallel = true;
% Create the observation info
numObs = 31;
observationInfo = rlNumericSpec([numObs 1]);
observationInfo.Name = 'observations';
% create the action info
numAct = 6;
actionInfo = rlNumericSpec([numAct 1],'LowerLimit',-1,'UpperLimit', 1);
actionInfo.Name = 'foot_torque';
% Environment
mdl = 'walkingRobotRL2D';
load_system(mdl);
blk = [mdl,'/RL Agent'];
env = rlSimulinkEnv(mdl,blk,observationInfo,actionInfo);
env.ResetFcn = @(in)walkerResetFcn(in,upper_leg_length/100,lower_leg_length/100,h/100,'2D');
if ~useFastRestart
    env.UseFastRestart = 'off';
end
%% CREATE NEURAL NETWORKS
createDDPGNetworks;
%% CREATE AND TRAIN AGENT
createDDPGOptions;
agent = rlDDPGAgent(actor,critic,agentOptions);
trainingResults = train(agent,env,trainingOptions)
%% SAVE AGENT
reset(agent); % Clears the experience buffer
curDir = pwd;
saveDir = 'savedAgents';
cd(saveDir)
save(['trainedAgent_2D_' datestr(now,'mm_DD_YYYY_HHMM')],'agent','trainingResults','trainingOptions'); % save whole variables; save cannot take a field reference like 'trainingOptions.MaxEpisodes'
cd(curDir)
The code below is the simulation code:
% Simulates the walking robot model
%% Setup
clc; close all;
robotParametersRL
% Create the observation info
numObs = 31;
observationInfo = rlNumericSpec([numObs 1]);
observationInfo.Name = 'observations';
% create the action info
numAct = 6;
actionInfo = rlNumericSpec([numAct 1],'LowerLimit',-1,'UpperLimit', 1);
actionInfo.Name = 'foot_torque';
% Environment
mdl = 'walkingRobotRL2D';
load_system(mdl);
blk = [mdl,'/RL Agent'];
env = rlSimulinkEnv(mdl,blk,observationInfo,actionInfo);
load trainedAgent_2D_04_25_2024_1541_5000 %load agent
%action = getAction(agent);
simOpts = rlSimulationOptions;
simOpts.MaxSteps = 1000;
simOpts.NumSimulations = 3;
%plot(env);
reset(env);
experience = sim(env,agent,simOpts);
The result of this code is always the same gait. Is there any way to get a different gait from each simulation?
Thank you so much!
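One direction I am considering (a sketch only, not verified): a trained DDPG policy is deterministic, so starting from the same initial state it should always produce the same trajectory. Keeping exploration noise active during simulation, reseeding the RNG, and re-attaching the randomized reset function from training might vary the gait. Note that `UseExplorationPolicy` exists on agents only in newer Reinforcement Learning Toolbox releases, and the `walkerResetFcn` arguments below simply mirror the training script:

```matlab
% Sketch (untested assumptions): ways to vary the gait across simulations
rng('shuffle');                     % reseed the global RNG before each run

% Keep the OU exploration noise active at simulation time
agent.UseExplorationPolicy = true;  % property available in newer releases

% Re-attach the randomized reset used during training, so each
% simulation starts from a different initial pose
env.ResetFcn = @(in)walkerResetFcn(in,upper_leg_length/100, ...
    lower_leg_length/100,h/100,'2D');

experience = sim(env,agent,simOpts);
```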
Accepted Answer
More Answers (1)
Mubashir Rasool
18 May 2024
I am facing the same issue: my DDPG agent is not generating the desired results. I think the main factor in DDPG design is the choice of reward function. Let's wait for a professional response.