Why can't my DQN agent output the optimal solution during validation?
Hello everyone,
Topic: Reinforcement Learning, DQN Agent.
I trained an agent on my dataset (28 training data points in total) and then validated it on all of these data points. The problem is that I cannot get optimal results at validation. Some results were good, but not all of them.
- Environment: I wrote a custom environment.
- I create the critic with this function: critic = rlVectorQValueFunction(nn,obsInfo,actInfo);
- From the critic I create a DQN agent: agent = rlDQNAgent(critic);
I also tried a new agent with only 1 data point. Training converged, and validation gave the right answer for that data point. But when I trained an agent on all 28 data points with the same hyperparameters, correctness was not guaranteed. I don't know the reason. Is the dataset too small, or did I choose the wrong hyperparameters?
Hyperparameters of the agent:
agent.AgentOptions.EpsilonGreedyExploration.EpsilonDecay = 0.9;
agent.AgentOptions.EpsilonGreedyExploration.Epsilon = 0.9;
agent.AgentOptions.EpsilonGreedyExploration.EpsilonMin = 0.001;
agent.AgentOptions.DiscountFactor = 0.99;
agent.AgentOptions.MiniBatchSize = 128;
agent.AgentOptions.CriticOptimizerOptions.LearnRate = 0.0008;
agent.AgentOptions.CriticOptimizerOptions.GradientThreshold = 1;
agent.AgentOptions.SaveExperienceBufferWithAgent = true;
Thank you
Kun
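One thing worth checking in the hyperparameters listed above is how quickly exploration shuts off. The sketch below assumes the documented DQN rule, where Epsilon is updated as Epsilon*(1-EpsilonDecay) once per training time step for as long as it is above EpsilonMin. The variable names are illustrative, and the three values are copied from the post.

```matlab
% Sketch: epsilon schedule implied by the settings in the post.
% Assumption: Epsilon = Epsilon*(1 - EpsilonDecay) is applied once per
% training time step while Epsilon > EpsilonMin.
epsilon      = 0.9;    % agent.AgentOptions.EpsilonGreedyExploration.Epsilon
epsilonDecay = 0.9;    % agent.AgentOptions.EpsilonGreedyExploration.EpsilonDecay
epsilonMin   = 0.001;  % agent.AgentOptions.EpsilonGreedyExploration.EpsilonMin

steps = 0;
while epsilon > epsilonMin
    epsilon = epsilon*(1 - epsilonDecay);   % one decay update
    steps = steps + 1;
    fprintf("step %d: epsilon = %.4f\n", steps, epsilon);
end
fprintf("Exploration reaches its floor after %d training steps.\n", steps);
```

Under that assumption, EpsilonDecay = 0.9 removes 90% of epsilon at every step, so the agent acts almost purely greedily after only 3 steps. This may be enough for a single data point but could leave the agent under-explored across 28 of them. A much smaller decay (the documented default is 0.005) would keep exploration active for far longer.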
2 comments
Emmanouil Tzorakoleftherakis
on 13 June 2023
Are you using an IsDone signal? What do you mean by 28 training data? Do you mean 28 episodes? If that's the case, this number is really small. You need to at least give it a few hundred episodes to get an idea of how training progresses.
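To give training a few hundred episodes, as suggested above, the training options might look roughly like the fragment below. It is only a sketch: `agent` and `env` are the objects from the question, and every numeric value is an illustrative assumption rather than a recommendation.

```matlab
% Sketch only: train for several hundred episodes and inspect progress.
% `agent` and `env` come from the question; all numbers are assumptions.
trainOpts = rlTrainingOptions( ...
    MaxEpisodes=500, ...                     % a few hundred episodes
    MaxStepsPerEpisode=200, ...              % hypothetical episode length cap
    ScoreAveragingWindowLength=20, ...       % window for the average reward
    StopTrainingCriteria="EpisodeCount", ... % stop only on the episode count
    StopTrainingValue=500, ...
    Plots="training-progress");              % open the Episode Manager plot

trainingStats = train(agent, env, trainOpts);

% Episode reward over training, to see whether learning has plateaued.
plot(trainingStats.EpisodeIndex, trainingStats.EpisodeReward);
xlabel("Episode"); ylabel("Episode reward");
```

If the reward curve is still rising or very noisy at the end of the run, that would point toward too little training rather than a problem in the environment itself.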

