Is it possible to change RL action values under certain conditions?
I want my agent to output a target value, but in certain situations (when the reward drops dramatically) I want the agent to look for a better solution by letting it change the target value. I tried using an Initial Condition block so that the target value is used at the start. However, my agent (PPO) always outputs an average value after some training episodes.
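To make the intended behavior concrete, here is a minimal sketch of the gating described above, written in Python rather than as a Simulink model. The function name `select_target`, the `drop_threshold` parameter, and the numeric values are all hypothetical and chosen only for illustration. They are not part of Reinforcement Learning Toolbox.

```python
def select_target(agent_action, recent_rewards, default_target, drop_threshold):
    """Return the fixed target unless the reward has just dropped sharply.

    All names and thresholds here are illustrative assumptions.
    """
    # With fewer than two rewards there is no drop to measure yet,
    # so fall back to the fixed target value.
    if len(recent_rewards) < 2:
        return default_target

    # Size of the drop between the last two reward samples.
    drop = recent_rewards[-2] - recent_rewards[-1]

    # Only when the drop exceeds the threshold is the agent's own
    # action passed through as the new target.
    if drop > drop_threshold:
        return agent_action
    return default_target


# Small reward change: the fixed target is kept.
print(select_target(6.0, [10.0, 9.5], default_target=3.0, drop_threshold=2.0))
# Large reward drop: the agent's action is used instead.
print(select_target(6.0, [10.0, 2.0], default_target=3.0, drop_threshold=2.0))
```

One caveat with this kind of external override: the value the environment receives can differ from the action the agent actually sampled, which may affect what the agent learns.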
5 comments
Emmanouil Tzorakoleftherakis
on 18 May 2021
Can you provide some more information? What do you mean by letting the agent change the target value? Isn't that what happens by default every time the agent takes an action? What is the environment architecture?
black_cat
on 18 May 2021
Emmanouil Tzorakoleftherakis
on 19 May 2021
Thanks. It's still not clear to me what you mean by "However, this results in having an output of 3 since the agent is averaging it during training". If it's best to output a 6, the agent should do so; why would it average the output? Unless you are talking about the average episode reward that you see in the Episode Manager?
Answers (0)