Solving critic overestimation and exploring a specific action range


dani ansari on 26 Sep 2023


https://fr.mathworks.com/matlabcentral/answers/2025927-solve-critic-overestimate-and-how-to-explore-specific-action-range

Hello,

I'm using a DDPG agent to tune a robot controller. All of my rewards are negative. My critic learning rate is 0.01 and my actor learning rate is 0.0001, both with the Adam optimizer, and my gradient thresholds are 1. I have two questions:

1- When my action range is [0.00001 0.2], my Q0 also predicts a negative value (although with a large bias relative to the actual value), but when my action range is [0.00001 0.5], my critic overestimates heavily and predicts large positive values. Why does this happen with a larger action range?

2- I define my action range as [0.00001 0.5], but I know my best action is somewhere in [0.1 0.2] most of the time. How should I define my actor so that it explores this range more? Is this related to the noise options? How should I define the Ornstein-Uhlenbeck noise options to explore this area?
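
For anyone trying to reproduce the setup, the configuration described above could be written roughly as follows with Reinforcement Learning Toolbox. This is only a sketch, not the actual code behind the question: the learning rates, optimizer, gradient thresholds, and action limits come from the post, while the scalar action size and every noise value are assumptions chosen for illustration.

```matlab
% Sketch only: values come from the question unless marked as ASSUMPTION.

% Action specification with the wider range from the question.
% A scalar action is an ASSUMPTION; the post does not say how many
% controller parameters are tuned.
actInfo = rlNumericSpec([1 1], 'LowerLimit', 1e-5, 'UpperLimit', 0.5);

% Optimizer settings stated in the question (Adam, gradient threshold 1).
criticOpts = rlOptimizerOptions('Algorithm', 'adam', ...
    'LearnRate', 1e-2, 'GradientThreshold', 1);
actorOpts = rlOptimizerOptions('Algorithm', 'adam', ...
    'LearnRate', 1e-4, 'GradientThreshold', 1);

agentOpts = rlDDPGAgentOptions( ...
    'CriticOptimizerOptions', criticOpts, ...
    'ActorOptimizerOptions', actorOpts);

% Ornstein-Uhlenbeck noise properties that question 2 refers to.
% All numeric values below are ASSUMPTIONS for illustration, not tuned.
agentOpts.NoiseOptions.Mean = 0;                          % long-run mean of the noise
agentOpts.NoiseOptions.MeanAttractionConstant = 0.15;     % pull back toward Mean
agentOpts.NoiseOptions.StandardDeviation = 0.05;          % exploration width
agentOpts.NoiseOptions.StandardDeviationDecayRate = 1e-5; % shrink per training step
agentOpts.NoiseOptions.StandardDeviationMin = 0.005;      % floor for the decay
```

During training the noise is added to the actor output, so the size of StandardDeviation relative to the 0.5-wide action range is what determines how far exploration moves away from the current policy.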