How can I simulate Direct ADP?

Aidin

10 Sep 2021

0 Answers

Updated 10 Sep 2021

I want to simulate the method from the article "On-Line Learning Control by Association and Reinforcement", but I am having trouble obtaining the optimal weights for the critic neural network. The critic network error in this article is [ e_c = J(t) - (J(t-1) - r(t)) ], and at the beginning the critic weights are selected randomly. My question is this: at the first time step there is no J(t-1) yet, and we know that J(t) and r(t) are positive functions. So if we take J(t-1) = 0, then driving e_c to zero makes J(t) converge to -r(t), which is a negative number, and that cannot be right.
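To make the issue concrete, here is a minimal Python/NumPy sketch that reproduces what I am seeing. It is not the article's implementation: the network size, the input vector `x`, the learning rate `lr`, the value `r_now = 1.0`, and the function names are all assumptions I picked for illustration. It trains a small critic by gradient descent on E_c = 0.5 * e_c^2 with J(t-1) fixed at 0, using the error exactly as written above.

```python
import numpy as np

def critic_forward(x, W1, w2):
    """One hidden tanh layer, linear output: returns J and hidden activations."""
    h = np.tanh(W1 @ x)
    return float(w2 @ h), h

def critic_error(J_now, J_prev, r_now):
    """Critic error as written in the question: e_c = J(t) - (J(t-1) - r(t))."""
    return J_now - (J_prev - r_now)

def critic_step(x, J_prev, r_now, W1, w2, lr=0.1):
    """One gradient-descent step on E_c = 0.5 * e_c**2 w.r.t. W1 and w2."""
    J_now, h = critic_forward(x, W1, w2)
    e_c = critic_error(J_now, J_prev, r_now)
    grad_w2 = e_c * h                                # dE_c/dw2
    grad_W1 = e_c * np.outer(w2 * (1.0 - h**2), x)   # dE_c/dW1 (chain rule through tanh)
    return W1 - lr * grad_W1, w2 - lr * grad_w2, e_c

rng = np.random.default_rng(0)
x = np.array([0.3, -0.2, 0.5, 0.1])       # hypothetical state/action input
W1 = rng.uniform(-1.0, 1.0, size=(6, 4))  # random initial critic weights
w2 = rng.uniform(-1.0, 1.0, size=6)

r_now = 1.0    # assumed positive reinforcement signal
J_prev = 0.0   # no J(t-1) exists at the first step, so it is set to 0

for _ in range(2000):
    W1, w2, e_c = critic_step(x, J_prev, r_now, W1, w2)

J_final, _ = critic_forward(x, W1, w2)
print("J(t) after training:", J_final, " target J(t-1) - r(t):", J_prev - r_now)
```

With J(t-1) = 0 the training target for J(t) is J(t-1) - r(t) = -r(t), so in this sketch the critic output settles at a negative value, which is exactly the behavior I am asking about.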