Hi,
I am using the Matlab Reinforcement Learning toolbox to train an rlQAgent.
The issue I am facing is that the corresponding Q-table, i.e., the output of the command getLearnableParameters(getCritic(qAgent)), is reset each time the train command is used.
Is it possible to avoid this reset, so as to continue training a previously trained agent?
Thank you
Corrado

Accepted Answer

Emmanouil Tzorakoleftherakis
Edited: Emmanouil Tzorakoleftherakis on 20 May 2020

0 votes

If you stop training, you should be able to continue from where you left off. I called 'train' on the basic grid world example a couple of times in a row, and the output of 'getLearnableParameters(getCritic(qAgent))' was different each time, so training did continue from the previous values. You can always save the trained agent and reload it as well, to make sure you don't accidentally lose it.
Update:
There is a regularization term added to the loss, which causes the other entries to change slightly. To avoid this, you can type:
qRepresentation.Options.L2RegularizationFactor=0;
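For reference, a minimal sketch of how the regularization setting and the save/reload suggestion fit together with the setup used in the scripts below (the file name trainedQAgent.mat is just an illustrative choice):

```matlab
% Sketch: disable L2 regularization on the critic representation before
% constructing the agent, so Q-table entries that a training step does
% not visit are left untouched instead of being slightly decayed.
env = rlPredefinedEnv("BasicGridWorld");
qTable = rlTable(getObservationInfo(env),getActionInfo(env));
qTable.Table = randn(size(qTable.Table));           % custom initial Q-table
qRepresentation = rlQValueRepresentation(qTable, ...
    getObservationInfo(env),getActionInfo(env));
qRepresentation.Options.L2RegularizationFactor = 0; % no weight decay on the Q-table
qAgent = rlQAgent(qRepresentation,rlQAgentOptions);

% To keep the learned values between MATLAB sessions, save and reload the agent:
save("trainedQAgent.mat","qAgent");
% ... later, before calling train again ...
load("trainedQAgent.mat","qAgent");
```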

5 comments

Corrado Possieri
Corrado Possieri on 20 May 2020
Edited: Corrado Possieri on 20 May 2020
I am actually trying to set the initial Q-table for the agent.
If I run the code
env = rlPredefinedEnv("BasicGridWorld");
qTable = rlTable(getObservationInfo(env),getActionInfo(env));
qRepresentation = rlQValueRepresentation(qTable,getObservationInfo(env),getActionInfo(env));
agentOpts = rlQAgentOptions;
agentOpts.DiscountFactor = 1;
qAgent = rlQAgent(qRepresentation,agentOpts);
trainOpts = rlTrainingOptions;
trainOpts.Plots = 'none';
trainOpts.MaxEpisodes = 1;
trainOpts.MaxStepsPerEpisode = 1;
trainOpts.Verbose = 1;
QTable0 = getLearnableParameters(getCritic(qAgent));
train(qAgent,env,trainOpts);
QTable1 = getLearnableParameters(getCritic(qAgent));
train(qAgent,env,trainOpts);
QTable2 = getLearnableParameters(getCritic(qAgent));
disp(find(QTable0{1} ~= QTable1{1}))
disp(find(QTable1{1} ~= QTable2{1}))
I get what I expect, that is, only one entry (and then two entries) of the QTable are changed.
However, if I try to force the initial value of the QTable
env = rlPredefinedEnv("BasicGridWorld");
qTable = rlTable(getObservationInfo(env),getActionInfo(env));
qTable.Table = randn(size(qTable.Table));
qRepresentation = rlQValueRepresentation(qTable,getObservationInfo(env),getActionInfo(env));
agentOpts = rlQAgentOptions;
agentOpts.DiscountFactor = 1;
qAgent = rlQAgent(qRepresentation,agentOpts);
trainOpts = rlTrainingOptions;
trainOpts.Plots = 'none';
trainOpts.MaxEpisodes = 1;
trainOpts.MaxStepsPerEpisode = 1;
trainOpts.Verbose = 1;
QTable0 = getLearnableParameters(getCritic(qAgent));
train(qAgent,env,trainOpts);
QTable1 = getLearnableParameters(getCritic(qAgent));
train(qAgent,env,trainOpts);
QTable2 = getLearnableParameters(getCritic(qAgent));
disp(find(QTable0{1} ~= QTable1{1}))
disp(find(QTable1{1} ~= QTable2{1}))
all its entries are perturbed, as if the QTable were somehow reinitialized.
Emmanouil Tzorakoleftherakis on 20 May 2020
Maybe I am missing something, but it looks like the two scripts posted are exactly the same.
The difference is that in the second script the QTable is initialized randomly by the following additional line:
qTable.Table = randn(size(qTable.Table));
If you run the two scripts, you will see that in the first, just one entry of the QTable is modified by the training algorithm, whereas in the second the whole QTable is changed by just a single step of the training algorithm.
Emmanouil Tzorakoleftherakis on 20 May 2020
Updated my answer above with a solution - hope that helps.
Corrado Possieri
Corrado Possieri on 20 May 2020
Thank you Emmanouil, this solved the issue.


More Answers (0)

