REINFORCE algorithm: unable to compute gradients on latest toolbox version

I have been trying to implement the REINFORCE algorithm using a custom training loop.
The LSTM actor network takes 50 timesteps of three states as input, so a state is of dimension 3x50.
For computing gradients, the input data is in the following format:
num_states x batchsize x N_TIMESTEPS = (3x1)x50x50.
In Reinforcement Learning Toolbox version 1.3, the following line works perfectly:
% actor - the custom actor network, actorLossFunction - custom loss fn, lossData - custom variable
actorGradient = gradient(actor,@actorLossFunction,{reshape(observationBatch,[3 1 50 50])},lossData);
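For illustration, a minimal sketch of how a batch could be packed into that layout (the variable name mirrors the snippet above; the random data is only a hypothetical stand-in for real observations):

```matlab
% Stand-in data: 3 states x 50 timesteps per sequence, 50 sequences in the batch.
observationBatch = rand(3, 50, 50);   % num_states x batchsize x N_TIMESTEPS

% Insert a singleton channel dimension so the representation sees
% num_states x 1 x batchsize x N_TIMESTEPS = 3 x 1 x 50 x 50.
obsInput = reshape(observationBatch, [3 1 50 50]);
size(obsInput)   % 3 1 50 50
```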
However, when I run the same code in the latest RL Toolbox version 2.2, I get the following error:
------------------------------------------------------------------------------------------------------------------------------------------------------
Error using rl.representation.rlAbstractRepresentation/gradient
Unable to compute gradient from representation.
Error in simpleRLTraj (line 184)
actorGradient= gradient(actor,@actorLossFunction,{reshape(observationBatch,[3 1 50 50])},lossData);
Caused by:
Error using extractBinaryBroadcastData
dlarray is supported only for full arrays of data type double, single, or logical, or for full gpuArrays of
these data types.
------------------------------------------------------------------------------------------------------------------------------------------------------
I tried tracing the error back, but it only gets more complicated. Why do I get an error for code that works perfectly on the earlier version of the RL toolbox?

Accepted Answer

Joss Knight
Joss Knight on 5 Apr 2022
Edited: Joss Knight on 5 Apr 2022
What is
underlyingType(observationBatch)
underlyingType(lossData)
?

5 comments

underlyingType(observationBatch)
ans =
'double'
underlyingType(lossData)
ans =
'struct'
---------------------------------------------------------------
These are the underlying types of both variables.
Thanks. Looks okay. Have you double-checked that this example hasn't been updated for MATLAB R2022a? Sometimes examples are updated to take advantage of new features or changes in behaviour, and old versions no longer function correctly.
Thank you for the assistance so far. Yes, the examples are updated. I initially wrote the code in R2020b and it worked perfectly fine.
However, after updating the software to R2022a, the code threw some errors.
Upon checking the latest example, I found that the function handle (@actorLossFunction) can be passed directly to the gradient() function. I made the necessary changes, yet the error still persists.
This is the R2020b version of the same example - R2020b REINFORCE
Can you attach your script so we can help better?
I found the issue. Apparently, the output of the neural network is a cell array, not a double.
As a result, the loss also ended up being a cell array.
A cell array cannot be converted to a dlarray with the dlarray() function, which must be called somewhere internally in the gradient() function.
Example:
dlarray({3})
Error using dlarray
dlarray is supported only for full arrays of data type double, single, or logical, or for full gpuArrays of these data types.
I have resolved the error. Thank you for helping me realize this.
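For anyone hitting the same error, here is a minimal sketch of the kind of fix involved (the function signature and comments are illustrative, not the exact code from my script): unwrap the cell array returned by the network evaluation before using it in the loss.

```matlab
function loss = actorLossFunction(prediction, lossData)
    % In newer toolbox versions the network evaluation returns its output
    % wrapped in a cell array; unwrap it before computing the loss,
    % otherwise the loss itself becomes a cell and dlarray() fails on it.
    if iscell(prediction)
        prediction = prediction{1};
    end
    % ... compute the REINFORCE loss from prediction and lossData ...
end
```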


More Answers (0)

Version

R2022a
