I am new to using MATLAB for solving reinforcement learning problems, and I am trying to follow the example at 'https://uk.mathworks.com/help/reinforcement-learning/ug/train-dqn-agent-to-balance-cart-pole-system.html'. Honestly, I have used Python to solve the cart-pole problem, and I fully understand the structure of a deep Q-network. However, the MATLAB version confuses me entirely, since I don't understand how the layers of the Q-network are arranged.
I know that in deep learning you have the input layer first, followed by the hidden layers, and lastly the output layer. But for the DQN here in MATLAB, what I see is that the states are one input, followed by some hidden layers, and then another input, the action, joins the network after certain hidden layers. I don't understand this architecture, and I would appreciate it if someone could explain it to me clearly. Also, if possible, a simple drawing of the DQN architecture with the state and action would be of great value.
There are various architectures you can use when setting up the Q-network. In the example you mentioned, and in most examples in Reinforcement Learning Toolbox that use a Q-value critic, the state path and the action path are separated. The reason is that you can then architect each path as necessary to extract useful features from its input. For instance, in an example where the state input is an image and the action is a scalar torque, the image path needs to go through convolutional layers to extract features, but that is not necessary for the scalar input. This is why the paths are separated and only merged later in the network.
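To make the layout concrete, here is a minimal sketch of a two-path Q-critic in the spirit of the cart-pole example. The layer sizes and layer names below are placeholders I picked for illustration, and the shipping example may differ between releases (older releases use imageInputLayer instead of featureInputLayer, for instance). The key idea is that the state path and the action path are built separately and then merged with an addition layer, after which the network outputs a single value, the estimated Q(s,a):

% State path: 4-element observation (cart position/velocity, pole angle/velocity)
statePath = [
    featureInputLayer(4,"Name","state")
    fullyConnectedLayer(24,"Name","CriticStateFC1")
    reluLayer("Name","CriticRelu1")
    fullyConnectedLayer(24,"Name","CriticStateFC2")];

% Action path: scalar action, fed in through its own input layer
actionPath = [
    featureInputLayer(1,"Name","action")
    fullyConnectedLayer(24,"Name","CriticActionFC1")];

% Common path: merge the two paths and output a single Q-value
commonPath = [
    additionLayer(2,"Name","add")
    reluLayer("Name","CriticCommonRelu")
    fullyConnectedLayer(1,"Name","output")];

% Assemble the graph and connect both paths into the addition layer
criticNetwork = layerGraph(statePath);
criticNetwork = addLayers(criticNetwork,actionPath);
criticNetwork = addLayers(criticNetwork,commonPath);
criticNetwork = connectLayers(criticNetwork,"CriticStateFC2","add/in1");
criticNetwork = connectLayers(criticNetwork,"CriticActionFC1","add/in2");

So there is still a single output layer; the only difference from a plain feedforward network is that there are two input layers, one for the state and one for the action, whose features are combined partway through the network.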
You can visualize neural networks in two ways:
figure
plot(criticNetwork)
or
deepNetworkDesigner
and then load criticNetwork from the workspace to see an interactive representation of the critic.
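Once you are happy with the architecture, the network is wrapped into a critic object, and that is where you tell the agent which input layer carries the observation and which carries the action. A hedged sketch for recent releases (older releases use rlQValueRepresentation instead), assuming env is your cart-pole environment:

% Get the observation and action specifications from the environment
obsInfo = getObservationInfo(env);
actInfo = getActionInfo(env);

% Map the "state" and "action" input layers to the observation and action
critic = rlQValueFunction(criticNetwork,obsInfo,actInfo, ...
    "ObservationInputNames","state","ActionInputNames","action");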