Training 4 TD3 RL agents in Simulink to control buck converters. They need new observations at each episode, initialized from the buck converter outputs. How can they learn continuously from 1 s to 5 s?

I am trying to train 4 TD3 RL agents in a Simulink environment. Each agent is supposed to control the output voltage of a buck converter by sending its action signal to the input of the buck converter (as a reference voltage). To improve the learning process and enhance exploration, I want to set up the environment so that at the beginning of each training episode the agents observe a new set of observations. The issue is that all 9 elements of the observation vectors depend on the output voltages of the buck converters (and therefore on the actions). So I need to initialize the model at the beginning of each training episode by initializing the inputs of the buck converters, and then, once the agents start sampling from the environment, replace the initializing parameters with the agents' actions.

To implement this, I have placed the RL Agent blocks inside Triggered Subsystems and connected their outputs to Switch blocks that alternate between the initializing parameter and the output of the Triggered Subsystems (the action signals). From the beginning of the episode until t = 1 s, the model is driven by the initializing parameter; at t = 1 s the Switch changes state and the Triggered Subsystems are activated.

My question is: how can I modify my code so that the agents learn from t = 1 s until the end of the simulation at t = 5 s (4 seconds of training per episode)?
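One way to give the agents a new operating point each episode is to randomize the initializing parameter inside the environment's ResetFcn. Below is a minimal sketch, assuming a model named buck_rl_model, four RL Agent blocks named Agent1 to Agent4, and a workspace variable Vref_init that the Switch blocks feed to the buck converters during the first second (all of these names are placeholders, not my exact model):

mdl = "buck_rl_model";                        % placeholder model name
agentBlks = mdl + "/Agent" + (1:4);           % assumed paths of the 4 RL Agent blocks

% obsInfo and actInfo are cell arrays with one spec per agent (defined elsewhere)
env = rlSimulinkEnv(mdl, agentBlks, obsInfo, actInfo);

% Give each episode a fresh operating point: randomize the variable that the
% Switch blocks route to the buck converters before t = 1 s.
env.ResetFcn = @(in) setVariable(in, "Vref_init", 10 + 5*rand());   % e.g. 10-15 V (assumption)

With this in place, the Switch blocks still hand control over to the agents at t = 1 s, but every episode starts from a different initial reference.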
@Emmanouil Tzorakoleftherakis, I would greatly appreciate your kind help.
4 comments
Emmanouil Tzorakoleftherakis
Assuming your triggered subsystem is set up properly, the only thing I can think of that's left is to make sure the episode duration/steps in the training options account for the time the RL training is active/inactive.
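A rough sketch of what that could look like, assuming an agent sample time of 1 ms and the 1 s to 5 s active window from the question (the numeric values here are examples, not recommendations):

Ts     = 1e-3;                              % agent sample time (assumption)
Tstart = 1;                                 % triggered subsystems become active here
Tstop  = 5;                                 % end of the simulation

% Only the steps during which the agents actually act (the 4 s window) count
maxSteps = ceil((Tstop - Tstart)/Ts);

trainOpts = rlTrainingOptions( ...
    MaxEpisodes          = 2000, ...        % example value
    MaxStepsPerEpisode   = maxSteps, ...
    StopTrainingCriteria = "AverageReward", ...
    StopTrainingValue    = 480);            % example value

trainResults = train(agents, env, trainOpts);   % agents is the array of 4 TD3 agents

The first second of each simulation still runs, but since the agents only step while the triggered subsystems are active, MaxStepsPerEpisode should reflect the 4 s training window rather than the full 5 s simulation.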
mohsen on 17 June 2024
I will now ensure that the episode duration/steps in the training options are appropriately adjusted to account for the active and inactive periods of the RL training. Appreciate your guidance!


Answers (0)

Version

R2024a
