Reinforcement Learning algorithm to tune PID parameters of a system

Bipratip on 1 Sep 2023
  • Hello, I am trying to tune a PID controller for a second-order system model using reinforcement learning, following the MATLAB example "Tune PI Controller Using Reinforcement Learning" ( https://it.mathworks.com/help/reinforcement-learning/ug/tune-pi-controller-using-td3.html ).
  • I created the system model with a PID controller, "System_PID.slx", and simulated it to get a satisfactory result.
  • I also created the system model with an RL Agent block, "rl_PID_Tune_mod.slx".
  • I am running the code for 100 iterations.
  • The agent loaded in the workspace after training also gives satisfactory performance when it is used as a standalone controller.
  • However, when I extract Kp, Ki, and Kd from the trained agent, the proportional gain Kp is very small, nearly zero. As a result, the extracted gains do not give a satisfactory result when used in the PID controller.
  • Because of this, the system becomes unstable.
  • Where is the problem in the configuration of the code given below?
  • Any help is highly appreciated.
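% Simulate the trained agent in the environment (maxsteps must be defined beforehand)
simOpts = rlSimulationOptions('MaxSteps',maxsteps);
experiences = sim(env,agent,simOpts);
% Read the learnable weights of the actor network
actor = getActor(agent);
parameters = getLearnableParameters(actor);
% The index-to-gain mapping follows the observation order, assumed here to be
% [integral of error; error; derivative of error]
Ki = abs(parameters{1}(1))
Kp = abs(parameters{1}(2))
Kd = abs(parameters{1}(3))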
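To make the index-to-gain mapping concrete, here is a minimal sketch of how a single-row actor weight would act on the observations. It assumes, as in the linked PI example, that the actor is one linear layer whose absolute weights multiply the observation vector, and that the observations are ordered [integral of error; error; derivative of error]. The weight values and the observation sample are hypothetical and chosen only for illustration.

```matlab
% Hypothetical single-row actor weight, standing in for parameters{1}
w = [-0.20 1.50 0.40];

% Assumed observation order: [integral of error; error; derivative of error]
obs = [0.10; 0.50; -0.20];

% Gains read out in the same order as the observations
Ki = abs(w(1));
Kp = abs(w(2));
Kd = abs(w(3));

% Action produced by the layer (absolute weights times observations)
uActor = abs(w) * obs;

% The same action written as an explicit PID law
uPID = Kp*obs(2) + Ki*obs(1) + Kd*obs(3);

fprintf('actor action = %.4f, PID law = %.4f\n', uActor, uPID);
```

If the observation vector in the Simulink model is wired in a different order, the indices in the extraction code have to change accordingly; otherwise the gains end up assigned to the wrong terms.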

Answers (1)

Sam Chak on 1 Sep 2023
I didn't check everything, but I noticed that you reused the same quadratic cost function (the reward, in RL terms) from the 'rlwatertankPIDTune.slx' example for your second-order plant, G(s) = 1/(s^2 + s). The water-level control system in a tank is a first-order system. In that example, the cost penalizes the squared level error and the squared control effort through two weighting factors, and a PI controller is sufficient.
Since you want to tune a PID controller for a second-order system using RL, it may be more appropriate to define the cost over both states of the plant as well as the control effort, with a weighting factor on each term. One challenge is that you need to estimate the unmeasured state, because only the output of the second-order system is measurable.
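As a rough illustration of such a cost, the sketch below evaluates a quadratic reward for one time step using the tracking error, a finite-difference estimate of its rate, and the control effort. The weights Q and R, the sample time Ts, and the signal values are all hypothetical, and the backward difference is only one simple way to estimate the unmeasured rate; none of these come from the original answer.

```matlab
% Hypothetical weights (assumptions, not taken from the original post)
Q = diag([1 0.1]);   % weights on [error; error rate]
R = 0.01;            % weight on control effort

% Hypothetical samples at two consecutive time steps
Ts    = 0.1;         % sample time in seconds
ePrev = 0.5;         % tracking error at the previous step
e     = 0.4;         % tracking error at the current step
u     = 0.2;         % control signal at the current step

% Backward-difference estimate of the error rate (the unmeasured quantity)
de = (e - ePrev) / Ts;

% Quadratic cost for this step; the RL reward is its negative
x      = [e; de];
cost   = x.' * Q * x + R * u^2;
reward = -cost;

fprintf('step reward = %.4f\n', reward);
```

A finite difference amplifies measurement noise, so in a noisy setting a filtered derivative or a state observer may be a better choice for the rate estimate.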
a = 1; % tf numerator
b = [1 1 0]; % tf denominator
% [A, B, C, D] = tf2ss(a, b)
sys = ss(tf(a, b)) % state-space
sys =

  A =
       x1  x2
   x1  -1   0
   x2   1   0

  B =
       u1
   x1   1
   x2   0

  C =
       x1  x2
   y1   0   1

  D =
       u1
   y1   0

Continuous-time state-space model.
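In this realization the output is y = x2 and dx2/dt = x1, so x1 is the rate of change of the output and is not measured directly. A quick check, sketched here as an addition and not part of the original answer, is whether x1 can be reconstructed from y at all, by computing the rank of the observability matrix from the A and C matrices printed above.

```matlab
% State-space matrices as printed for ss(tf(1, [1 1 0]))
A = [-1 0; 1 0];
C = [0 1];

% Observability matrix of a second-order system: [C; C*A]
Ob = [C; C*A];

% The pair (A, C) is observable if Ob has full rank (equal to the state count)
isObservable = (rank(Ob) == size(A, 1));

fprintf('rank of observability matrix = %d, observable = %d\n', rank(Ob), isObservable);
```

Full rank means the unmeasured state can in principle be estimated from the output, for example with an observer or a filtered derivative.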
In continuous time and in the absence of Gaussian noise, the PID controller gains may be tuned as follows:
Kp = 0.75;
Ki = 0;
Kd = 0.5;
N = 3;
Gpid = pid(Kp, Ki, Kd, 1/N)
Gpid =

                 s
  Kp + Kd * --------
             Tf*s+1

  with Kp = 0.75, Kd = 0.5, Tf = 0.333

Continuous-time PDF controller in parallel form.
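One way to check a candidate set of gains, sketched here under the assumption of a unity negative feedback loop around the plant 1/(s^2 + s), is to close the loop and inspect stability and the step response. The sketch uses the gains from this answer and requires Control System Toolbox; the same check could be repeated with the gains extracted from the trained agent.

```matlab
% Plant and controller from the answer above
G    = tf(1, [1 1 0]);
Gpid = pid(0.75, 0, 0.5, 1/3);

% Closed loop with unity negative feedback (assumed loop structure)
T = feedback(Gpid*G, 1);

% Stability and basic step-response metrics
stable = isstable(T);
info   = stepinfo(T);

fprintf('stable = %d, overshoot = %.1f%%, settling time = %.2f s\n', ...
        stable, info.Overshoot, info.SettlingTime);
```

If the agent-derived gains fail this check while the agent itself performs well, that points to a mismatch between how the gains are read out and how the actor actually uses them.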
