How to update the episode number?
I am writing code for a reinforcement learning problem.
The goal is for a simple pendulum to learn to throw a ball at a target point.
The figure below shows how training is progressing.
I think something is wrong with the episode reward.
Is this because the episodes are not being updated, that is, the observations are not being updated?
Or is there some other cause?
![](https://www.mathworks.com/matlabcentral/answers/uploaded_files/707987/image.png)
Below is the code that updates the observations.
function [Observation,Reward,IsDone,LoggedSignals] = step(this,Action)
    LoggedSignals = [];
    Force = getForce(this,Action); % torque
    theta = this.State(1); % pendulum angle
    w = this.State(2); % pendulum angular velocity
    IsDone = false;
    R = 0;

    % Pendulum dynamics (Euler method)
    q2 = w - (this.g/this.L) *theta*this.Ts- this.b * this.Ts-Force*this.Ts; % angular velocity
    q1 = theta + w * this.Ts; % angle

    % Ball dynamics
    ball_x = this.L * sin(q1); % initial x position of the ball
    ball_y = -this.L * cos(q1); % initial y position of the ball
    ball_time = sqrt(2*abs(ball_y)/9.8); % time for the ball to land
    ball_reach = ball_x +abs(q2).*ball_time; % horizontal flight distance of the ball
    ball_gosa = ball_reach-this.Target; % error between flight distance and target point
    q3 = ball_gosa;

    % Reward condition
    % If the difference between the target point and the flight distance is 1 or less, a reward is given.
    if 0 < q3 && q3 < 1
        IsDone = true;
        R = this.RewardForStrike;
    else
        R = this.RewardForNotFalling;
    end

    Observation = [q1 q2 q3 Force]'; % observation vector
    this.State = Observation;
    this.IsDone = IsDone;
    Reward = getReward(this,R);
    notifyEnvUpdated(this);
end
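One way to check whether the state and observations really change from step to step is to run the same update equations in a plain script, outside the environment class. The following is only a minimal sketch: every numeric value (g, L, b, Ts, Target, the constant Force, and the initial state) is an assumption chosen for illustration and does not come from the posted code. The names history, hitPosted, and hitAbs are also made up for this sketch. It additionally evaluates two versions of the strike test, because the comment in the step function describes a difference of 1 or less while the condition itself only accepts 0 < q3 < 1.

```matlab
% Standalone check of the posted update equations, outside the RL environment.
% All numeric values below are assumptions for illustration only.
g      = 9.8;    % gravitational acceleration [m/s^2]
L      = 1.0;    % pendulum length [m] (assumed)
b      = 0.1;    % damping term, used as in the posted code (assumed value)
Ts     = 0.01;   % sample time [s] (assumed)
Target = 0.5;    % target distance [m] (assumed)
Force  = 1.0;    % constant input standing in for getForce(this,Action) (assumed)

state    = [0.1; 0];               % [theta; w], assumed initial condition
numSteps = 5;
history   = zeros(3, numSteps);    % rows: q1, q2, q3 (hypothetical log)
hitPosted = false(1, numSteps);    % strike test exactly as posted
hitAbs    = false(1, numSteps);    % strike test matching the comment

for k = 1:numSteps
    theta = state(1);
    w     = state(2);

    % Same Euler update as in the posted step function
    q2 = w - (g/L)*theta*Ts - b*Ts - Force*Ts;
    q1 = theta + w*Ts;

    % Same ball model as in the posted step function
    ball_x     = L*sin(q1);
    ball_y     = -L*cos(q1);
    ball_time  = sqrt(2*abs(ball_y)/9.8);
    ball_reach = ball_x + abs(q2).*ball_time;
    q3         = ball_reach - Target;

    history(:,k) = [q1; q2; q3];
    hitPosted(k) = (0 < q3 && q3 < 1);   % condition in the posted code
    hitAbs(k)    = (abs(q3) <= 1);       % "difference is 1 or less"

    state = [q1; q2];   % carry the new state into the next step
end

disp(history);
disp([hitPosted; hitAbs]);
```

If the columns of history differ from one another, the update itself is advancing. With these assumed values q3 is negative, so the posted condition never triggers while the absolute-value version does. Whether that matters for the actual training run depends on the real parameters and on which behavior was intended.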