How to input action in reinforcement learning template environment?
Afficher commentaires plus anciens
I have modified the template environment to adapt my scenarios. My current action cosists of two vectors. The Action configuration is like the following.
function this = EdgeEnvironment()
% Initialize Observation settings
ObservationInfo(1) = rlNumericSpec([1 10]);
ObservationInfo(1).Name = 'schedule';
ObservationInfo(1).Description = 'schedule';
ObservationInfo(2) = rlNumericSpec([1 20]);
ObservationInfo(2).Name = 'ppath';
ObservationInfo(2).Description = 'ppath';
ObservationInfo(3) = rlNumericSpec([1 1]);
ObservationInfo(3).Name = 'completionTime';
ObservationInfo(3).Description = 'completionTime';
ObservationInfo(4) = rlNumericSpec([1 1]);
ObservationInfo(4).Name = 'computeDuring';
ObservationInfo(4).Description = 'computeDuring';
% Initialize Action settings
ActionInfo(1) = rlNumericSpec([1 10]);
ActionInfo(1).Name = 'schedule';
ActionInfo(2) = rlNumericSpec([1 20]);
ActionInfo(2).Name = 'ppath';
% The following line implements built-in functions of RL env
this = this@rl.env.MATLABEnvironment(ObservationInfo, ActionInfo);
end
The step function was designed like the following.
function [Observation,Reward,IsDone,LoggedSignals] = step(this, Action)
LoggedSignals = [];
% distance
node_distance = zeros(this.device_count, this.device_count);
distance = getDistance(this, node_distance);
% parameter list
parameter_list = getstruct(this, distance);
% the parameter list of device
device_list = get_device_list(this);
% Extract action
[schedule_act, ppath_act]=get_act(Action);
% schedule_act = Action{1,1};
% ppath_act = Action{1,2};
% Unpack state vector
last_schedule = schedule_act;
last_ppath = ppath_act;
last_completionTime = this.State{1,3};
last_computeDuring = this.State{1,4};
% Update system states
[schedule, stay_node_list, completionTime] = ComScheduling(last_completionTime,...
last_schedule, last_ppath, device_list, parameter_list);
[ppath, stay_node_list, completionTime, computeDuring] = PathPlanning(last_completionTime,...
last_ppath, schedule, stay_node_list, device_list, parameter_list);
prob = 1 / (1 + exp((completionTime - last_completionTime)/parameter_list.omega));
dice = rand(1);
if dice <= prob
last_ppath = ppath;
last_schedule = schedule;
last_stay_node_list = stay_node_list;
last_completionTime = completionTime;
last_computeDuring = computeDuring;
completionTime_iter(end + 1) = completionTime;
else
completionTimer_iter(end + 1) = last_computeDuring;
end
ppath = last_ppath;
schedule = last_schedule;
stay_node_list = last_stay_node_list;
completionTime = last_completionTime;
computeDuring = last_computeDuring;
Observation = {schedule, ppath, completionTime, computeDuring};
this.State = Observation;
% Check terminal condition
completionTime = Observation(3);
computeDuring = Observation(4);
IsDone = completionTime < this.completionTime_threshold || computeDuring < this.computeDuring_threshold;
this.IsDone = IsDone;
% Get reward
Reward = -completionTime;
end
We caculate the action value by the following function.
function [schedule_act, ppath_act] = get_act(action)
schedule_act = action{1,1};
ppath_act = action{1,2};
end
When I run the validateEnvironment function, the error is like the following.

I want to know how to fix them.
Réponse acceptée
Plus de réponses (0)
Catégories
En savoir plus sur Environments dans Centre d'aide et File Exchange
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!
