Finding Optimal ID, FD and Hidden Nodes for NARXNET

Question

Teck Kong Chong on 21 Jul 2016


Commented: ali aboutorabi on 15 Dec 2020

Hi, After lengthy research, I finally have a better understanding of ID and FD. I have put some code together to find the optimal ID and FD, and then use these ID and FD to find the optimal number of hidden nodes for my NARXNET using the simplenarx dataset. While I am fairly convinced that it is correct, I am not confident that this is the right way of doing things, so I would really appreciate any comments or corrections.

Some additional questions are: 1) Should I use data division such as 60/20/20 in the double for loop? 2) I used the intersect command to find the subset of lags. Is this the correct way of doing it?

My code is below:

if true
  % code
end
close all, clear all, clc, 
tic
plt=0;
[X,T] = simplenarx_dataset; % simplenarx_dataset;
x = cell2mat(X);
t = cell2mat(T);
[ I N ] = size(x); % [ 1 100 ]
[ O N ] = size(t); % [ 1 100 ]
MSE00 = var(t',1) % 0.1021
% Calculate Z-Score for input (x) and target (t)
zx = zscore(x, 1);
zt = zscore(t, 1);
% Plot Input & Output for both original and transformed (Z-scored)
plt = plt+1,figure(plt);
subplot(221)
plot(x)
title('SIMPLENARX INPUT SERIES')
subplot(222)
plot(zx)
title('STANDARDIZED INPUT SERIES')
subplot(223)
plot(t)
title('SIMPLENARX OUTPUT SERIES')
subplot(224)
plot(zt)
title('STANDARDIZED OUTPUT SERIES')
rng('default')
L = floor(0.95*(2*N-1)) % 189
for i = 1:100 % 1: length of the data
  n = zscore(randn(1,N),1);
  autocorrn = nncorr( n,n, N-1, 'biased');
  sortabsautocorrn = sort(abs(autocorrn));
  thresh95(i) = sortabsautocorrn(L);
end
sigthresh95 = mean(thresh95) % 0.1517
minthresh95 = min(thresh95) % 0.1139 
medthresh95 = median(thresh95) % 0.1497
stdthresh95 = std(thresh95) % 0.0234
maxthresh95 = max(thresh95) % 0.2321 
%%CORRELATIONS
%%%%%TARGET AUTOCORRELATION %%%%%%%
% 
autocorrt = nncorr(zt,zt,N-1,'biased');
sigflag95 = -1+ find(abs(autocorrt(N:2*N-1))>=sigthresh95) %significant Feedback Delay (FD) => [0 2 3 4 5 7 9 10 12 67 69]
% 
plt = plt+1, figure(plt);
hold on
plot(0:N-1, -sigthresh95*ones(1,N),'b--')
plot(0:N-1, zeros(1,N),'k')
plot(0:N-1, sigthresh95*ones(1,N),'b--')
plot(0:N-1, autocorrt(N:2*N-1))
plot(sigflag95,autocorrt(N+sigflag95),'ro')
title('SIGNIFICANT TARGET AUTOCORRELATIONS (FD)')
%
%%%%%%INPUT-TARGET CROSSCORRELATION %%%%%%
%
crosscorrxt = nncorr(zx,zt,N-1,'biased');
sigilag95 = -1 + find(abs(crosscorrxt(N:2*N-1))>=sigthresh95) %significant Input Delay (ID) => [0 1 3 4 5 6 8 10 13 17]
% 
plt = plt+1, figure(plt);
hold on
plot(0:N-1, -sigthresh95*ones(1,N),'b--')
plot(0:N-1, zeros(1,N),'k')
plot(0:N-1, sigthresh95*ones(1,N),'b--')
plot(0:N-1, crosscorrxt(N:2*N-1))
plot(sigilag95,crosscorrxt(N+sigilag95),'ro')
title('SIGNIFICANT INPUT-TARGET CROSSCORRELATIONS (ID)')
%%Using Fixed ID and FD to Find Optimal Number of Hidden Node
%
subset_ID_FD = intersect(sigflag95, sigilag95)
Opti_ID_FD = max(subset_ID_FD);
Ntrn = N-2*round(0.15*N) % default 0.7/0.15/0.15 trn/val/tst ratios
trnind = 1:Ntrn;
Ttrn = T(trnind);
Ntrneq = prod(size(Ttrn)) % Product of element
%ID = 1:2 %default for Prediction
ID = 1:Opti_ID_FD; % 0:2 % Regression (default)
FD = 1:Opti_ID_FD; % 1:2 % default (default)
NID = length(ID); % 10
NFD = length(FD); % 10
LDB = max([ID,FD]) % Length of the delay buffer = 10
Hub = floor((Ntrneq-O)/(NFD*O+O+1)) % 5
Hmax = Hub; % 2 is sufficient to get R2=0.999
dH =1;
Hmin = 1;
Ntrials = 10;
%
trainFcn = 'trainbr'
%
rng('default')
j=0
for h = Hmin:dH:Hmax
  j=j+1
  if h==0
      neto = narxnet(ID,FD,[],'open',trainFcn);
      Nw = (NID*I+NFD*O+1)*h+(h+1)*O;
  else
      neto = narxnet(ID,FD,h);
      Nw = (NID*I+NFD*O+1)*h+(h+1)*O;
  end
  Ndof = Ntrneq-Nw % Ndof <=0 for H >= 4
  neto.divideFcn = 'dividetrain'; % No data division
  neto.performFcn = 'mse';
  [Xo Xoi Aoi To ] = preparets(neto,X,{},T);
  to = cell2mat(To);
  MSE00o = var(to,1)
  MSE00oa = var(to,0)
  MSEgoal = 0.005*max(Ndof,0)*MSE00oa/Ntrneq
  MinGrad = MSEgoal/100
  neto.trainParam.goal = MSEgoal;
  neto.trainParam.min_grad = MinGrad;
%   
  for i= 1:Ntrials
      % Save state of RNG for duplication
      s(i) = rng; 
      neto = configure(neto,Xo,To);
      [neto tro Yo Eo Xof Aof ] = train(neto,Xo, To, Xoi, Aoi);
      % Eo = gsubtract(To,Yo);
      % stopcrit{i,j} = tro.stop;
      R2o(i,j) = 1 - mse(Eo)/MSE00o;
  end
end
result = [ (Hmin:dH:Hmax); R2o ]
% stopcrit = stopcrit;
elapsedtime = toc % 194.8385
%
% result =
% 
%     1.0000    2.0000    3.0000    4.0000    5.0000
%     0.9966    0.9996    0.9999    1.0000    1.0000
%     0.9968    0.9990    0.9998    1.0000    1.0000
%     0.9967    0.9990    0.9998    1.0000    1.0000
%     0.9967    0.9991    0.9998    1.0000    1.0000
%     0.9968    0.9999    0.9999    1.0000    1.0000
%     0.9974    0.9990    1.0000    1.0000    1.0000
%     0.9970    0.9985    0.9998    1.0000    1.0000
%     0.9976    0.9987    0.9999    1.0000    1.0000
%     0.9986    0.9990    0.9998    1.0000    1.0000
%     0.9967    0.9986    0.9999    1.0000    1.0000

Many thanks! Teck

3 comments

Greg Heath on 23 Jul 2016


With DIVIDEBLOCK indexing

 min( ival ) > max( itrn )
 min( itst ) > max( ival )

However, the most important constraint is

min( itst ) > max( itrn )

Therefore, you could have something like

 itrn  = 1 : 2 : Ntrn + Nval - 1 ;
 ival  = itrn + 1 ;
 itst1 = Ntrn + Nval + 1 : N -1 ; 
 itst2 = itst1 + 1 ;
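
As a rough illustration of the constraint above, the following sketch builds interleaved training and validation indices followed by a trailing test block, then checks that every test index comes after every training index. The sizes (N = 100, Ntst = 15) and the name Ndes are assumptions made for the example; this is not the exact split from this thread.

```matlab
% Assumed sizes, for illustration only
N    = 100;                 % total number of time steps
Ntst = 15;                  % size of the trailing test block
Ndes = N - Ntst;            % points available for design (training + validation)

itrn = 1:2:Ndes;            % odd indices  -> training
ival = 2:2:Ndes;            % even indices -> validation
itst = Ndes+1:N;            % final block  -> test

% The key constraint: the test data must lie strictly after the training data
assert(min(itst) > max(itrn))

% Every index is used exactly once
assert(isequal(sort([itrn ival itst]), 1:N))
```

Index vectors like these can be handed to the toolbox by setting net.divideFcn = 'divideind' and filling the trainInd, valInd and testInd fields of net.divideParam.
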
Hope this helps.

Greg



Answer 1

Greg Heath on 23 Jul 2016


> Hi, After lengthy research, I finally have a better understanding of ID and FD. I have put some code together to find the optimal ID and FD, and then use these ID and FD to find the optimal number of hidden nodes for my NARXNET using the simplenarx dataset. While I am fairly convinced that it is correct, I am not confident that this is the right way of doing things, so I would really appreciate any comments or corrections.

> Some additional questions are:

> 1) Should I use data division such as 60/20/20 in the double for loop?

 You have used TRAINBR, for which Nval = 0 and performFcn = msereg.
 HOWEVER, you have imposed performFcn = mse and DIVIDETRAIN, for which Ntst = 0.
 VERY CONFUSING!

> 2) I used the intersect command to find the subset of lags. Is this the correct way of doing it?

 NO.

 GEH1 = 'WHAT IS THE FOLLOWING COMMAND FOR?'

if true
  % code
end

 GEH2 = 'USE ONLY TRAINING DATA TO DETERMINE DELAYS' 
 GEH3 = 'REMAINING POST STRICTLY VALID ONLY FOR I = O = 1 '
       ' MODIFICATIONS NEEDED FOR MULTIVARIABLE DATA'
 GEH4 = 'CANNOT USE FD = 0 WILL GET ERROR'

> Using Fixed ID and FD to Find Optimal Number of Hidden Node

subset_ID_FD = intersect(sigflag95, sigilag95)

GEH5 = '0 3 4 5 10'

Opti_ID_FD = max(subset_ID_FD);

GEH6 = 'Opti_ID_FD = 10 NOT NECESSARILY OPTIMAL!'

Ntrn = N-2*round(0.15*N) % default 0.7/0.15/0.15 trn/val/tst ratios

GEH7 = 'ABOVE NOT VALID FOR TRAINBR 0.85/0/0.15'

%ID = 1:2 %default for Prediction
ID = 1:Opti_ID_FD; % 0:2 % Regression (default)

GEH8 = 'ZERO DELAY IS NOT A MATLAB DEFAULT'

Hub = floor((Ntrneq-O)/(NFD*O+O+1)) % 5

 GEH9 = 'Hub = (Ntrneq-O)/(NID*I+NFD*O+1)= 3.29'
 Hmax = Hub; % 2 is sufficient to get R2=0.999
dH =1;
Hmin = 1;
Ntrials = 10;
%
trainFcn = 'trainbr'
%
rng('default')
j=0
for h = Hmin:dH:Hmax
  j=j+1
  if h==0
      neto = narxnet(ID,FD,[],'open',trainFcn);
      Nw = (NID*I+NFD*O+1)*h+(h+1)*O;
 GEH10 = 'Nw = (NID+1)*O'

neto.divideFcn = 'dividetrain'; % No data division

GEH11 = 'NEED NONTRAINING DATA FOR UNBIASED PREDICTION !!'

neto.performFcn = 'mse';

 GEH12 = 'TRAINBR USES MSEREG WITH NO VAL SET'
 [Xo Xoi Aoi To ] = preparets(neto,X,{},T);
 to = cell2mat(To);
 MSE00o = var(to,1)
 MSE00oa = var(to,0)
 MSEgoal = 0.005*max(Ndof,0)*MSE00oa/Ntrneq
 GEH13 = 'ONLY USE TRAINING DATA TO COMPUTE TRAINING PARAMETERS'
 GEH14 = 'SUGGEST LOOKING AT TRAINING RECORD tro'
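
One way to sanity-check an upper bound on the number of hidden nodes is to count weights directly with the Nw expression from the question's code and keep the largest h for which Ntrneq - Nw stays non-negative. This is only a sketch: the values (I = O = 1, NID = NFD = 10, Ntrneq = 70) are read off the post, and hCand is an illustrative name that does not appear in the original code.

```matlab
% Values taken from the post (single input, single output)
I = 1; O = 1;               % input and output dimensions
NID = 10; NFD = 10;         % number of input and feedback delays
Ntrneq = 70;                % number of training equations

hCand = 1:10;                                        % candidate hidden-layer sizes
Nw    = (NID*I + NFD*O + 1).*hCand + (hCand + 1)*O;  % weight count, as in the question's code
Ndof  = Ntrneq - Nw;                                 % remaining degrees of freedom

Hmax = max(hCand(Ndof >= 0))                         % largest h with no more weights than equations
```

With these numbers the search gives Hmax = 3, which agrees with the question's own note that Ndof is no longer positive for H >= 4.
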

Hope this helps.

*Thank you for formally accepting my answer*

Greg

3 comments

Greg Heath on 8 Aug 2016


GEH0: Some of your equations are only valid for one-dimensional signals

GEH1: Use Ntrn (not N) to estimate significant lags

GEH2: imax is NOT the length of the data. It is the number of repetitions to use for estimating summary statistics.

GEH3: In general, sigthresh95 = medthresh95 is a better choice than the mean of thresh95.

GEH4: What you did to try to estimate the optimal set of significant lags makes absolutely no sense to me.

GEH5: Hub = 3.3182, Hmax = floor(Hub)

GEH6: Why didn't you use Hmin = 0?

GEH7: Why in the world did you use TRAINBR instead of the default TRAINLM???

GEH8: Initialize the RNG before the outer (H) loop

GEH9: Although you did not use h = 0, note that

H = 0 ==> Nw = (NID*I + NFD*O + 1)*O

GEH10: Use tro to obtain the performance of the test subset.

% I am using trainbr and divideblock. One thing that I am unsure about is that
% I have to redefine neto again as below for h > 0

GEH11: Look again: you did not define neto for h = 0 !!! However, if you want to use the same neto, all you have to do is change some of the properties.

% I also noticed that my 'plotinerrcorr' has many elements above the threshold even though, from my point of view, ID = FD = 10 and hidden nodes = 3 gives me the best result. Do I have to be concerned about this?

GEH12: You have not defined a threshold for the input/error cross correlation

GEH13: I did not understand your method; however, I am quite sure that it did not "optimize" the ID/FD combination.
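
Remarks GEH1 to GEH3 above can be sketched roughly as follows: estimate the 95% significance threshold from Ntrn white-noise samples, summarize it with the median over a fixed number of repetitions, and keep only positive lags as feedback delays. This is a sketch under stated assumptions, not the method from the thread: it uses conv in place of nncorr so that it runs in base MATLAB (assuming the two agree for standardized series), the AR(1) series is made-up stand-in data, and Nrep, ttrn, acf and sigFD are illustrative names.

```matlab
rng('default')                      % reproducible random numbers
Ntrn = 70;                          % assumed number of training points
Nrep = 100;                         % repetitions for the summary statistic, not the data length
L    = floor(0.95*(2*Ntrn-1));      % index of the 95th percentile among 2*Ntrn-1 lags

thresh95 = zeros(1, Nrep);
for k = 1:Nrep
    n = randn(1, Ntrn);
    n = (n - mean(n)) / std(n, 1);  % standardize with the 1/N standard deviation
    c = conv(n, fliplr(n)) / Ntrn;  % biased autocorrelation, lags -(Ntrn-1):(Ntrn-1)
    s = sort(abs(c));
    thresh95(k) = s(L);
end
sigthresh95 = median(thresh95);     % median rather than mean

% Toy AR(1) series standing in for the training targets (hypothetical data)
ttrn = filter(1, [1 -0.8], randn(1, Ntrn));
zt   = (ttrn - mean(ttrn)) / std(ttrn, 1);
ct   = conv(zt, fliplr(zt)) / Ntrn;
acf  = ct(Ntrn:end);                % autocorrelation at lags 0:Ntrn-1

lags  = 0:Ntrn-1;
sigFD = lags(abs(acf) >= sigthresh95);
sigFD = sigFD(sigFD > 0)            % drop lag 0: a feedback delay must be positive
```

The same pattern applies to the input-target cross-correlation, where lag 0 may be kept as an input delay if the application allows it.
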

Your method did not determine "the" optimal combination ID, FD.

Since you used TRAINBR, I do not know exactly what you optimized. That goal is typically a weighted linear combination of SSE and SSW. You will have to back out the weights from your result.
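
For orientation, a regularized objective of this kind is commonly written as F = beta*SSE + alpha*SSW. The toy calculation below only shows how the two terms combine; the error and weight vectors are made up, and alpha and beta are plain variables chosen for the example, not values read from a trained network.

```matlab
e = [0.10 -0.20 0.05];      % made-up output errors
w = [0.5 -1.0 0.25 2.0];    % made-up network weights

alpha = 0.01;               % assumed weight on the sum of squared weights
beta  = 1;                  % assumed weight on the sum of squared errors

SSE = sum(e.^2);            % sum of squared errors
SSW = sum(w.^2);            % sum of squared weights
F   = beta*SSE + alpha*SSW  % combined objective
```

A larger alpha relative to beta pushes the solution toward smaller weights at the expense of fit, which is why a plain MSE-based R2 does not fully describe what was minimized.
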

Try it & I will try to match your result.

Hope this helps.

Greg

P.S. Why would you use a non default value for the trainFcn???