NARX closed loop prediction: starting value problem

Hello everyone,
I am using NARX to do a multistep prediction on several time series. After training it in open loop, the net is converted to closed loop for prediction (default workflow).
Many times, the error of the first predicted values is big (see picture below). This gets better when I strongly increase the number of prediction steps. For me this indicates a reasonable generalisation ability after the training procedure, and this is why I am focusing on this "starting value" aspect instead of further improving the delays, the size of the hidden layer, etc.
Is there any possibility to "define" a starting value for the closed-loop NARX? I wonder if it is possible to use an overlap of open-loop and closed-loop data to let the narxnet "recognize" the last open-loop timesteps. I have read a lot about data division in this context. Usually I use 'divideblock' to define the training, validation, and testing subsets. The last subset is the test subset, which is of course not used for calibration. In this case, overlapping the open-loop and closed-loop data by just a few timesteps does not make sense, because the net simply cannot recognize values of the testing subset. I tested the possibility of overlapping the data set using 'dividerand', which indeed did improve the errors for the first prediction values, but I would prefer not to use 'dividerand'. I also tried 'divideind' and simply changed the order of the training, validation, and testing set indices like this:
%documentation example
%[trainInd,valInd,testInd] = divideind(3000,1:2000,2001:2500,2501:3000);
[trainInd,valInd,testInd] = divideind(3000,1001:3000,1:500,501:1000);
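For context, this is a minimal sketch of how such custom indices could be wired into a narxnet; the network sizes and the total length of 3000 timesteps are only placeholders, not values from my actual data:
```matlab
% Sketch (illustrative sizes): apply custom divideind indices so that the
% test block sits directly before the training block instead of after it.
net = narxnet(1:2, 1:2, 10);           % input delays, feedback delays, hidden units
net.divideFcn = 'divideind';
net.divideParam.trainInd = 1001:3000;  % train on the later block
net.divideParam.valInd   = 1:500;      % validate on the first block
net.divideParam.testInd  = 501:1000;   % test block now adjacent to the training data
```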
Is this a bad idea? I had some trouble writing the code to simply test this idea, which is why I am asking.
Thanks for sharing your thoughts and ideas with me
Andrew

Accepted Answer

Thomas Schattschneider
Thomas Schattschneider on 21 Mar 2017

0 votes

Hello Andreas,
I am working on a similar problem right now and might have something to help you.
Look at the section "Multistep Closed-Loop Prediction Following Known Sequence" of Multistep Neural Network Prediction. It describes how to get the correct initial input and layer states for your network in order to get the correct following predictions. This could be what you asked for (defining starting values). At least for me, this solved the same problem I had with my NARX time series prediction.
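For reference, the workflow from that documentation section looks roughly like this; `inputSeries` and `targetSeries` are placeholder names for the known data, and `net` is the trained open-loop network:
```matlab
% Sketch of the documented pattern: run the open-loop network over the
% known sequence, then seed the closed-loop network with its final states.
[x, xi, ai, t] = preparets(net, inputSeries, {}, targetSeries);
[y, xf, af] = net(x, xi, ai);              % final input/layer states xf, af
[netc, xic, aic] = closeloop(net, xf, af); % closed loop with correct initial states
yPred = netc(cell(0, 5), xic, aic);        % e.g. predict 5 steps ahead
```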

3 comments

Hello Thomas,
Thanks! I'll take a look at it.
Hello Thomas,
I have read this page a hundred times before, but now I found something I had somehow missed.
% Test the Network
y = net(x,xi,ai) %this is wrong or at least insufficient
[y,xf,af] = net(x,xi,ai); %this is what you need
I simply forgot to create xf and af. This is what follows:
[netc,xic,aic] = closeloop(net,xf,af); %close the loop
Now it works perfectly. Thank you!
Happy to hear that it works!
I was a little bit confused too, since a different method of prediction is shown in Design Time Series NARX Feedback Neural Networks for multistep prediction. Both methods work for me, but the prediction method with following the known sequence gives way better results.

Sign in to comment.

More Answers (1)

Greg Heath
Greg Heath on 15 Mar 2017

1 vote

1. There is an erroneous shift between the blue and green curves because the delay is not properly accounted for.
2. Low error rates on open-loop training/validation/testing do not automatically carry over when the loop is closed. In particular, I have illustrated examples where closing the loop can yield low training/validation subset errors but fails tragically on the test subset.
3. I have not been able to deal with this directly except to try many designs differing via initial weights. However, even this is not guaranteed to work.
4. None of these cases have received a response from MATLAB.
5. See my tutorials in the NEWSGROUP.
Hope this helps.
Thank you for formally accepting my answer
Greg

5 comments

Hi Greg,
Concerning the designation of all significant delays... I refer to: https://de.mathworks.com/matlabcentral/newsreader/view_thread/346212#948178
Your code contained:
% USE NTRN TO DETERMINE STATISTICALLY SIGNIFICANT
% CORRELATIONS
Nrep = 200 % 100 & 200==>Number of reps to estimate summary stats
% Since signals are independently random only need xcorrn
rng(0)
for i = 1:Nrep
n1 = zscore(randn(1,Ntrn),1);
n2 = zscore(randn(1,Ntrn),1);
xcorrn = nncorr( n1,n2, Ntrn-1, 'biased');
sortabsxcorrn = sort(abs(xcorrn));
% Ltrn is not defined in this excerpt; presumably the index of the
% 95th percentile, e.g. Ltrn = ceil(0.95*numel(sortabsxcorrn))
thresh95(i) = sortabsxcorrn(Ltrn);
end
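To then apply the noise threshold to an actual signal pair, one possible continuation looks like this; the variable names `x` and `t` for the input and target series are my assumptions, and the lag bookkeeping assumes nncorr returns correlations for lags -(Ntrn-1) to (Ntrn-1):
```matlab
% Sketch (assumed continuation): keep only lags whose cross-correlation
% magnitude exceeds the average 95% noise threshold.
sigthresh = mean(thresh95);                  % summary noise threshold
xt = zscore(x(1:Ntrn), 1);                   % training part of the input
tt = zscore(t(1:Ntrn), 1);                   % training part of the target
crosscorrxt = nncorr(xt, tt, Ntrn-1, 'biased');
lags = (1:numel(crosscorrxt)) - Ntrn;        % zero lag at index Ntrn
siglags = lags(abs(crosscorrxt) >= sigthresh);
ID = siglags(siglags >= 0)                   % candidate input delays
```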
Actually I understood fairly well what you did, but I am not sure why. Why do you use random numbers? Is this because your example signals are random, or is this what I have to use for every input signal?
I also have some difficulty understanding what I can do with the error autocorrelation plot and the input-error cross-correlation plot in nntraintool... does this also help me determine delays? Now I refer to: https://de.mathworks.com/matlabcentral/answers/56653-matlab-time-series-tool-how-to-read-ie-cross-correlation
I don't understand how to react if the net does not accurately model all of the salient characteristics. What does that look like, and what do I have to change?
I really appreciate your help, thanks!
Andrew
A statistician (or statistics book) can give you a precise definition. I prefer to use my engineering common sense:
If 95% of Gaussian random noise correlations are below a certain level then I declare that any signal correlations below that level are not significant.
Therefore I declare that the lags at which signal correlations are above that level are significant.
Consequently, if I wish to predict future signal levels I will choose a subset of signal values at lags that occur where the signal correlations are significant.
Sometimes I may include values at nonsignificant lags for artistic reasons. For example, suppose lags 1,3,4,5 are significant. Then, on a whim, I just might include lag = 2. If my previous calculations are correct, I would expect the weights corresponding to lag = 2 to be insignificantly small.
For nontraining (i.e., validation and test) data predictions to be as unbiased as possible, it is preferable that the significant lags are determined from training data only.
Don't forget that these predictions are based on a naïve assumption that the signal statistics are stationary, i.e., the signal means and covariances (includes variances and correlations) are constant!
Hope this helps.
Greg
Hi Greg,
thank you very much for your help!
Now I got it.
Andreas Wunsch
Andreas Wunsch on 21 Mar 2017
Edited: Andreas Wunsch on 21 Mar 2017
Hi Greg,
just one more question. It doesn't make sense to use delays whose values are bigger than the length of the forecasting period, right?
For example:
ID = 0:15
FD = 1:23
to predict 5 timesteps. In this case I can't even fill up my tapped delay line...
Thanks again!
Andrew
Use the smallest delay lengths that give you the answer you want (or will accept).
In addition to trying to minimize the number of hidden nodes, you can try to minimize the number of delays.
Since neither is a continuous variable, intelligent trial and error is the only approach that I would use.
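One way to organize that trial and error, as a rough sketch: `inputSeries` and `targetSeries` are assumed names, and the candidate grids below are arbitrary, not a recommendation:
```matlab
% Sketch: score a few hidden-layer sizes and delay lengths by their
% closed-loop MSE and keep the best-performing design.
bestErr = Inf;
for H = [5 10 15]                           % candidate hidden-layer sizes
    for d = [2 4 8]                         % candidate maximum delay
        net = narxnet(1:d, 1:d, H);
        [x, xi, ai, t] = preparets(net, inputSeries, {}, targetSeries);
        net = train(net, x, t, xi, ai);
        netc = closeloop(net);              % evaluate in closed loop
        [xc, xic, aic, tc] = preparets(netc, inputSeries, {}, targetSeries);
        errc = mse(netc, tc, netc(xc, xic, aic));
        if errc < bestErr
            bestErr = errc; bestNet = netc; bestHd = [H d];
        end
    end
end
```
Because the per-trial error also depends on the random initial weights, repeating each (H, d) combination several times before comparing designs is advisable.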

Connectez-vous pour commenter.
