Implementing initial weights and significant feedback delays in a NARNET

Peta on 21 Apr 2015
Commented: Greg Heath on 13 Oct 2015
Hi. I'm trying to understand how to find training strategies for NARNETs that give the best possible predictions. What I want to create is a script that I can feed any time series, regardless of what it looks like, and that then finds the best training design for it. This is the code I have at the moment:
T = simplenar_dataset;   % example time series
N = length(T);           % length of time series
MaxHidden = 10;          % number of hidden nodes that will be tested
% Attempt to determine significant feedback delays with autocorrelation
autocorrT = nncorr( zscore(cell2mat(T),1), zscore(cell2mat(T),1), N-1 );
[ sigacorr inda ] = find( abs( autocorrT(N+1:end) > 0.21 ) )
for hidden = 1:MaxHidden
    parfor feedbackdelays = 1:length(inda)
        FD = inda(feedbackdelays);
        net = narnet( 1:FD, hidden );
        [ Xs, Xsi, Asi, Ts ] = preparets( net, {}, {}, T );
        ts = cell2mat( Ts );
        net.divideFcn = 'divideblock';   % divide the data into contiguous blocks
        net.trainParam.min_grad = 1e-15;
        net.trainParam.epochs = 10000;
        rng( 'default' )
        [ net tr Ys Es Af Xf ] = train( net, Xs, Ts, Xsi, Asi );
        NMSEs = mse( Es ) / var( ts, 1 )  % normalized mean squared error
        performanceDivideBlockNMSEs(hidden,feedbackdelays) = NMSEs;
    end
end
First off: is this the correct way of implementing the statistically significant feedback delays?
Also, if the net.divideFcn = 'divideblock' line is left uncommented, as in the code above, I get an error inside the loop saying "Attempted to access valInd(0); index must be a positive integer or logical.", and I'm not sure what is causing it.
And I've heard people say that you should "try different initial weights". How do I do that? Is it the rng command I need to change?
The idea is then to find the indices of the best performing net in the performanceDivideBlockNMSEs matrix, so I can retrain a closed-loop net with those settings and make predictions, but for now I'm just focusing on the open-loop net.
Thanks
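For reference, a minimal sketch of one common way to "try different initial weights": re-seed the random generator and recreate the net on each trial, so that training starts from different random weights each time. The trial count, the example delays/hidden nodes and the keep-the-best bookkeeping below are illustrative assumptions, not part of the original script.
Ntrials = 10;                            % number of random initializations (assumed)
bestNMSE = Inf;
for i = 1:Ntrials
    rng(i)                               % different seed => different initial weights
    net = narnet( 1:2, 10 );             % example delays / hidden nodes
    [ Xs, Xsi, Asi, Ts ] = preparets( net, {}, {}, T );
    ts = cell2mat( Ts );
    [ net, tr, Ys, Es ] = train( net, Xs, Ts, Xsi, Asi );  % weights drawn at train time
    NMSE = mse( Es ) / var( ts, 1 );
    if NMSE < bestNMSE
        bestNMSE = NMSE; bestnet = net;  % keep the best of the random starts
    end
end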

Accepted Answer

Greg Heath on 22 Apr 2015
1. Unfortunately, the form of NNCORR that you are using is BUGGY!
PROOF:
a. plot(-(N-1):N-1, autocorrT)
b. minmax(autocorrT) = [ -2.3082 1.0134 ]
c. sigacorr = ones(1,41)
2. BETTER SOLUTION: Use the Fourier Method
za = zscore(a,1); zb = zscore(b,1); % a,b are double (i.e., not cells)
A = fft(za); B = fft(zb);
CSDab = A.*conj(B); % Cross Spectral Density
crosscorrFab = ifft(CSDab); % F => Fourier method
crosscorrFba = conj(crosscorrFab);
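As a quick sanity check of the Fourier method on the narnet target itself (the autocorrelation case, a == b), a minimal sketch might look like this. Dividing by N makes the lag-0 value equal to 1, so the 0.21 threshold from the original question can be reused; the variable names are assumptions, and the result is the circular autocorrelation, so only lags well below N are meaningful.
t  = cell2mat( T );     % T = simplenar_dataset, a row double
N  = length( t );
zt = zscore( t, 1 );
A  = fft( zt );
autocorrF = real( ifft( A .* conj( A ) ) ) / N;  % circular autocorrelation, lag 0 first
% indices of the subvector below correspond directly to lags 1, 2, ...
siglags = find( abs( autocorrF( 2:round(N/2) ) ) > 0.21 )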
3. You might wish to compare this with the NNCORR documentation options
help nncorr
doc nncorr
% The optional FLAG determines how nncorr normalizes correlations.
% 'biased' - scales the raw cross-correlation by 1/N.
% 'unbiased' - scales the raw correlation by 1/(N-abs(k)), where k
% is the index into the result.
% 'coeff' - normalizes the sequence so that the correlations at
% zero lag are identically 1.0.
% 'none' - no scaling (this is the default).
crosscorrBab = nncorr( za, zb, N-1, 'biased' ); % B ==> "b"iased
crosscorrNab = nncorr( za, zb, N-1, 'none' )/N; % N ==> "n"one
crosscorrUab = nncorr( za, zb, N-1, 'unbiased' ); % U ==> "u"nbiased
crosscorrMab = nncorr( za, zb, N-1 ); % M ==> "m"issing flag
% crosscorrCab = nncorr( za, zb, N-1, 'coeff' ); ERROR: BUG
You should find that B & N are equivalent; similarly for U & M.
Therefore, there are really only 2 NNCORR options to consider: Biased and Unbiased.
It is instructive to overlay the plot combinations F&B, F&U, B&U. Most notable is that for lags greater than ~N/2 the three are, in general, quite different. Although the differences are much less for lags < N/2, I recommend using the Fourier method or one of the correlation functions from other toolboxes.
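For the autocorrelation case (a == b), one way to produce those overlays, assuming crosscorrFab has been divided by N so that its lag-0 value is 1, and that nncorr returns lags -(N-1):N-1 as in the plot above:
lags = 0:N-1;                                    % nonnegative lags only
plot( lags, crosscorrFab,          'k',  ...     % Fourier method
      lags, crosscorrBab( N:end ), 'b--', ...    % nncorr 'biased'
      lags, crosscorrUab( N:end ), 'r:'  )       % nncorr 'unbiased'
legend( 'Fourier', 'biased', 'unbiased' )
xlabel( 'lag' ), ylabel( 'autocorrelation' )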
4. Once thresholding yields the "significant" lags, use as few lags and hidden nodes as possible to avoid "overfitting". Performance on non-training data tends to decrease as the ratio of number of unknown parameters to number of training equations increases.
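As a rough illustration of that ratio for a single-hidden-layer narnet (the delay set, hidden-node count and 70 % training fraction below are assumed example values):
FD = [ 1 2 ];                   % candidate feedback delays (example)
H  = 10;                        % hidden nodes
O  = 1;                         % narnet has a single output
Ntrneq = 0.7*( N - max(FD) )*O; % training equations (default 70 % split)
Nw = ( numel(FD) + 1 )*H + ( H + 1 )*O;   % unknown weights and biases
ratio = Nw / Ntrneq             % keep this well below 1 to limit overfitting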
Hope this helps.
Thank you for formally accepting my answer
Greg
  9 comments
Peta on 27 Apr 2015
Oh my, I don't seem to be getting a lot of things right with this correlation business! Thanks for your patience, though.
%GEH1: a. NN series are rows b. Use UC for cells and LC for doubles
I take it you are suggesting I change the row/column orientation like so: X = randn(1,N); ? But I don't understand what you mean by the abbreviations UC and LC.
%GEH4: Are you sure xcorr yields 2*N-1 lags?
According to the documentation, "r = xcorr(x) returns the autocorrelation sequence of x", and what I'm getting has length 2*N-1, so it seems reasonable.
% GEH8: Why in the world are you messing around with crosscorrF when you are lucky enough to have Rxx ??
I'm using crosscorrF from the Fourier method because you suggested it as a good solution, but assuming you are implying that it's unnecessary when I have access to xcorr, I've now tried rewriting it using only xcorr:
X=randn(1,N); %Random series
[Rxx,lags] =xcorr(X,'coeff'); %Autocorrelation function of X
RandomCorrelationMatrix(:,1)=abs(Rxx); %Making it into absolute values
RandomCorrelationMatrix(:,2)=lags; %Turn into matrix for easier sorting
RandomCorrelationMatrix=sortrows(RandomCorrelationMatrix,1); %Sort matrix
[RXxx,lagss] =xcorr(cell2mat(T),'coeff');%same procedure but with NN series
NNCorrelationMatrix(:,1)=abs(RXxx); %Correlation values as absolute
NNCorrelationMatrix(:,2)=lagss;
CL95=RandomCorrelationMatrix(floor(0.95*(2*N-1))); % 95% confidence level
NNCorrelationMatrix(NNCorrelationMatrix(:,1)<CL95)=nan; %nan instead of [] to keep indexing
Keepers = ~isnan( NNCorrelationMatrix ); %create logic index for values to keep (that isn’t nan)
FD=NNCorrelationMatrix(Keepers(:,1),2); %Feedback delays to be tested in NN
The correlation values are now always handled as absolute values, but should the lag values also be absolute or can they be negative? I’ve noticed that it is possible to give the net negative feedback lags without it complaining, but the performance after training seems to be horrible.
% GEH9: If you are only dealing with narnet autocorrelations why use the notation crosscorr ?
I have yet to reach a mental stage of enlightenment where I fully appreciate the difference between an autocorrelation and a crosscorrelation. Hopefully, with time, I will.
% GEH11: Think? Why don't you read the help documentation and why don't you look at a sample tabulation and plot??
I have; sorry for using the word "think" in vain. The correlation values have a maximum value of 1 at zero lag.
%GEH16: This makes no sense at all. Of course the values are nonintegers. Absolute values of nonzero lag correlations are never more than unity. What in the world are you trying to round?
%GEH17: The net inputs are a subset of the significant integer lags, not correlation values.
Yes, that does make a whole lot more sense than what I was trying before; I was confused and was trying to use the correlation values as lags. Thanks for clearing that up.
%GEH18: No. 1. The best of multiple designs is chosen by tr.best_vperf ! 2. The UNBIASED estimate of net performance on unseen data is obtained from the value of tr.best_tperf obtained from the net chosen in 1.
I don't understand the reasoning here: if the unbiased estimate of net performance is obtained from tr.best_tperf, why not choose the best of multiple designs directly from that parameter instead of from tr.best_vperf?
Greg Heath on 13 Oct 2015
If the net with the best tr.best_tperf is chosen, that value is not an unbiased estimate of performance on unseen data (You saw the performance value before you chose the net!).


More Answers (1)

Greg Heath on 29 Apr 2015
%GEH1: a. NN series are rows b. Use UC for cells and LC for doubles
UC/LC: Upper Case for cells, Lower Case for doubles, e.g. x = cell2mat(X), t = cell2mat(T)
Don't use X for noise. n = rand(1,N);
PROBLEM: the NN Toolbox is based on ROW variables, whereas XCORR appears to operate on columns.
[ Rnnp, nlags ] = xcorr( n', 'coeff' )   % p for "prime"
[ Rttp, tlags ] = xcorr( t', 'coeff' )
Rnn = Rnnp'; Rtt = Rttp';
absRnn is a ROW which is sorted using sort, NOT sortrows, to obtain CL95
absRtt is a ROW which is thresholded to obtain the significant lags
The function FIND can be used to obtain the significant lags.
t Autocorrelation feedback lags are positive ( narnet, narxnet)
x/t Crosscorrelation input lags are nonnegative (timedelaynet, narxnet)
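Putting the points above together, a minimal sketch might look like the following; it keeps everything as rows, uses the randn noise series and zscore scaling from earlier snippets in this thread, and the variable names are illustrative:
t  = cell2mat( T );                     % target as a ROW double
N  = length( t );
zt = zscore( t, 1 );                    % zero-mean, unit-variance row
zn = randn( 1, N );                     % zero-mean noise reference series
Rnn = xcorr( zn', 'coeff' );            % xcorr operates on columns ...
Rnn = Rnn(:).';                         % ... so force the result back to a ROW
[ Rtt, tlags ] = xcorr( zt', 'coeff' );
Rtt = Rtt(:).';  tlags = tlags(:).';
absRnn = sort( abs( Rnn ) );            % sort (not sortrows) the noise row
CL95   = absRnn( floor( 0.95*(2*N-1) ) );        % ~95 % confidence threshold
sigind  = find( abs( Rtt ) > CL95 & tlags > 0 ); % significant POSITIVE lags only
siglags = tlags( sigind )               % candidate narnet feedback delays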
1. The best of multiple designs is chosen by tr.best_vperf !
2. The UNBIASED estimate of net performance on unseen data is obtained from the value of tr.best_tperf obtained from the net chosen in 1.
% I don’t understand the reasoning here: if the unbiased estimate of net performance is acquired from tr.best_tperf, why not chose the best of multiple designs directly from that parameter instead of tr.best_vperf?
If the best net is chosen based on tperf, then the estimate for unseen data is biased.
To choose nets for deployment, I typically:
1. Choose all nets with nontraining val and test performance exceeding a threshold.
2. Choose the ones with the smallest number of hidden nodes.
3. If the val and test performances are significantly worse than the training performance,
continue training with ALL of the data using dividetrain.
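A minimal sketch of that selection logic, assuming Hmin:dH:Hmax candidate hidden-node counts, Ntrials random initializations per candidate, and a Siglags vector of significant lags (all names illustrative): pick the design by tr.best_vperf, then quote its tr.best_tperf as the unbiased estimate.
bestvperf = Inf;
for h = Hmin:dH:Hmax                      % candidate numbers of hidden nodes
    for i = 1:Ntrials                     % candidate random initializations
        rng(i)
        net = narnet( Siglags(1:m), h );
        net.divideFcn = 'divideblock';
        [ Xs, Xsi, Asi, Ts ] = preparets( net, {}, {}, T );
        [ net, tr ] = train( net, Xs, Ts, Xsi, Asi );
        if tr.best_vperf < bestvperf      % select on VALIDATION performance
            bestvperf = tr.best_vperf;
            besttperf = tr.best_tperf;    % unbiased estimate for unseen data
            bestnet   = net;
        end
    end
end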
Hope this helps.
Greg
  4 comments
Peta on 30 Apr 2015
N = ? number of sig lags = ? Explain "loop through".
Since the documentation example configures the feedback delays as net = narnet(1:2,10); I was under the impression that the lags were always supposed to be written in that manner (1 to the number of lags). So by "looping through" I literally meant:
Siglags=[2 4 6]; % if these are the significant lags you have
%First try defining the net as this in a loop:
net = narnet(1:Siglags(1), HiddenLayerSize);
%then try training with:
net = narnet(1:Siglags(2), HiddenLayerSize);
%then lastly:
net = narnet(1:Siglags(3), HiddenLayerSize);
But judging by your subtle skepticism, this clearly isn't how it's supposed to be done? Is the whole vector of Siglags supposed to be fed into the net all at once, like net = narnet(Siglags, HiddenLayerSize); ?
When you say you take the first m siglags and try to minimize m, what do you mean by m?
And regarding the hidden nodes, when you say Hmin:dH:Hmax <= Hub, what is your definition of the variable “Hub”?
Greg Heath on 1 May 2015
Edited: Greg Heath on 1 May 2015
1. Incorrect. Use Siglags(1:m).
1:Siglags(m) can include intermediate lags that are not significant.
2. I choose m by nonsystematic trial and error.
3. Again, to avoid overfitting, I try to minimize m.
4. Did it ever occur to you to search the NEWSGROUP or ANSWERS using "greg Hub"?
Greg
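To illustrate point 1 with the example lags from the earlier comment (Siglags = [2 4 6], HiddenLayerSize as before):
Siglags = [ 2 4 6 ];
m = 2;
net = narnet( Siglags(1:m), HiddenLayerSize );   % delays [2 4]: only the significant lags
% NOT: net = narnet( 1:Siglags(m), HiddenLayerSize );  % delays 1:4 would include
%      the intermediate lags 1 and 3, which were not significant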
