Neural Network help

9 views (last 30 days)
Amjad
Amjad on 1 Apr 2012
Commented: Greg Heath on 13 Feb 2016
Hello everyone, I'm trying to create a neural network that relates voltage (input) to air gap depth (output). No training seems to happen after I use the train command; my network just outputs the target values. I want to fix that. Also, something extra: each value in one input sample corresponds to a frequency value (for example, with V1 = [1 2 3], the value 1 is at 1 GHz, the value 2 at 1.1 GHz, etc.). Is there any way to relate the inputs to the frequencies, maybe as pairs?
The code I used:
D = [4 8 12 16 20 24 28 32];
DO = [D;D;D].'; %the air gap depth for each sample is the same, why can't I use one "D" ?
V2 = [777 786 780 772 774 767 762 753];
V3 = [595 600 602 598 606 605 602 590];
V4 = [580 581 594 602 603 609 630 633];
VI = [V2;V3;V4].';
NET = feedforwardnet(10);
NET.trainParam.goal=1e-10; %I tried to change the error to see if there is any change
NET = train(NET,VI,DO);
%Testing
Y = NET(VI)

Accepted Answer

Greg Heath
Greg Heath on 4 Apr 2012
The concept is very simple:
NNs are good interpolators.
NNs are NOT GOOD extrapolators.
Regardless of the physical source of the data:
1. The input and output must be correlated. This does not necessarily mean that there is a causal relationship. Both may be the result of another source, known or unknown, for which there are no direct measurements.
2. If the statistical characteristics of the nontraining data are not similar to those of the training data, the net should not be expected to perform well.
If new data does not fulfill this criterion, a new net with representative training data should be designed.
Hope this helps.
Greg
  2 comments
Rita
Rita on 13 Feb 2016
If we use dividerand to divide the data into training, validation, and test sets, how do we know that the statistical characteristics of the unseen data and the seen data are similar?
Greg Heath
Greg Heath on 13 Feb 2016
Already answered previously:
The primary BASIC ASSUMPTION of almost any design using training data that is to operate on nontraining data is that both sets of data can be assumed to be random samples from the same probability distribution.
It is easy to check using basic analytic and graphical principles of pattern recognition.
For example:
1. Define the training data as class 1 and the nontraining data as class 2.
2. 2-color 1-, 2-, and/or 3-D plots.
3. Design an analytic or neural classifier to try to separate them.
I TEND TO USE THE GRAPHICAL (2) APPROACH.
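For illustration, a minimal MATLAB sketch of the graphical check (an editor's sketch, assuming V holds the 3-by-N training inputs and M the nontraining inputs, as in the code later in this thread):
plot3(V(1,:), V(2,:), V(3,:), 'bo')   % class 1: training data (blue)
hold on
plot3(M(1,:), M(2,:), M(3,:), 'rx')   % class 2: nontraining data (red)
grid on, legend('training','nontraining')
% Two well-separated clusters indicate the sets do NOT come from the same distribution.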
Hope this helps.
Greg


More Answers (6)

Greg Heath
Greg Heath on 2 Apr 2012
>Hello everyone, I'm trying to create a neural network which relates voltage(input) to air gap depth(output).
Physical explanation please: How does a voltage affect air gap depth?
>There is no training happening after I use the train command, my network outputs the target values. I want to fix that.
I don't understand. What do you want it to output? What is there to fix?
>D = [4 8 12 16 20 24 28 32];
>DO = [D;D;D].'; %the air gap depth for each sample is the same, why can't I use one "D" ?
INCORRECT. The target is just D: Eight scalar outputs for eight 3-dimensional inputs.
>V2 = [777 786 780 772 774 767 762 753];
>V3 = [595 600 602 598 606 605 602 590];
>V4 = [580 581 594 602 603 609 630 633];
>VI = [V2;V3;V4].';
INCORRECT. No transpose
> Also, something extra, each value in one sample in the input corresponds to a frequency value.(for example V1=[1 2 3], the value 1 is at 1GHz frequency, 2 at 1.1GHz...etc) Is there any way to relate the inputs to the frequency maybe as a pair?
Well, not so much as a pair. Just use a 6-dimensional input that includes the three frequencies. Then compare results.
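For example, a minimal sketch (an editor's illustration; the frequency values 1 and 1.1 GHz are given in the question, 1.2 is assumed, and V is the 3-by-N voltage matrix defined below):
f  = [1; 1.1; 1.2];          % GHz; 1 and 1.1 from the question, 1.2 assumed
V6 = [V; repmat(f, 1, N)];   % 6-by-N input: 3 voltages plus 3 frequencies
net6 = feedforwardnet(10);
net6 = train(net6, V6, D);   % note: constant frequency rows add no per-sample information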
>[ I N ] = size(VI)
>NET = feedforwardnet(10);
H = 10 is probably too large to obtain anything meaningful
>NET.trainParam.goal=1e-10; %I tried to change the error to see if there is any change
>NET = train(NET,VI,DO);
>%Testing
>Y = NET(VI)
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% If I understand the problem correctly:
close all, clear all, clc
% Semicolons removed for debugging
D = [4 8 12 16 20 24 28 32] % target
[ O N ] = size(D) % [ 1 8 ]
% Using default division ratios, N = 8 will split up to Ntrn = 6, Nval = 1, Ntst = 1
% For accuracy obviously need to
% 1. Use 8-fold crossvalidation
% 2. Exclude the validation Set
% 3. Train to training set convergence (e.g., adjusted R^2 goal of 0.99) or, better yet, use regularization with MSEREG.
Ntrn = 7, Nval = 0, Ntst = 1
Neq = Ntrn*O % No. of training equations = 7
V2 = [777 786 780 772 774 767 762 753]
V3 = [595 600 602 598 606 605 602 590]
V4 = [580 581 594 602 603 609 630 633]
V = [ V2; V3; V4 ] % No transpose
[ I N ] = size(V) % [ 3 8 ]
% No. of hidden nodes, H
% Network node topology: I-H-O = 3-H-1
% Number of unknown weights Nw = (I+1)*H+(H+1)*O = 1+5*H
Hub = floor((Neq-O)/(I+O+1)) % 1 ; H <= Hub ensures Neq >= Nw
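% e.g., with Neq = 7, O = 1, I = 3: Hub = floor((7-1)/(3+1+1)) = floor(6/5) = 1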
% Try H=0:1
% NOTE: Below I will use all data for training and resubstitute it for testing.
% If this is serious work for you, either get more data or use 8-fold cross-validation with a linear regression model (see the Statistics Toolbox)
% Naive Constant Model
% Using all data for training.
Neq = N*O % 8
y00 = mean(D,2) % 18
Nw00 = O % 1
e00 = (D-y00)
MSE00 = sse(e00) /Neq % 84
MSE00a = sse(e00)/(Neq-Nw00) % 96
% Linear Model (H =0)
W0 = D/[ ones(1,N);V]
Nw0 = numel(W0) % (I+1)*O = 4
y0 = W0*[ones(1,N);V]
e0 = D-y0
MSE0 = sse(D-y0)/Neq % 2.9472
NMSE0 = MSE0/MSE00 % 0.0351
R20 = 1-NMSE0 % 0.9649
MSE0a = sse(D-y0)/(Neq-Nw0) % 5.8944
NMSE0a = MSE0a/MSE00a % 0.0614
R20a = 1-NMSE0a % 0.9386
% Neural Net Model
Ntrials = 10
rng(0)
j = 0
for h = 0:1
    j = j+1
    H = h
    for i = 1:Ntrials
        if H == 0
            net = feedforwardnet([]);
            Nw = (I+1)*O
        else
            net = feedforwardnet(H);
            Nw = O + (I+O+1)*H
        end
        MSEgoal = 0.01*(Neq-Nw)*var(D)/Neq
        net.trainParam.goal = MSEgoal;
        net.divideParam.trainRatio = 1;
        net.divideParam.valRatio = 0;
        net.divideParam.testRatio = 0;
        net.trainParam.show = NaN;
        [net, tr] = train(net,V,D);
        Nepochs(i,j) = tr.epoch(end);
        MSE = tr.perf(end);
        NMSE = MSE/MSE00
        R2(i,j) = 1 - NMSE
        MSEa = Neq*MSE/(Neq-Nw)
        NMSEa = MSEa/MSE00a
        R2a(i,j) = 1 - NMSEa
    end
end
H = 0:1
R2 = R2
R2a = R2a
% H = 0 1
% R2 = 0.9649 0.9782 Obtained 9 times
% 0.9649 0.5905 <== NOTE: Obtained once
% R2a = 0.9386 0.9236 Obtained 9 times
% 0.9386 -0.4333 <== NOTE: Obtained once
% Resub Testing on last net
Y = net(V)
E = D-Y
MSE = sse(E)/Neq % 1.8343
NMSE = MSE/MSE00 % 0.0218
R20 = 1-NMSE % 0.9782
MSEa = sse(E)/(Neq-Nw) % 7.3372
NMSEa = MSEa/MSE00a % 0.0764
R2a = 1-NMSEa % 0.9236
  1 comment
Amjad
Amjad on 2 Apr 2012
Hello Greg, I really appreciate your help. Please check my response below.



Greg Heath
Greg Heath on 2 Apr 2012
> 1. What exactly are we trying to accomplish here? Just finding the error?
No, finding a reference for MSE normalization.
The R-squared statistic and the adjusted R-squared statistic are based on normalized mean-squared errors and normalized adjusted mean-squared errors. They are often regarded as the fraction of target variance that is "explained" by a model. They are also known by the statistical term "Coefficient of Determination".
The normalization reference is the MSE when the output is a constant, independent of the input. To minimize that MSE, the constant must be the average of the target values, mean(target,2). I've labeled the resulting MSE as MSE00.
Whenever MSEs are estimated from the training data, they are optimistically biased. To reduce the bias, instead of dividing the sum-squared error by the number of errors, Neq, the divisor is the number of degrees of freedom that remain after the unknown parameters are estimated. If the number of estimated parameters is Np, the divisor is Neq-Np instead of Neq. The result is called the "adjusted" MSE (MSEa).
MSE estimates based on nontraining (e.g., val & tst) data require no adjustment for bias.
Since 0 <= R^2 <= 1 is independent of data scaling, it is an extremely convenient measure of performance.
However, since the measure is biased, it is often replaced by or used with Ra^2.
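In MATLAB terms (an editor's sketch of the quantities described above; e is the error vector over Neq training equations, Np the number of estimated parameters, and MSE00/MSE00a the constant-model references from the code earlier in the thread):
SSE  = sum(e.^2);        % sum-squared error
MSE  = SSE/Neq;          % optimistically biased training estimate
MSEa = SSE/(Neq - Np);   % adjusted MSE: divide by remaining degrees of freedom
R2   = 1 - MSE /MSE00;   % coefficient of determination
R2a  = 1 - MSEa/MSE00a;  % adjusted R-squared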
> 2. What is the difference between the "linear model" and the "naive model"?
The naive constant model has a constant output, independent of the input. Its importance is as a reference for normalization that yields a scale-invariant measure of performance. In fact, the normalization is an excellent way to specify a goal for neural net training.
The output of a linear model is just a linear combination of the inputs. It is the simplest practical model, whose performance will be a lower bound for neural nets with a hidden layer.
In addition, it is a useful debugging reference for the loop code when H=0 and 1.
>3. What's the point of the loop if we are using the last network created?
In general, the best model is not the last.
The point of the loop is to find the best or best group of many designs. Since weight searches start with random weight initializations, some training attempts will fail and the resulting designs will be useless.
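For instance, a minimal sketch of keeping the best of the Ntrials designs (an editor's illustration reusing the variable names from the loop code above):
bestR2a = -Inf;
for i = 1:Ntrials
    net      = feedforwardnet(H);
    [net,tr] = train(net, V, D);
    MSEtrial = tr.perf(end);
    R2atrial = 1 - (Neq*MSEtrial/(Neq-Nw))/MSE00a;
    if R2atrial > bestR2a      % retain the best design, not the last
        bestR2a = R2atrial;
        bestnet = net;
    end
end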
>4. Correct me if I'm wrong, but can't we just create one network where H=1?
No.
Notice that when H = 1, there was one failed design that resulted in R^2 = 0.59 and Ra^2 = -0.43! In general, the failure rate is higher.
In more complicated cases when I don't know the optimal value for H, I loop over a range of values for H and use Ntrials = 10 random weight initializations for each value of H.
For examples, search the Newsgroup and Answers using
heath Ntrials
>5. As you can see I'm getting exactly the same results for all values, which is wrong.
I will check your result on the nontraining data. However, I am confident that it is probably the result of the MATLAB default data normalization to [-1 1].
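To see why that matters (an editor's sketch, assuming V is the training input and M the new set):
% feedforwardnet normalizes inputs to [-1 1] by default via mapminmax
[Vn, ps] = mapminmax(V);         % normalization settings come from the training data
Mn = mapminmax('apply', M, ps);  % new data must be mapped with the SAME settings
% M lies far outside the range of V, so Mn falls far outside [-1 1];
% the saturated tansig hidden units then produce a nearly constant output.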
How good are your results using the nonneural linear classifier?
Hope this helps.
Greg

Geoff
Geoff le 1 Avr 2012
You are 'testing' your net with exactly the same data that trained it. Of course you will get your targets back out. You need to test it against another dataset, or partition your data into two groups: one for training, one for testing.
  2 comments
Amjad
Amjad on 1 Apr 2012
I understand your point, but even when I'm testing with a different set, I'm not getting the right answers (I tried with actual and imagined data sets). Here is a picture of the training tool (no progress): http://i.imgur.com/AkpyY.png
Geoff
Geoff on 2 Apr 2012
Oh, I think you are specifying your data incorrectly. Don't you want three input variables and one output? In that case, do not transpose the VI matrix. Also, pass D as it is, instead of constructing that DO matrix. The neural net diagram should show: Input 3 / Hidden 10 / Output 1 / Output 1.
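That is (an editor's sketch of the corrected orientation, using V2, V3, V4 and D as defined in the question):
D   = [4 8 12 16 20 24 28 32];   % 1-by-8 target, passed as-is
V   = [V2; V3; V4];              % 3-by-8 input; no transpose
net = feedforwardnet(10);
net = train(net, V, D);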



Amjad
Amjad on 2 Apr 2012
Hello Greg, I really appreciate your help
>Physical explanation please: How does a voltage affect air gap depth?
Well, I'm working on a project where I measure the voltages that correspond to different air gaps beneath a concrete slab, at different frequencies. I'm hoping to make a NN where I can input voltage values and estimate the air gap.
I have looked at the code you provided and I have some questions, if that's OK (I'm a bit naive when it comes to NNs).
% Naive Constant Model
Neq = N*O % 8
y00 = mean(D,2) % 18
Nw00 = O % 1
e00 = (D-y00)
MSE00 = sse(e00) /Neq % 84
MSE00a = sse(e00)/(Neq-Nw00) % 96
What exactly are we trying to accomplish here? Just finding the error?
% Linear Model (H =0)
W0 = D/[ ones(1,N);V]
Nw0 = numel(W0) % (I+1)*O = 4
y0 = W0*[ones(1,N);V]
e0 = D-y0
MSE0 = sse(D-y0)/Neq % 2.9472
NMSE0 = MSE0/MSE00 % 0.0351
R20 = 1-NMSE0 % 0.9649
MSE0a = sse(D-y0)/(Neq-Nw0) % 5.8944
NMSE0a = MSE0a/MSE00a % 0.0614
R20a = 1-NMSE0a % 0.9386
What is the difference between the "linear model" and the "naive model"?
% Neural Net Model
What's the point of the loop if we are using the last network created? Correct me if I'm wrong, but can't we just create one network where H=1?
So, I have another set of data that I tried on the network, which is the following
% Resub Testing on last net
Va = [122.8 134.3 164.5 172.6 172.9 181.6 181.1 188.8];
Vb = [181.9 176.2 174.6 165.8 156.6 144.8 132.5 118.3];
Vc = [223 216.3 215.2 216.6 220.8 227.9 233.1 243.1];
M = [Va;Vb;Vc];
Y = net(M)
E = D-Y
MSE = sse(E)/Neq % 1.8343
NMSE = MSE/MSE00 % 0.0218
R20 = 1-NMSE % 0.9782
MSEa = sse(E)/(Neq-Nw) % 7.3372
NMSEa = MSEa/MSE00a % 0.0764
R2a = 1-NMSEa % 0.9236
Y =
4.9957 4.9957 4.9957 4.9957 4.9957 4.9957 4.9957 4.9957
As you can see I'm getting exactly the same result for all values, which is wrong (the output should be similar to the previous set, which is 4 8 12 16 20 24 28 32). Even with imagined values the result is always 4.9957.
thanks
  1 comment
Greg Heath
Greg Heath on 3 Apr 2012
1. Resub testing means testing with the training set. Therefore, delete the modifier "Resub".
2. A neural net is designed to be applied to data that can be considered to belong to the same probability distribution as the design data.
3. Clearly, M does not fulfill that criterion:
[ minmax(V) minmax(M) ]
753.0000 786.0000 122.8000 188.8000
590.0000 606.0000 118.3000 181.9000
580.0000 633.0000 215.2000 243.1000
plot(V)
hold on
plot(M)
Hope this helps.
Greg



Greg Heath
Greg Heath on 3 Apr 2012
1. Resub testing means testing with the training set. Therefore, delete the modifier "Resub".
2. A neural net is designed to be applied to data that can be considered to belong to the same probability distribution as the design data.
3. Clearly, M does not fulfill that criterion:
[ minmax(V) minmax(M) ]
753.0000 786.0000 122.8000 188.8000
590.0000 606.0000 118.3000 181.9000
580.0000 633.0000 215.2000 243.1000
plot(V)
hold on
plot(M)
Hope this helps.
Greg
  1 comment
Amjad
Amjad on 4 Apr 2012
So the reason that the network is not performing well is the data? I'm pretty confident in the data because it was measured in the lab. The data I'm getting has a lot of variation, so maybe if I got more data and used it all to train the network, the results would be better?
Also, I'm looking into other classifiers, such as linear regression, but I'm not sure that will work because my data is not exactly linear. Is there any classification method you recommend?
Again, I really appreciate your help :)



Amjad
Amjad on 3 Apr 2012
So the reason that the network is not performing well is the data? I'm pretty confident in the data because it was measured in the lab. The data I'm getting has a lot of variation, so maybe if I got more data and used it all to train the network, the results would be better?
Also, I'm looking into other classifiers, such as linear regression, but I'm not sure that will work because my data is not exactly linear. Is there any classification method you recommend?
Again, I really appreciate your help :)
  1 comment
Amjad
Amjad on 4 Apr 2012
Also, I got another set of values if it helps
V1 = [162.8 172.3 166.1 166.7 160.9 147.9 145.5 140.4];
V2 = [176.7 172.2 180.6 182.8 188.3 196.8 201.9 206.1];
V3 = [167.5 161.2 156 147.1 139.9 131.8 126.6 123.7];
V4 = [162.6 172.5 176.8 183.9 188.7 188.8 193.3 193.9];
V5 = [253.3 257.1 251.5 252.4 250.6 238.7 235.9 230.8];
V6 = [202.8 197 200.5 200.7 202.1 206.6 207.6 207.8];
V7 = [117.4 110.6 113.5 112.7 113 118.7 119.4 120.5];
V8 = [51.5 51.9 51.4 51.9 51.9 51 50.8 50.5];
V9 = [38.4 38.3 38.5 38.4 38.5 38.6 38.6 38.6];
V10 = [27.5 27.3 27.1 27.4 27.5 27.2 27.1 27];
V11 = [19.5 19.6 19.5 19.4 19.4 19.3 19.3 19.2];
V12 = [3.9 3.9 3.9 3.9 3.9 3.9 3.9 3.9];
V13 = [24.6 24.5 24.6 24.5 24.5 24.5 24.5 24.4];
V14 = [24.4 24.2 24.1 24.4 24.5 24.4 24.4 24.3];
V15 = [17.1 17 17.1 16.9 17 17.1 17.1 17.1];
V16 = [4.1 4 4 4.1 4 4.2 4.1 4.1];
V17 = [1.2 1.2 1.3 1.2 1.2 1.3 1.2 1.2];
V18 = [3.1 3.1 3.1 3.2 3.2 3.2 3.2 3.2];
V19 = [0.3 0.3 0.2 0.3 0.3 0.3 0.3 0.3];
V20 = [0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5];
V21 = [2 2 2 2 2 2 2 2];
V22 = [3.1 3.1 3 3 3.1 3.1 3 3];

