Struggling to Improve Neural Network Performance
I am working on creating a function-fitting neural network with the Neural Network Toolbox, but I haven't had much success getting it to work correctly. My input matrix has two features. I currently use `fitnet` (I've tried `cascadeforwardnet`/`feedforwardnet` without much difference) with two hidden layers of 10 neurons each. I've been using `trainbr` because it has given me better results than `trainlm`. I've tried to normalize or standardize the data, but without much success. I know that `fitnet` uses `mapminmax` by default, and I've seen Greg Heath's suggestion to standardize first with `zscore`. The problem is that every time I've used zscore standardization I haven't gotten very good neural network results. My output needs to be entirely positive after de-standardization, yet I still get negative values. Because of this, I have used `log10` to normalize the data, which keeps all of the values positive.
In order to see prediction error, I have found the maximum percent error at any individual output point. I cannot get error lower than 40%, and there are multiple other points with decently high error.
Is there anything else that I can do, whether it be normalization/standardization or network reconfiguration to improve my network performance?
EDIT:
I'm not sure if this is of any help, but the regression plot shows R = 0.99984, so it seems very accurate.
Thank you for the help,
George
Accepted Answer
More Answers (1)
Greg Heath
on 7 Jul 2016
> I have an input matrix with two features.
[ I N ] = size(input)  % [ 2 N ], N = ?
[ O N ] = size(target) % [ O N ], O = ?
> I currently use fitnet ... and have two hidden layers, each with 10 neurons.
A single hidden layer is sufficient.
With no regularization, the number of "H"idden neurons, H, is limited by the number of equations Neq = N*O.
> I've been using `trainbr` because it has given me better results than `trainlm`.
TRAINBR uses regularization, which mitigates the effect of using a large H.
> I'm trying to normalize or standardize the data but haven't had much success. I know that fitnet uses mapminmax by default and I've seen Greg Heath's suggestion that I use zscore to standardize first.
I use zscore in order to detect outliers which may have to be modified or deleted.
> The problem is, every time I've used the zscore standardization I haven't gotten very good neural network results.
Perhaps you misused it. I cannot tell you how without details.
> My output needs to be completely positive after de-standardization yet I still get negative values. Because of this, I have used log10 to normalize the data, therefore keeping all of the values positive.
I don't think log10 is sufficient for positivity. The best way to impose output bounds is to use a bounded output transfer function like LOGSIG or TANSIG.
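To sketch that suggestion (assuming inputs `x`, 2xN, and strictly positive targets `t`, 1xN, already exist — the variable names are mine, not from the thread), the output-layer transfer function can be changed before training. TANSIG pairs naturally with `fitnet`'s default `mapminmax` target range of [-1,1]:

```matlab
% Sketch: bound the network output with a saturating transfer function.
% Assumes x (2xN inputs) and t (1xN strictly positive targets) exist.
net = fitnet(10);                      % single hidden layer, 10 neurons
net.layers{2}.transferFcn = 'tansig';  % output layer bounded to (-1,1)
% fitnet applies mapminmax to the targets by default, so the (-1,1)
% output range is rescaled to [min(t), max(t)] automatically.
[net, tr] = train(net, x, t);
y = net(x);                            % y stays within the target range
```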
> In order to see prediction error, I have found the maximum percent error at any individual output point. I cannot get error lower than 40%, and there are multiple other points with decently high error.
If you are using fitnet, the default performance measure is MSE = mse(error). The corresponding scale-free measures, which are trivial to interpret, are
NMSE = MSE/mean(var(target',1))
and
Rsquare = 1 - NMSE
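As a minimal sketch of computing these measures for an already-trained network (assuming `net`, inputs `x`, and targets `t`, OxN, exist):

```matlab
% Sketch: scale-free performance measures for a trained fitnet.
y    = net(x);                  % network output
e    = t - y;                   % error
MSE  = mse(e);
NMSE = MSE/mean(var(t',1));     % normalize by the naive constant-model MSE
Rsq  = 1 - NMSE                 % ~1 means a good fit, ~0 a useless one
```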
> Is there anything else that I can do, whether it be normalization/standardization or network reconfiguration to improve my network performance?
Using NMSE and Rsq is more reliable for measuring regression performance. I see no good reason for the log transformation.
> EDIT: I'm not sure if this is of any help but the regression plot shows that the R = 0.99984 so it seems very accurate.
Given your log transformation, I'm not sure just what that means. However, my guess is that it is good. Make sure by plotting unnormalized target and output on the same graph.
Hope this helps.
Thank you for formally accepting my answer
Greg
5 comments
George Tsitsopoulos
on 7 Jul 2016
Edited: George Tsitsopoulos
on 7 Jul 2016
Greg Heath
on 8 Jul 2016
For the use of zscore in the NN Toolbox consider the following:
close all, clear all, clc
x = 10*randn(4,5) + 10*rand(4,5);
[I N ] = size(x) % [ 4 5 ]
meanx = repmat(mean(x')',1,N);   % column of row means, replicated across N columns
stdx1 = repmat(std(x',1)',1,N);  % population (1/N) std devs, replicated likewise
z0 = (x-meanx)./stdx1;           % manual standardization
z = zscore(x',1)';               % zscore equivalent (variables in rows)
minmaxdz = minmax(z-z0) % [ 0 0 ]
minmaxdzp = minmax(z'-z0') % [ 0 0 ]
whos
% Name Size Bytes Class
%
% I 1x1 8 double
% N 1x1 8 double
% meanx 4x5 160 double
% minmaxdz 4x2 64 double
% minmaxdzp 5x2 80 double
% stdx1 4x5 160 double
% x 4x5 160 double
% z 4x5 160 double
% z0 4x5 160 double
Hope this helps.
Greg
George Tsitsopoulos
on 12 Jul 2016
Greg Heath
on 12 Jul 2016
> Unfortunately, there is still error around 2000% at certain points.
For regression, the only type of measure that makes sense is one that is linear w.r.t. MSE, the measure that you are trying to minimize directly. The most commonly used are the normalized mse
NMSE = mse(error)/mean(var(target',1))
and the corresponding R squared (see Wikipedia)
Rsq = 1 - NMSE
The denominator of NMSE is the smallest mse that could occur from the naive model output = constant. The minimizing solution is output = mean(target')'.
> You said that the number of hidden neurons is limited by the number of hidden equations. In your equation above, N=1767 and O=1 so I can have a maximum of 1767 neurons, correct?
No.
Using the default data division ratios of 0.7/0.15/0.15 yields the number of training examples
Ntrn = N - 2*round(0.15*N) %1237
and corresponding number of TRAINING equations
Ntrneq = Ntrn*O % 1237
To avoid the phenomenon of overfitting when using the default training function TRAINLM, keep the number of unknown weights
Nw = (I +1)*H+(H+1)*O
no greater than Ntrneq. However, for the purpose of numerical stability w.r.t. noise and measurement error, it should be considerably less.
Accordingly, Nw << Ntrneq yields the upper bound and reasonable maximum
H <= Hmax << Hub = (Ntrneq-O)/(I+O+1) = 309
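The bookkeeping above can be checked numerically; a sketch using the thread's values N = 1767, O = 1, I = 2 (and H = 15 from the later comment):

```matlab
% Sketch: overfitting bookkeeping for N = 1767, O = 1, I = 2.
N = 1767; O = 1; I = 2; H = 15;
Ntrn   = N - 2*round(0.15*N)    % 1237 training examples
Ntrneq = Ntrn*O                 % 1237 training equations
Nw     = (I+1)*H + (H+1)*O      % 61 unknown weights for H = 15
Hub    = (Ntrneq - O)/(I+O+1)   % 309, upper bound on H
```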
Therefore I would probably first consider the 10 values
h = Hmin:dH:Hmax = 3:3:30
with
Ntrials = 1:10
random sets of initial weights and data divisions each. Then display the 10x10 array of Rsq as I have done in many posted examples.
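A minimal sketch of that double-loop search (assuming `x` and `t` are already loaded; the loop structure follows the description above, not any specific posted example):

```matlab
% Sketch: double loop over candidate H values and random trials.
Hmin = 3; dH = 3; Hmax = 30; Ntrials = 10;
hvec = Hmin:dH:Hmax;                 % 10 candidate hidden-layer sizes
Rsq  = zeros(Ntrials, numel(hvec));
for j = 1:numel(hvec)
    for i = 1:Ntrials
        net = fitnet(hvec(j));       % fresh random weights & data division
        net.trainParam.showWindow = false;
        [net, tr] = train(net, x, t);
        y = net(x);
        Rsq(i,j) = 1 - mse(t - y)/mean(var(t',1));
    end
end
Rsq                                  % 10x10 array; pick H with stably high Rsq
```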
> You also said that bayesian regularization limits the number of hidden neurons further.
That is not what I meant. If you have to increase Hmax too much, e.g., Hmax >~ Hub/2 ~ 155, then you should consider TRAINBR, whose output is much more insensitive to overfitting.
> Could it be that 15 hidden neurons is too few?
That will be revealed as a product of the double loop search over h and Ntrials.
> Also, is there a way to bound my output so it is always positive? A negative output is impossible in the real world yet the neural net has several points that are output as negative.
Using a bounded output transfer function will keep the output within bounds. Either TANSIG or LOGSIG will work. The scaling to your data will be done automatically.
Hope this helps.
Greg
George Tsitsopoulos
on 13 Jul 2016