Divide Data for Optimal Neural Network Training
This topic presents part of a typical multilayer network workflow. For more information and other steps, see Multilayer Shallow Neural Networks and Backpropagation Training.
When training multilayer networks, the general practice is to first divide the data into three subsets. The first subset is the training set, which is used for computing the gradient and updating the network weights and biases. The second subset is the validation set. The error on the validation set is monitored during the training process. The validation error normally decreases during the initial phase of training, as does the training set error. However, when the network begins to overfit the data, the error on the validation set typically begins to rise. The network weights and biases are saved at the minimum of the validation set error. This technique is discussed in more detail in Improve Shallow Neural Network Generalization and Avoid Overfitting.
The test set error is not used during training, but it is used to compare different models. It is also useful to plot the test set error during the training process. If the error on the test set reaches a minimum at a significantly different iteration number than the validation set error, this might indicate a poor division of the data set.
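One convenient way to do this is to plot the training record returned by train, which stores the training, validation, and test performance at every epoch. The following is a minimal sketch, assuming the feedforwardnet, train, and plotperform functions and the simplefit_dataset sample data set:

```matlab
% Load a sample data set and create a two-layer feedforward network.
[x,t] = simplefit_dataset;
net = feedforwardnet(10);

% Train the network. The training record tr stores the training,
% validation, and test performance at every epoch.
[net,tr] = train(net,x,t);

% Plot all three performance curves versus epoch. The test error
% should reach its minimum near the same epoch as the validation error.
plotperform(tr)
```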
There are four functions provided for dividing data into training, validation, and test sets: dividerand (the default), divideblock, divideint, and divideind. The data division is normally performed automatically when you train the network.
Function | Algorithm |
---|---|
dividerand | Divide the data randomly (default) |
divideblock | Divide the data into contiguous blocks |
divideint | Divide the data using an interleaved selection |
divideind | Divide the data by index |
You can access or change the division function for your network with this property:
net.divideFcn
Each of the division functions takes parameters that customize its behavior. These values are stored and can be changed with the following network property:
net.divideParam
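For example, you can inspect these properties at the command line and see how the parameter structure changes when you select a different division function. This is a minimal sketch, assuming a network created with feedforwardnet:

```matlab
% Create a two-layer feedforward network and inspect its division settings.
net = feedforwardnet(10);

net.divideFcn     % 'dividerand' by default
net.divideParam   % parameters used by the current division function

% Selecting a different division function replaces divideParam with
% that function's default parameters.
net.divideFcn = 'divideblock';
net.divideParam
```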
The divide function is accessed automatically whenever the network is trained, and is used to divide the data into training, validation, and testing subsets. If net.divideFcn is set to 'dividerand' (the default), then the data is randomly divided into the three subsets using the division parameters net.divideParam.trainRatio, net.divideParam.valRatio, and net.divideParam.testRatio. The fraction of data that is placed in the training set is trainRatio/(trainRatio + valRatio + testRatio), with a similar formula for the other two sets. The default ratios for training, validation, and testing are 0.7, 0.15, and 0.15, respectively.
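For example, to keep the default random division but change the proportions, set the three ratios directly. The values below are illustrative; this sketch assumes a network created with feedforwardnet:

```matlab
net = feedforwardnet(10);
net.divideFcn = 'dividerand';   % random division (the default)

% Put 60% of the samples in the training set and 20% in each of the
% validation and test sets.
net.divideParam.trainRatio = 0.6;
net.divideParam.valRatio   = 0.2;
net.divideParam.testRatio  = 0.2;
```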
If net.divideFcn is set to 'divideblock', then the data is divided into three subsets using three contiguous blocks of the original data set (training taking the first block, validation the second, and testing the third). The fraction of the original data that goes into each subset is determined by the same three division parameters used for dividerand.
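For example, to take the first 70% of the samples for training, the next 15% for validation, and the final 15% for testing (a sketch assuming a network created with feedforwardnet):

```matlab
net = feedforwardnet(10);
net.divideFcn = 'divideblock';

% Training gets the first contiguous block, validation the second,
% and testing the third.
net.divideParam.trainRatio = 0.7;
net.divideParam.valRatio   = 0.15;
net.divideParam.testRatio  = 0.15;
```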
If net.divideFcn is set to 'divideint', then the data is divided by an interleaved method, as in dealing a deck of cards: samples are assigned to the training, validation, and testing subsets in turn, so that the specified percentages of data go into each subset. The fraction of the original data that goes into each subset is determined by the same three division parameters used for dividerand.
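For example, to deal the samples into the three subsets in an interleaved fashion with the same default proportions (a sketch assuming a network created with feedforwardnet):

```matlab
net = feedforwardnet(10);
net.divideFcn = 'divideint';

% Samples are assigned to the subsets in turn, so that about 70% land
% in training and 15% in each of validation and testing.
net.divideParam.trainRatio = 0.7;
net.divideParam.valRatio   = 0.15;
net.divideParam.testRatio  = 0.15;
```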
When net.divideFcn is set to 'divideind', the data is divided by index. The indices for the three subsets are defined by the division parameters net.divideParam.trainInd, net.divideParam.valInd, and net.divideParam.testInd. The default assignment for these indices is the null array, so you must set the indices when using this option.
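For example, for a data set of 100 samples you could assign the subsets explicitly (the index ranges below are illustrative; adjust them to your own data):

```matlab
net = feedforwardnet(10);
net.divideFcn = 'divideind';

% Assign samples 1-70 to training, 71-85 to validation, and 86-100 to
% testing (assumes the data set contains 100 samples).
net.divideParam.trainInd = 1:70;
net.divideParam.valInd   = 71:85;
net.divideParam.testInd  = 86:100;
```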