Three layers are sufficient: input/hidden/output
The input layer is NOT a neuron layer. The number of input nodes is the dimensionality of the input data.
The output layer IS a neuron layer. The number of nodes is the dimension of the target and output data.
Connecting the input directly to the output (no hidden layer) is sufficient only for trivial, e.g., linear, transformations.
In general, a hidden layer with nonlinear neurons is required. Typically, the more complicated the I/O transformation,
the more hidden neurons are required.
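A minimal sketch of the layer shapes just described, using NumPy with hypothetical dimensions (I inputs, H hidden neurons, O outputs; the specific numbers are my own illustration, not from this answer):

```python
import numpy as np

# Hypothetical sizes: I input nodes, H nonlinear hidden neurons, O output neurons.
I, H, O = 3, 5, 2
rng = np.random.default_rng(0)

x = rng.normal(size=I)            # one input sample: the input "layer" is just the data
W1 = rng.normal(size=(H, I + 1))  # hidden-layer weights (+1 column for the bias)
W2 = rng.normal(size=(O, H + 1))  # output-layer weights (+1 column for the bias)

h = np.tanh(W1 @ np.append(x, 1.0))  # hidden layer: nonlinear neurons
y = W2 @ np.append(h, 1.0)           # output layer: linear neurons here

print(y.shape)  # (2,) -- one value per output node
```

Note that only W1 and W2 hold trained weights; the input layer contributes none.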
Sometimes two or more moderately sized hidden layers work better than a single hidden layer with a huge number of neurons.
I find the best approach is to minimize the number of trained weights subject to a limit on the output error.
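"Minimize the number of trained weights" can be made concrete by counting them: in a fully connected net, each layer contributes (fan_in + 1) * fan_out weights, the +1 being the bias. A small helper (my own illustration, not code from this answer):

```python
def num_weights(I, hidden_sizes, O):
    """Count trained weights (including biases) for layer sizes I -> hidden_sizes -> O."""
    sizes = [I] + list(hidden_sizes) + [O]
    return sum((fan_in + 1) * fan_out
               for fan_in, fan_out in zip(sizes, sizes[1:]))

# One huge hidden layer vs. two moderate ones, same I/O dimensions:
print(num_weights(10, [100], 1))     # (10+1)*100 + (100+1)*1 = 1201
print(num_weights(10, [20, 10], 1))  # (10+1)*20 + (20+1)*10 + (10+1)*1 = 441
```

This is why splitting one huge hidden layer into two smaller ones can reduce the weight count while meeting the same error limit.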
The limit I typically choose is an output MSE at least 100 times smaller than the average variance of the target variables.
The latter is the error you get from the naive constant model that always predicts the target mean, so it is the natural reference scale for the output error.
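That error limit can be sketched as follows; the 0.01 factor comes from the "100 times smaller" rule above, while the data and variable names are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)
t = rng.normal(size=(100, 2))  # hypothetical targets: 100 samples, 2 output dims

# Naive reference model: always predict the target mean (per output dimension).
# Its MSE equals the average target variance.
naive_mse = np.mean((t - t.mean(axis=0)) ** 2)

# Error limit: train until the output MSE is 100x smaller than the reference.
mse_goal = 0.01 * naive_mse
```

Any network whose output MSE beats mse_goal explains at least 99% of the target variance, which is the point of normalizing against the naive model.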
I have posted jillions of examples and tutorials in both the NEWSGROUP and ANSWERS.
Hope this helps.
THANK YOU FOR FORMALLY ACCEPTING MY ANSWER