Backpropagation algorithm for a neural network: XOR training

c=0;
wih = .1*ones(nh,ni+1);
who = .1*ones(no,nh+1);
while(c<3000)
c=c+1;
for i = 1:length(x(1,:))
for j = 1:nh
netj(j) = wih(j,1:end-1)*double(x(:,i))+wih(j,end)*1;
outj(j) = 1./(1+exp(-1*netj(j)));
end
% hidden to output layer
for k = 1:no
netk(k) = who(k,1:end-1)*outj'+who(k,end)*1;
outk(k) = 1./(1+exp(-1*netk(k)));
delk(k) = outk(k)*(1-outk(k))*(t(k,i)-outk(k));
end
% back proagation for j = 1:nh s=0; for k = 1:no s = s+who(k,j)*delk(k); end
delj(j) = outj(j)*(1-outj(j))*s;
s=0;
end
for k = 1:no
for l = 1:nh
who(k,l)=who(k,l)+.5*delk(k)*outj(l);
end
who(k,l+1)=who(k,l+1)+1*delk(k)*1;
end
for j = 1:nh
for ii = 1:ni
wih(j,ii)=wih(j,ii)+.5*delj(j)*double(x(ii,i));
end
wih(j,ii+1)=wih(j,ii+1)+1*delj(j)*1;
end
end
end
// I wrote the code above to implement a backpropagation neural network. x is the input, t is the desired output, and ni, nh, and no are the numbers of input, hidden, and output layer neurons. I have tested it on different functions such as AND and OR, and it works fine for those. But XOR is not working.
// Training x = [0 0 1 1; 0 1 0 1]
// Training t = [0 1 1 0]
// who -> weight matrix from hidden to output layer
// wih -> weight matrix from input to hidden layer
// Can you help?

3 comments

It would help immensely if you posted a code that could be cut, pasted, run, and yield a numerical answer. Then we could begin by matching our results with yours.
It would also help immensely if you replaced the loops with vectorized code.
Greg
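As a rough illustration of the vectorization suggested here, the per-pattern loops can be replaced by matrix operations over all N patterns at once. The sketch below is an assumed variant, not the original poster's code: it uses batch updates instead of per-pattern updates, tanh hidden units, a logistic output, and a learning rate `lr` of 0.5 picked arbitrarily. Depending on the random start, a given run may or may not solve XOR.

```matlab
% Vectorized batch backpropagation sketch (assumed variant, not the original code)
x = [0 0 1 1; 0 1 0 1];          % inputs, one pattern per column
t = [0 1 1 0];                   % XOR targets
[ni, N] = size(x);
no = size(t, 1);
nh = 2;                          % number of hidden nodes
lr = 0.5;                        % learning rate (arbitrary choice)
logistic = @(n) 1 ./ (1 + exp(-n));

wih = 0.1 * randn(nh, ni + 1);   % last column holds the bias weights
who = 0.1 * randn(no, nh + 1);
xa  = [x; ones(1, N)];           % append a constant 1 for the bias

for epoch = 1:3000
    h    = tanh(wih * xa);                          % nh-by-N hidden outputs
    ha   = [h; ones(1, N)];
    y    = logistic(who * ha);                      % no-by-N network outputs
    delk = y .* (1 - y) .* (t - y);                 % output deltas (logistic derivative)
    delj = (1 - h.^2) .* (who(:, 1:nh)' * delk);    % hidden deltas (tanh derivative)
    who  = who + lr * delk * ha';                   % updates summed over all patterns
    wih  = wih + lr * delj * xa';
end
e = t - round(y)                 % all zeros only if this run solved XOR
```

Each matrix product handles all four patterns at once, so only the epoch loop remains.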
If you initialized the weights randomly, you could see whether it is an initialization problem.
Have you noticed the loop accidentally included in the backpropagation comment?
Greg
How does this code work for XOR?


Accepted answer

close all, clear all, clc
x = [0 0 1 1; 0 1 0 1]
t = [0 1 1 0]
[ni, N] = size(x)
[no, N] = size(t)
nh = 2
% wih = .1*ones(nh,ni+1);
% who = .1*ones(no,nh+1);
% small random initial weights; the last column of each matrix is the bias weight
wih = 0.01*randn(nh,ni+1);
who = 0.01*randn(no,nh+1);
c = 0;
while (c < 3000)
    c = c+1;
    % %for i = 1:length(x(1,:))
    for i = 1:N
        % input to hidden layer
        for j = 1:nh
            netj(j) = wih(j,1:end-1)*x(:,i) + wih(j,end);
            % %outj(j) = 1./(1+exp(-netj(j)));
            outj(j) = tansig(netj(j));
        end
        % hidden to output layer
        for k = 1:no
            netk(k) = who(k,1:end-1)*outj' + who(k,end);
            outk(k) = 1./(1+exp(-netk(k)));
            delk(k) = outk(k)*(1-outk(k))*(t(k,i)-outk(k));
        end
        % back propagation
        for j = 1:nh
            s = 0;
            for k = 1:no
                s = s + who(k,j)*delk(k);
            end
            % derivative of tansig is 1-outj^2 (outj*(1-outj) applies only to logsig)
            delj(j) = (1-outj(j)^2)*s;
            % %s=0;
        end
        % update hidden-to-output weights, then the bias weight
        for k = 1:no
            for l = 1:nh
                who(k,l) = who(k,l) + .5*delk(k)*outj(l);
            end
            who(k,l+1) = who(k,l+1) + 1*delk(k)*1;
        end
        % update input-to-hidden weights, then the bias weight
        for j = 1:nh
            for ii = 1:ni
                wih(j,ii) = wih(j,ii) + .5*delj(j)*x(ii,i);
            end
            wih(j,ii+1) = wih(j,ii+1) + 1*delj(j)*1;
        end
    end
end
% evaluate the trained net on all patterns at once
h = tansig(wih*[x;ones(1,N)])
y = logsig(who*[h;ones(1,N)])
e = t-round(y)
Hope this helps.
Greg

5 comments

Hi Greg, thank you so much for the effort and time you have put into this. I am really grateful to you.
But the problem still remains: your modification (basically random initialization of the weights and using the tan sigmoid for the hidden layer) sometimes works and sometimes does not. I would also like to know your motivation for not using logsig everywhere. Are there any explanations?
I want to share the whole code, which is now in better shape. And you are right about the vectorization; I will do that once the code is working.
Here are two functions, one for training and another for simulation.
You can try running it like this and see that it sometimes works and sometimes does not. I have added your suggestion to the code.
>> net = biasBackProp(2,3,1,1,0.0001,[0 0 1 1;0 1 0 1],[0 1 1 0])
>> bpSim([0;1],net)
>> bpSim([1;1],net)
>> bpSim([1;0],net)
>> bpSim([0;0],net)
The functions can be copied from here,
https://docs.google.com/document/d/18kGXAgeVP1kOlJHZJ2KmV_mkxY61f255LQjPcr6uwnI/edit
https://docs.google.com/document/d/13YVWHUXug6XrngQD3a52IgzPxMH2M9sgECkb7peRckA/edit
Once again thank you and please be in touch, I will accept your answer later.
Greg Heath, thank you so much for the effort on this.
I would like to ask: how can I use a multilayer perceptron to classify some images?
My teacher told me to build an XOR gate first to make sure that the algorithm is working.
While researching, I found your code. Is it suitable for what I want to do?
I have searched a lot for MATLAB code.
How can I use it?
This code uses one hidden layer. How can I use 2 or 3 hidden layers for XOR?
Greg Heath, why doesn't this code work for 3-input XOR? If I replace the X and Y with 3 inputs, the error does not converge to 0!
  1. I ALWAYS use the bipolar tanh in hidden layers. It ALWAYS works.
  2. What are x and t for 3 input XOR???
Greg
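The commenter never said what data was used. One common reading of "3 input XOR" is odd parity, where the target is 1 when an odd number of inputs are 1. That reading is an assumption, and under it x and t could be built as follows:

```matlab
% 3-input parity data (assumes "3 input XOR" means odd parity)
ni = 3;
N  = 2^ni;                       % all 8 input combinations
x  = (dec2bin(0:N-1) - '0')';    % ni-by-N, one binary pattern per column
t  = mod(sum(x, 1), 2);          % 1 when an odd number of inputs are 1
disp(x)
disp(t)
```

Parity over 3 inputs is a harder target than 2-input XOR, so nh = 2 may be too small; trying larger hidden layers and several random starts would be a reasonable first experiment.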


More answers (5)

Greg Heath on 27 Jan 2012


It is well known that successful deterministic training depends on a lucky choice of initial weights. The most common approach is to use a loop and create Ntrial (e.g., 10 or more) nets from different random initial weights. Then choose the best net.
It is also well known that an odd bounded monotonically increasing activation function like TANSIG is the choice of preference for hidden layers because it does not restrict the polarity of the layer variables. It works even better when the input is shifted to have zero mean.
You can check the superiority of TANSIG and zero-mean yourself. You can also search the comp.ai.neural-nets FAQ and archives to find both agreement and numerical experiments.
For most real-world problems the best choice for the number of hidden nodes, H, is not known a priori. That is why I have posted many examples using a double loop: an outer loop over H and an inner loop over Ntrials random weight initializations. For examples, search the newsgroup using the keywords
heath clear Ntrials
Hope this helps.
Greg
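A minimal sketch of the double loop described above, under several assumptions that are not from the newsgroup posts: batch gradient updates, tanh hidden units, a logistic output, zero-mean inputs, and arbitrary values for `Hmax`, `Ntrials`, `lr`, and the epoch count. The net with the lowest mean squared error is kept in the hypothetical struct `best`.

```matlab
% Outer loop over hidden-layer size H, inner loop over random initializations
x = [0 0 1 1; 0 1 0 1];
t = [0 1 1 0];
[ni, N] = size(x);
no = size(t, 1);
xa = [x - repmat(mean(x, 2), 1, N); ones(1, N)];   % zero-mean inputs plus bias input
logistic = @(n) 1 ./ (1 + exp(-n));

Hmax    = 3;      % largest hidden-layer size to try (arbitrary)
Ntrials = 10;     % random initializations per size
lr      = 0.5;    % learning rate (arbitrary)
mse     = zeros(Hmax, Ntrials);
bestMse = Inf;

for H = 1:Hmax
    for trial = 1:Ntrials
        wih = 0.1 * randn(H, ni + 1);
        who = 0.1 * randn(no, H + 1);
        for epoch = 1:2000
            h    = tanh(wih * xa);
            ha   = [h; ones(1, N)];
            y    = logistic(who * ha);
            delk = y .* (1 - y) .* (t - y);
            delj = (1 - h.^2) .* (who(:, 1:H)' * delk);
            who  = who + lr * delk * ha';
            wih  = wih + lr * delj * xa';
        end
        y = logistic(who * [tanh(wih * xa); ones(1, N)]);   % outputs of final weights
        mse(H, trial) = mean((t(:) - y(:)).^2);
        if mse(H, trial) < bestMse                          % keep the best net so far
            bestMse = mse(H, trial);
            best = struct('H', H, 'wih', wih, 'who', who);
        end
    end
end
mse        % rows: H, columns: trial
bestMse
```

Looking at the whole `mse` table, not just the winner, shows how often each H succeeds from a random start.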

3 comments

Thanks, you are awesome :)
Thank you very much. This is a nice example of an MLP BP NN and very easy to understand. Though I am a novice in this field, I am now clear on the programming idea.
Havot Albeyboni on 9 Dec 2020 (edited 9 Dec 2020)
Can anyone please explain this line?
netj(j) = wih(j,1:end-1)*x(:,i)+wih(j,end);
Shouldn't it be, e.g., Y3 = sigmoid(X1*W13 + X2*W23 - θ3)?
Regards
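One way to read that line: the last column of wih holds a bias weight, and a bias weight b is the same thing as a threshold with the sign flipped, b = -θ. The numbers below are hypothetical and only illustrate that the two forms give the same result.

```matlab
% The bias weight in the last column plays the role of -theta
sigmoid = @(n) 1 ./ (1 + exp(-n));
x     = [1; 0];              % example input (X1, X2)
w     = [0.8 -0.3];          % hypothetical weights W13, W23
theta = 0.2;                 % hypothetical threshold theta3
wih_j = [w, -theta];         % one row of wih: [weights, bias], with bias = -theta

y_threshold = sigmoid(x(1)*w(1) + x(2)*w(2) - theta);    % threshold form
y_bias      = sigmoid(wih_j(1:end-1)*x + wih_j(end));    % form used in the code
disp([y_threshold, y_bias])  % the two values are identical
```

Because the bias weight is learned, its sign convention does not matter in practice: training finds whatever value is needed.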


Imran Babar on 8 May 2013


Dear sir, I want to use the same code for the following data set:
Input dataset = [1 1 1 2; 1 1 2 2; 1 2 2 2; 2 2 2 2], Output = [5 6 7 8]
but it always generates the output given below:
1 1 1 1
I tried my best, but I am unable to understand how I may get these results.

1 comment

Your outputs are not within the range of logsig.
Either normalize your outputs to fit in (0,1)
or
change your output activation function (e.g., 'purelin').
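The first option could be sketched as below. The target range [0.1, 0.9] is an assumption, chosen because logsig only reaches 0 and 1 asymptotically. The line `yn = tn` is a placeholder standing in for the outputs of a trained net.

```matlab
% Scale targets into the logsig range, then map network outputs back
t    = [5 6 7 8];                    % original targets from the question
lo   = 0.1;  hi = 0.9;               % assumed target range inside (0,1)
tmin = min(t);
tmax = max(t);
tn   = lo + (hi - lo) * (t - tmin) / (tmax - tmin);    % normalized targets

% ... train the network on tn instead of t ...
yn = tn;                             % placeholder for the trained net's outputs

y  = tmin + (yn - lo) * (tmax - tmin) / (hi - lo);     % back to original units
disp(tn)
disp(y)
```

The same `tmin`, `tmax`, `lo`, and `hi` must be reused when mapping outputs for new inputs.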


Sohel Ahammed on 4 Jul 2015
OK. If I want to test it, how do I have to change it? For example, input: 1 0, expected output: 1 (from learning).
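Testing a single input only needs the forward pass with the trained wih and who. In the sketch below the weights are hand-picked values that happen to implement XOR; they are an assumption standing in for whatever training produced, and tanh and the explicit logistic formula are used in place of tansig and logsig so that no toolbox is required.

```matlab
% Forward pass for one test input, using hand-picked stand-in weights
logistic = @(n) 1 ./ (1 + exp(-n));
wih = [10 10  -5;            % hidden node 1: roughly "x1 OR x2"
       10 10 -15];           % hidden node 2: roughly "x1 AND x2"
who = [5 -5 -5];             % output: high when OR is on and AND is off

xtest = [1; 0];                          % the input to test
h = tanh(wih * [xtest; 1]);              % hidden outputs (1 is the bias input)
y = logistic(who * [h; 1]);              % network output in (0,1)
class = round(y)                         % thresholded decision

% The same weights applied to all four patterns at once
x    = [0 0 1 1; 0 1 0 1];
yall = logistic(who * [tanh(wih * [x; ones(1, 4)]); ones(1, 4)]);
disp(round(yall))
```

With weights from an actual training run, only the two weight matrices change; the forward-pass lines stay the same.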
dsmalenb on 17 Oct 2018
Am I missing something here? I don't see any bias neurons. Maybe this is why you are getting some inputs to work and others not?

5 comments

Biases are included in my equations (see the "ones")
Greg
Ah! But why are they 1's? Should they now vary as well?
The ones are multiplied by the bias weights which are automatically learned with the others.
I'm sorry but that statement does not make much sense to me. Biases are added to shift the values within the activation function. Multiplying by 1 does nothing.
The 1 is a placeholder which is multiplied by a learned weight.
Hmm, I've been using that notation for decades and this is the 1st question re that that I can remember.
Greg
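A small experiment can make the placeholder idea concrete. In this hypothetical setup a single logistic neuron always sees the input 0, so only the bias weight can move its output toward the target 0.9. The constant 1 never changes; the weight multiplying it is what gets learned.

```matlab
% The constant 1 is fixed; the weight that multiplies it is learned
logistic = @(n) 1 ./ (1 + exp(-n));
xa = [0; 1];        % input value 0, followed by the constant 1 placeholder
t  = 0.9;           % target output
w  = [0 0];         % [input weight, bias weight], both starting at zero
lr = 1;             % learning rate (arbitrary)

for epoch = 1:2000
    y   = logistic(w * xa);
    del = y * (1 - y) * (t - y);     % delta for a logistic output
    w   = w + lr * del * xa';        % bias update is lr * del * 1
end
y = logistic(w * xa);
disp(w)             % input weight stays 0; bias weight approaches log(9)
disp(y)             % output approaches the target 0.9
```

The input weight never moves because its update is multiplied by the input 0, while the bias weight shifts the activation exactly as a separate bias term would.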

