Bias problem in my artificial neural network

Hello everyone.

I wrote my own artificial neural network class, and for some reason,
when I add a bias neuron to a middle layer where the hidden neurons are, it doesn't work. It doesn't find the right solution.

I have debugged it thoroughly and everything seems fine.
Here's the model I used to write the code:
http://www.upload.ee/image/5591007/1.png

Perhaps I made a mistake somewhere that I can't see?

#include <iostream>
#include <vector>
#include <cstdarg>
#include <tchar.h>   // for _tmain / _TCHAR (MSVC)
#include "brain.h"   // nn::brain, nn::coord, nn::nt (linked below)
using namespace std;

class ti
{
public:
	vector<double> in;
	ti(int n, ...)
	{
		va_list vl;
		va_start(vl, n);
		for (int i = 0; i<n; i++)
		{
			in.push_back(va_arg(vl, double));
		}
		va_end(vl);
	}
};

class tr
{
public:
	vector<double> results;
	tr(int n, ...)
	{
		va_list vl;
		va_start(vl, n);
		for (int i = 0; i<n; i++)
		{
			results.push_back(va_arg(vl, double));
		}
		va_end(vl);
	}
};

class learn
{
public:
	vector<ti> in;
	vector<tr> results;
	learn(int n, ...)
	{
		va_list vl;
		va_start(vl, n);
		for (int i = 0; i<n; i++)
		{
			// NOTE: passing non-trivial class types (ti, tr) through
			// varargs is undefined behavior; it may happen to work on
			// some compilers, but it is not portable.
			in.push_back(va_arg(vl, ti));
			results.push_back(va_arg(vl, tr));
		}
		va_end(vl);
	}
};

int _tmain(int argc, _TCHAR* argv[])
{
	nn::brain b;
	vector<nn::coord> in;
	vector <nn::coord> out;
	in.push_back(b.AddNeuron(0, nn::nt::input));
	in.push_back(b.AddNeuron(0, nn::nt::input));
	b.AddNeuron(0, nn::nt::bias);
	b.AddNeuron(1, nn::nt::hidden, 3);
	// b.AddNeuron(1, nn::nt::bias); <--- this one is the middle one
	out.push_back(b.AddNeuron(2, nn::nt::output));
	b.connectlayers(0, 1);
	b.connectlayers(1, 2);
	learn l( 4, ti(2, 1.0, 1.0), tr(1, 0.0), ti(2, 1.0, 0.0), tr(1, 1.0), ti(2, 0.0, 1.0), tr(1, 1.0), ti(2, 0.0, 0.0), tr(1, 0.0));

	int iTry = 0;
	int a = 0;
	int skip = 0;
	while (1)
	{

		if (a >= l.in.size()) a = 0;

		if (skip <= 0)
		{
			cout << "Try:" << iTry << endl;
			cout << "value:";
			for (int i = 0; i < l.in[a].in.size(); i++)
			{
				b.setX(in[i], l.in[a].in[i]);
				cout << l.in[a].in[i] << " ";
			}
			cout << endl;

			cout << "target:";
			for (int i = 0; i < l.results[a].results.size(); i++)
			{
				b.setX(out[i], l.results[a].results[i]);
				cout << l.results[a].results[i] << " ";
			}
			cout << endl;
			b.ActivateX();

			cout << "results:";
			for (int i = 0; i < out.size(); i++) {
				cout << b.getX(out[i]) << " ";
			}
			cout << endl;
			b.ActivateY();
			// b.PrintFull();
			system("pause");
			cout << endl << endl << endl;
		}
		else
		{
			for (int i = 0; i < l.in[a].in.size(); i++){
				b.setX(in[i], l.in[a].in[i]);
			}

			for (int i = 0; i < l.results[a].results.size(); i++) {
				b.setX(out[i], l.results[a].results[i]);
			}
			b.ActivateX();
			b.ActivateY();
		}

		iTry++;
		a++;
		skip--;
	}

	system("pause");
	return 0;
}


brain.h:
http://pastebin.com/fKCe9UfJ

Apologies for my code style.
It seems horrible, and if anyone has a better idea of how to structure the class,
please let me know.

Thanks!
¿have you considered inheritance and polymorphism?
However, I don't understand why you need so many kinds of neurons. There is no conceptual difference between a neuron on the input layer, a hidden layer, or the output layer.

You may benefit from creating a network class and a layer class.


About your code, please make sure to provide everything that we may need to reproduce your issue. That includes example input and output.
error: `learn' was not declared.
Oh god I forgot those Learn classes.

In my code, the difference between the hidden and output layers is the way the gradients are calculated. Also, the output layer doesn't have outgoing connections, or as I called them in my code, 'axons'.

Each type of neuron needs different variables.
Output needs:
vector<axon> axons;
double target;
double out;
double gradient;
double WeightSum;

Note: WeightSum is needed because of the structure of my class and the way an output is calculated. Since each layer may not be fully connected, I felt the need to include that variable,
because I couldn't find a better way to calculate the output.

Input needs:
vector<axon> axons;
double input;


I am assuming that I could save memory by making different data types for different neurons.
The best way I could come up with is:
struct dataone{
	double a;
	double b;
};

struct datatwo{
	double a;
};

class myclass
{
public:
	void *datax;
	myclass( int datatype )
	{
		switch(datatype)
		{
		case 1:
			datax = new dataone();
			break;
		case 2:
			datax = new datatwo();
			break;
		}
	}
};


I knew of the existence of inheritance and polymorphism, but I have never used them,
and I don't really know how to apply them so that my data situation would be better.
Based on these examples:
http://www.tutorialspoint.com/cplusplus/cpp_inheritance.htm
http://www.cplusplus.com/doc/tutorial/polymorphism/

In my neural network class,
one neuron can be connected to a neuron that is more than one layer away.
It only makes sense when the first neuron is connected to a neuron in a higher layer.

I want to experiment with a few things.
I invented a memory neuron that works as follows:
Layer 0: input, memory1<target:input>, memory2<target:memory1>

The first feedforward, with the input being 1, would set their values as follows:
layer 0: input<1>, memory1<1>, memory2<0>

A second feedforward with the input being 2:
layer 0: input<2>, memory1<2>, memory2<1>

The first memory neuron always just copies the value.
The reason the first memory neuron copies the current value instead of holding the old one is that I couldn't work out the code to make that work.


About the bias being in the middle layer:
I think every layer should be able to have a bias without the network going crazy like mine does,
because I previously had a network where all layers were fully connected, with each layer having one bias
(except the last layer, where it wouldn't make sense).

I debugged the gradients and everything, including the way the weight is calculated for the last bias, but
I think there might be something wrong with my model here:
http://www.upload.ee/image/5591007/1.png

When the last bias weight is calculated, the variables have the values they should have.
delta = 1.0 * eta * gradient + alpha * delta;
weight += delta;

The gradient is the output neuron's; I tested and printed out the values, and they were as they should be.
I think the output gradient is calculated the right way as well.
I'm lost; everything seems right in my code.

Removing the last bias makes everything work fine again, but that bias needs to be there: what if I run into a problem where the hidden layer needs a bias?

Also, I will be using something that will pretty much randomly create neurons and layers; it would remove neurons or layers that don't seem to belong, or, if something else is needed, add a neuron of a random type and see if that made things better.
class neuron{
   //whatever variables you think there are commons to everyone
public:
   virtual ~neuron() = default; //important if you intend to use polymorphism
   //operations
   void connect(neuron *with);
   virtual void Activate{W,X,Y}(); //behaviour to be changed by derived classes
   void fireX(double signal){ //this code is common and derived classes will not change it, it is not virtual
      this->xsum += signal;
   }
   //...
};

class Bias: public neuron{
  //whatever variables that it may need and that are not in the parent
public:
   virtual void Activate{W,X,Y}() override{
      //its own behaviour
      for(int K=0; K<this->axons.size(); ++K)
         this->axons[K].delta = 
            this->GetX() * eta * this->axons[K].x->GetY()
            + this->axons[K].delta * alpha;
   }
};

int main(){
   neuron *foo = new Bias(/**/);
   neuron *bar = new Memory(/**/);

   foo->ActivateW(); //calls Bias::ActivateW()
   bar->ActivateW(); //calls Memory::ActivateW()
}
That would make your switches unnecessary, organize things better, and avoid all those casts.

Your code as it is now is too hard to follow.


¿what are X, Y for a neuron?
¿what is the problem that your network should solve?
¿where are the training and test sets?
¿what should happen with your code? (as it is now it never ends; ¿how can you tell if the network is working?)
let's go back a little.

> when i add bias neuron to a middle layer
my understanding was that the bias is an extra weight associated with a constant input of 1.
So the output of the neuron, instead of simply Σ_j w_j in_j, would be Σ_j w_j in_j + bias.

If that's the case you don't need a whole network to test, just one neuron.
I printed out the values, and each feedforward cycle the console asks me to press Enter to continue, unless I change the skip variable's value.

The problem was (XOR):
1,0 = 1
0,1 = 1
0,0 = 0
1,1 = 0

About x, y for a neuron:
it's uncertain for now.

I will rewrite my whole code from scratch, make it more understandable, and add comments as well.
But for now, a neuron has double x and double y;
x stood for the output and y for the gradient.


I have a few questions about the code you posted.
I also just found the example of that here:
http://www.cplusplus.com/doc/tutorial/polymorphism/

I think I hadn't scrolled down before because I thought those were comments from other people.
About your code.

 
virtual void Activate{W,X,Y}() override{


What are W, X, Y, and what does override stand for?

After I copy-paste the code into Visual Studio, it says error 'virtual not allowed',
and for that {W,X,Y} part it says 'expected a ;'.

I use visual studio 2015
that's just an abbreviation; the functions are
ActivateW(), ActivateX(), ActivateY()

`override' means that there is a virtual function in the parent class whose behaviour you are going to change. It's from the new standard and not strictly needed, but it may make the code clearer.

The snip was just to show how it may be used, but it wasn't a working example.


> x stood for the output and y for the gradient.
Use meaningful variable names.
You could name them `output' and `gradient'.
> use meaningful variable names.
> You could name them `output' and `gradient'.
Will do.

I have one more question.
Does a class that uses polymorphism take more memory than a normal class?
Doesn't it need to store its type somewhere?
I'm not familiar with implementation details.
AFAIK, there is a virtual table for each class that contains the virtual member functions, and each object points to it.
So yes, there is overhead: one pointer per object (it doesn't matter how many member functions you have or how deep in the inheritance hierarchy you are).
I recoded everything, yet that bias problem slipped in again somehow.
It is now rather obvious that my understanding of the neural network model is wrong.

Main code:
int _tmain(int argc, _TCHAR* argv[])
{
	// classic XOR problem
	// 0, 0 = 0
	// 1, 1 = 0
	// 0, 1 = 1
	// 1, 0 = 1
	nn::TrainingData trainingdata(	4,
		nn::input(2, 0.0, 0.0), nn::target(1, 0.0),
		nn::input(2, 1.0, 1.0), nn::target(1, 0.0),
		nn::input(2, 0.0, 1.0), nn::target(1, 1.0),
		nn::input(2, 1.0, 0.0), nn::target(1, 1.0)
	);

	nn::brain NeuralNetwork;

	// vector of input neurons
	vector<nn::neuron *> input;

	// vector of output neurons
	vector<nn::neuron *> output;

	// creating neurons
	input.push_back(NeuralNetwork.AddNeuron(0, nn::nt::input));
	input.push_back(NeuralNetwork.AddNeuron(0, nn::nt::input));
	NeuralNetwork.AddNeuron(1, nn::nt::hidden, 3); // adding 3 hidden neurons into layer index:1
	output.push_back(NeuralNetwork.AddNeuron(2, nn::nt::output));
	NeuralNetwork.AddNeuron(0, nn::nt::bias, 1);

	// My neural network doesn't seem to solve the XOR problem if I add a bias to layer index:1.
	// Fully connected neural networks usually have a bias neuron in each layer, but mine just doesn't work.
	// I tried creating a fully connected neural network with my class, yet it seems to fail.
	// If I comment out the next line, the neural network solves the problem fine.
	NeuralNetwork.AddNeuron(1, nn::nt::bias, 1);


	// connecting neurons by connecting layers
	NeuralNetwork.ConnectLayers(0, 1);
	NeuralNetwork.ConnectLayers(1, 2);

	int SkipPrint = 10000; // Number of iterations to run without printing values and pausing the loop


	int iterations = 0;
	int DataId = 0;
	while (1)
	{
		if (SkipPrint <= 0)
		{
			DataId = iterations % trainingdata.DataCount();
			cout << "Try:" << iterations << endl;

			cout << "Input:";
			for (int a = 0; a < trainingdata.InputCount(); a++) {
				cout << trainingdata.GetInput(DataId, a) << "  ";
				input[a]->setOutput(trainingdata.GetInput(DataId, a));
			}
			cout << endl;

			cout << "Target:";
			for (int a = 0; a < trainingdata.TargetCount(); a++) {
				cout << trainingdata.GetTarget(DataId, a) << "  ";
				output[a]->SetTarget(trainingdata.GetTarget(DataId, a));
			}
			cout << endl;

			NeuralNetwork.FeedForward();

			cout << "Results:";
			for (int a = 0; a < output.size(); a++) {
				cout << output[a]->GetOutput() << "  ";
			}
			cout << endl;

			NeuralNetwork.BackPropagation();
			system("pause"); // pause the loop until enter is pressed
			cout << endl << endl << endl;
		}
		else
		{
			DataId = iterations % trainingdata.DataCount();
			for (int a = 0; a < trainingdata.InputCount(); a++) {
				input[a]->setOutput(trainingdata.GetInput(DataId, a));
			}

			for (int a = 0; a < trainingdata.TargetCount(); a++) {
				output[a]->SetTarget(trainingdata.GetTarget(DataId, a));
			}

			NeuralNetwork.FeedForward();
			NeuralNetwork.BackPropagation();
			SkipPrint--;
		}
		iterations++;
	}

	system("pause");
	return 0;
}


My neural network class's code still doesn't fit inside the post, I'm sorry.
Here's a link to the code: http://pastebin.com/nuAU34bb

Hopefully my code is now much more readable and not as painful to look at.
I have looked at formulas and code online hundreds of times; I just can't see what I'm doing wrong.

Edit:

Question about the bias neuron:
must its weight change?

What if I drop the bias neuron and instead do:
        void Activate() {
            this->setOutput(tanh(this->sum + 1.0)); // adding bias value here
            this->sum = 0;
            for (int a = this->HaveMemoryNeuron() == true ? 1 : 0; a < this->axons.size(); a++) {
                this->axons[a]->target->Fire(this->output * this->axons[a]->GetWeight());
            }
        }


Would that be a correct way to do it?

I could drop bias neurons because they don't need a weight or a gradient, and they don't need an output variable, because it is always 1.0.
> Question about the bias neuron:
> must its weight change?
Suppose a perceptron. Its output is the dot product of the input and the weights, Σ_j w_j x_j.
You take the sign of that output to make a classification, so the classification limit is Σ_j w_j x_j = 0.

Now, let's say that you have two inputs: (x,y). The limit then becomes ax + by = 0
That's the equation of a straight line that passes through the origin.


Then, you add the bias. Your input can be considered as (x, y, 1) and the limit is now ax + by + c = 0, which is the general equation of a straight line.
You are no longer limited to the origin; now the `c' parameter (the bias) determines where the line cuts the axes.

If your problem can be solved without a bias, then when you do use one, you should expect it to approach 0.
If that doesn't happen, there may be a problem with the training, or you've gotten stuck in a local minimum.
Here's a diagram of my new neural network.
Instead of just showing complicated code that is hard to read, I decided to make a diagram
and put only a little code inside of it.
http://i.stack.imgur.com/h08LH.jpg

Is everything correct there?
Your input neuron seems useless and I don't see a difference between hidden and output.

The back-propagation looks correct; the bias and the weights are updated in the same way.

I don't remember Nguyen-Widrow, so I won't comment on that.

About your structure: for the XOR problem, 2 hidden neurons should suffice: http://images.slideplayer.com/9/2412423/slides/slide_8.jpg
As you can see in the diagram, the hidden neurons create the lines
Σ_j w_j x_j + bias = 0, and the output layer just selects a region. Use that knowledge to debug the training: look at how the lines get updated and whether they converge.
You may also remove the momentum term; simplify as much as possible.


PS: ¿how did you create your graphics? If it was with LaTeX, I would like to have the source code.
> Your input neuron seems useless

I agree, but in the diagram I drew them that way on purpose, so one could imagine what it would look like if the input neurons were hidden ones.

I do still need input neurons in my code, because each one holds the vector of connections and the input value, i.e. the value that goes into the hidden neurons.
The structure of my neural network requires that neuron1's outgoing connections
be stored inside the neuron1 object.

> I don't see a difference between hidden and output
There is no difference between hidden and output neurons if we are looking at the model; however,
while building the network in C++, I found that output and hidden neurons need different variables.

The output neuron has the target value stored in it.
Of course, I could replace this variable by calculating the error right away, when the target is known.
Then there would be no need to hold a target variable.

A second difference is that a hidden neuron has an outgoing connection vector, but an output neuron doesn't have one.

> About your structure, for the XOR problem 2 hidden neurons should suffice
I know that the XOR problem doesn't need 3 hidden neurons; however, without a bias neuron, we need 3 for it to work.

I was testing with/without the bias so many times that I forgot about it.

> PS: ¿how did you create your graphics? If it is with latex, I would like to have the source code.

I used a program called Diagram Designer 1.28 (2015) by MeeSoft.
Just in case, here's the program's project file:
http://www.upload.ee/files/5636380/my_vision_of_ann.ddd.html