Hello! I apologize for not posting part 1 sooner. I know I went on about free time in the comment section, but unfortunately some personal events prevented me from posting earlier. Also, this is gonna be approached differently than I said it would be in part 0; I feel this will be more productive than going on immediately to the other problem. Anyways, enjoy!
A good way to think about these systems is as circuits whose values are real numbers rather than booleans. The network is like a logic circuit built from gates such as OR and AND, except that its units compute functions over the real line instead. Why do I bring this up? Well, I saw these systems represented this way elsewhere, and I thought it was a well-suited way to think about them. There are multiple kinds of neural networks, such as feed-forward networks, and different ways of training them, such as backpropagation. As we remember, a perceptron is a model of a single neuron with some number of inputs, each with its own weight. The sum of the products of the weights and inputs is then normalized with an activation function, typically the sigmoid function: one divided by the sum of one and e raised to the negated input ( 1/(1+e^-x) ).
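To make that concrete, here's a minimal sketch of a single perceptron in Python (my own illustration, with made-up weights, not code from this series):

import math

def sigmoid(x):
    # 1 / (1 + e^-x): squashes any real number into the range (0, 1)
    return 1.0 / (1.0 + math.exp(-x))

def perceptron(inputs, weights):
    # Weighted sum of the inputs, normalized by the activation function
    total = sum(i * w for i, w in zip(inputs, weights))
    return sigmoid(total)

# Example: two inputs with arbitrary example weights
print(perceptron([1, 0], [0.8, 0.3]))  # about 0.69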
Also, one thing we must know about the sigmoid activation function is that a return value like 0.999 may as well be considered 1, since the function only asymptotically approaches 1. Large positive inputs are mapped to values near 1, and large negative inputs to values near 0. We use the sigmoid function because it is continuous and differentiable everywhere, so we can do calculus on it. There are other normalizing functions, or activation functions, out there as well, such as the step functions.
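To see that asymptotic behavior for yourself (a quick illustration of my own):

import math

for x in (10, 0, -10):
    print(x, 1.0 / (1.0 + math.exp(-x)))
# 10 gives ~0.99995 (effectively 1), 0 gives exactly 0.5, -10 gives ~0.000045 (effectively 0)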
Let’s look at a quick perceptron to solve OR.
http://pastebin.com/u7nyrufC
Here, we have a simple single-neuron network to do the OR function. Notice that the False case comes out of the sigmoid as 0.5 (its weighted sum is 0), while the True cases come out larger than that. Since we know this, we can just make the final program give an answer of True or False based on the numerical reply. As long as the results from the network are consistent, we can interpret them as we see fit. Now, in most applications, neural networks with multiple layers of neurons are used, not just single neurons.
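(The pastebin has the actual code; in case it's unavailable, here's a rough sketch of the same idea, with stand-in weights that are not the ones from the linked program.)

import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

weights = [3.0, 3.0]  # stand-in weights a trained OR perceptron might end up with

def neuron(inputs):
    # Single neuron: weighted sum, then sigmoid
    return sigmoid(sum(i * w for i, w in zip(inputs, weights)))

for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    out = neuron([a, b])
    # 0.5 means False; anything above it we read as True
    print(a, b, round(out, 3), out > 0.5)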
There are multiple styles of training… The kind we are currently focused on, supervised learning, is all about giving the network the expected outcome along with the inputs. It's kind of like using flash cards to study: you get the input, you decide on your answer, then you flip the card, which tells you how wrong you are.

So, let's use this kind of learning to train an AND neural network. Remember the calculations we talked about before? We'll be using those. Keep in mind, though, that we don't always need to use the sigmoid as the activation function; for this example we will actually use a step function, which takes any input and returns either 1 or 0 based on whether or not the input is larger than a certain threshold t. This time we'll also use a different weight-change calculation, along with a "learning rate", a value that scales how much we adjust the weights by, and therefore how fast the neural network learns. Perhaps this method will be more effective than the one we previously went over? Maybe less? Anyways, we'll be calculating each weight change as learning_rate*(target-output)*input. For the initial weights, we will generate some random real numbers between 0 and 1. So here's a simple example of training the AND net.
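(Here's a sketch of what that training loop could look like in Python. This is my own minimal version under the rules above, with a threshold of 1 and a learning rate of 0.1 picked just for illustration, not the exact program from the post.)

import random

def step(x, t=1.0):
    # Step activation: fire (1) only if the weighted sum reaches the threshold t
    return 1 if x >= t else 0

# Training data for AND: inputs paired with their target outputs
examples = [([0, 0], 0), ([0, 1], 0), ([1, 0], 0), ([1, 1], 1)]

weights = [random.random(), random.random()]  # random starting weights between 0 and 1
learning_rate = 0.1

for epoch in range(100):
    errors = 0
    for inputs, target in examples:
        output = step(sum(i * w for i, w in zip(inputs, weights)))
        if output != target:
            errors += 1
            # Weight change: learning_rate * (target - output) * input
            for j in range(len(weights)):
                weights[j] += learning_rate * (target - output) * inputs[j]
    if errors == 0:
        break  # every example is classified correctly

print("final weights:", weights)
for inputs, target in examples:
    print(inputs, "->", step(sum(i * w for i, w in zip(inputs, weights))))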
Works nicely. As you can see, the step function is useful for these kinds of binary functions. Now, when training a neural network for NOR with the step function, we run into a problem: NOR should output 1 for the input (0, 0), but for any two weights, (0)*w1 + (0)*w2 >= 1 is never a true statement, so the neuron can never fire on that input. We fix this by adding a bias input. What is a bias input? It is simply an input with a constant value. We will see this more in part 2.
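(Just as a quick preview, and this is my own illustration rather than what part 2 will use: give the neuron a third input that is always 1, with its own weight, and it has something it can fire on even when both real inputs are 0. The weights below are hand-picked, not trained.)

def step(x, t=1.0):
    return 1 if x >= t else 0

weights = [-1.0, -1.0, 1.0]  # w1, w2, and the weight on the bias input

for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    inputs = [a, b, 1]  # the bias input is always 1
    print(a, b, "->", step(sum(i * w for i, w in zip(inputs, weights))))
# (0, 0) now fires (NOR is True there); every other input pulls the sum below the threshold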
I know this one is a bit short, but I'd rather put it out now than keep waiting. Like I said, I've been a little busier than expected :P. In future parts, we're gonna cover other algorithms, like SVMs and whatnot, and go on from there. I hope you come to find this series interesting or enjoyable ;).