Have you ever wondered how neural networks work? Today, let's uncover the mystery of neural networks and build a simple one from scratch using NumPy, a powerful mathematical library. Don't worry, I'll use easy-to-understand language and guide you step by step through this wonderful world. Are you ready? Let's begin this exciting programming journey!
Preparation
First, we need to import the necessary library. Here we only need NumPy:
import numpy as np
NumPy is the fundamental library for scientific computing in Python. It provides high-performance multidimensional array objects and tools for working with these arrays. In building neural networks, NumPy's array operations and mathematical functions will play a crucial role.
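To get a feel for what NumPy gives us, here's a tiny self-contained taste of the two operations we'll lean on most: matrix multiplication with np.dot, and broadcasting (adding a row vector to every row of a matrix). The arrays here are made up purely for illustration:

import numpy as np

A = np.array([[1.0, 2.0], [3.0, 4.0]])   # a 2x2 matrix
v = np.array([[10.0, 20.0]])             # a 1x2 row vector

print(np.dot(A, A))   # matrix multiplication: [[7, 10], [15, 22]]
print(A + v)          # broadcasting adds v to every row: [[11, 22], [13, 24]]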
Data Preparation
Before we start building the neural network, we need to prepare some data. For simplicity, we'll create a simple classification problem: predicting whether a person will buy a certain product based on their age and estimated income.
X_raw = np.array([[22, 50000], [25, 60000], [35, 70000], [45, 90000], [55, 110000]])
y = np.array([[0], [0], [1], [1], [1]])
X_mean, X_std = X_raw.mean(axis=0), X_raw.std(axis=0)
X = (X_raw - X_mean) / X_std  # standardize each feature: raw values like 50000 would saturate the sigmoid
Here, X is our input data, where each row represents a person with two features: age and estimated income. y is our target output, where 0 means not buying and 1 means buying. Note the standardization step at the end: age and income live on very different scales, and feeding raw values like 50000 into a sigmoid would saturate it instantly (its gradient would be essentially zero), so we rescale each feature to zero mean and unit variance before training.
You might ask, "Why are we using this kind of data?" Well, imagine you're a marketer: you might find that older people with higher incomes are more likely to buy certain products. Our neural network is going to learn exactly this pattern!
Neural Network Architecture
Now, let's design our neural network. We'll create a simple feedforward neural network with an input layer, a hidden layer, and an output layer.
class NeuralNetwork:
    def __init__(self, input_size, hidden_size, output_size):
        # Weights are drawn from a standard normal distribution; biases start at zero
        self.W1 = np.random.randn(input_size, hidden_size)
        self.b1 = np.zeros((1, hidden_size))
        self.W2 = np.random.randn(hidden_size, output_size)
        self.b2 = np.zeros((1, output_size))
Here, W1 and W2 are our weight matrices, and b1 and b2 are bias terms. We initialize the weights with random values, which is common practice. Why? Because if we initialized all the weights to the same value, every neuron in a layer would compute the same output and receive the same gradient update, so the network could never learn different features!
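If that claim feels abstract, here's a tiny standalone sketch of the symmetry problem (the numbers are made up purely for illustration):

x = np.array([[1.0, 2.0]])
W_same = np.ones((2, 2))        # two hidden units with identical weights
W_rand = np.random.randn(2, 2)  # random initialization breaks the symmetry

print(x @ W_same)  # both units produce the same output -- their gradients would match too
print(x @ W_rand)  # the outputs differ, so each unit can specialize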
Activation Function
Neural networks need non-linear activation functions to learn complex patterns. We'll use the sigmoid function as our activation function:
def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def sigmoid_derivative(x):
    # Note: this expects x to already be a sigmoid output, i.e. x = sigmoid(z),
    # since sigmoid'(z) = sigmoid(z) * (1 - sigmoid(z))
    return x * (1 - x)
The sigmoid function maps any real number to a value between 0 and 1, making it very suitable for binary classification problems. You can think of it as a "compressor" that squeezes all inputs between 0 and 1.
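A quick sanity check of that intuition:

print(sigmoid(np.array([-10.0, 0.0, 10.0])))
# -> approximately [0.0000454, 0.5, 0.9999546]: very negative inputs land near 0,
#    zero lands exactly on 0.5, and very positive inputs land near 1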
Forward Propagation
Forward propagation is the process by which the neural network processes input data. Let's implement it:
# (a method of the NeuralNetwork class)
def forward(self, X):
    self.z1 = np.dot(X, self.W1) + self.b1  # linear step into the hidden layer
    self.a1 = sigmoid(self.z1)              # hidden-layer activations
    self.z2 = np.dot(self.a1, self.W2) + self.b2
    self.a2 = sigmoid(self.z2)              # final output, a value in (0, 1)
    return self.a2
This process is like information flowing through the neural network. Input data is multiplied by weights, added to biases, and then passed through the activation function. This process repeats at each layer until we get the final output.
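If the matrix shapes feel abstract, here's a small standalone sketch tracing them with dummy arrays (the sizes match the network we build below):

X_demo = np.random.randn(5, 2)   # 5 samples, 2 features
W1_demo = np.random.randn(2, 4)  # input -> hidden
W2_demo = np.random.randn(4, 1)  # hidden -> output

h = sigmoid(X_demo @ W1_demo)    # (5, 2) @ (2, 4) -> (5, 4)
out = sigmoid(h @ W2_demo)       # (5, 4) @ (4, 1) -> (5, 1)
print(h.shape, out.shape)        # (5, 4) (5, 1)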
Backward Propagation
Backward propagation is the core of neural network learning. It calculates the gradients of the loss function with respect to the network parameters and uses these gradients to update the parameters:
# (also a method of the NeuralNetwork class)
def backward(self, X, y, output):
    self.dz2 = output - y  # error signal at the output layer
    self.dW2 = np.dot(self.a1.T, self.dz2)
    self.db2 = np.sum(self.dz2, axis=0, keepdims=True)
    # Push the error back through W2; a1 is already a sigmoid output,
    # which is exactly what sigmoid_derivative expects
    self.dz1 = np.dot(self.dz2, self.W2.T) * sigmoid_derivative(self.a1)
    self.dW1 = np.dot(X.T, self.dz1)
    self.db1 = np.sum(self.dz1, axis=0, keepdims=True)  # keepdims for consistency with db2
Backward propagation might look a bit complex, but its core idea is actually quite simple: we start from the output layer, calculate the error for each layer, and then use these errors to adjust the weights and biases. It's like the network saying, "Oh, I made a mistake. Let me adjust my parameters, and I'll do better next time!"
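Where does output - y come from? It is exactly the gradient of the (summed) binary cross-entropy loss with respect to z2 when the output unit is a sigmoid; that's the loss our network is implicitly minimizing, even though we never wrote it down. If you ever doubt gradient formulas like these, a classic sanity check is to compare them with a numerical finite-difference estimate. Here's a hedged sketch of that check, assuming you've assembled __init__, forward, and backward into the NeuralNetwork class:

# A sanity-check sketch (not part of the network itself): compare the analytic
# gradient from backward() with a numerical estimate. We check against the
# summed binary cross-entropy loss, since that is the loss whose gradient
# gives dz2 = output - y.
def bce_loss(nn, X, y):
    out = nn.forward(X)
    return -np.sum(y * np.log(out) + (1 - y) * np.log(1 - out))

nn_check = NeuralNetwork(2, 4, 1)
nn_check.backward(X, y, nn_check.forward(X))  # fills in nn_check.dW2

eps = 1e-5
nn_check.W2[0, 0] += eps
loss_plus = bce_loss(nn_check, X, y)
nn_check.W2[0, 0] -= 2 * eps
loss_minus = bce_loss(nn_check, X, y)
nn_check.W2[0, 0] += eps  # restore the original weight

print((loss_plus - loss_minus) / (2 * eps))  # numerical gradient
print(nn_check.dW2[0, 0])                    # analytic gradient -- should nearly match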
Training Process
Now, let's put all the pieces together and implement the training process:
# (the last method of the NeuralNetwork class)
def train(self, X, y, epochs, learning_rate):
    for _ in range(epochs):
        output = self.forward(X)
        self.backward(X, y, output)
        # Gradient descent: move each parameter a small step against its gradient
        self.W1 -= learning_rate * self.dW1
        self.b1 -= learning_rate * self.db1
        self.W2 -= learning_rate * self.dW2
        self.b2 -= learning_rate * self.db2
In each training epoch, we perform forward propagation, then backward propagation, and finally update the weights and biases. This process is repeated multiple times, with the network slightly adjusting its parameters each time, gradually improving its prediction accuracy.
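One thing train() doesn't do is tell you whether learning is actually happening. Here's a small optional sketch for watching progress; the mean squared error printed here is just a convenient number to monitor, not necessarily the loss the gradients are derived from:

nn_watch = NeuralNetwork(2, 4, 1)
for chunk in range(5):
    nn_watch.train(X, y, epochs=2000, learning_rate=0.1)
    mse = np.mean((nn_watch.forward(X) - y) ** 2)
    print(f"after {(chunk + 1) * 2000} epochs: MSE = {mse:.4f}")

If everything is wired up correctly, the printed error should shrink from one chunk to the next.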
Using Our Neural Network
Let's create a neural network instance and train it:
nn = NeuralNetwork(2, 4, 1)
nn.train(X, y, epochs=10000, learning_rate=0.1)

# New inputs must be scaled exactly like the training data
test_data = (np.array([[30, 65000]]) - X_mean) / X_std
prediction = nn.forward(test_data)
print(f"Prediction for age 30 and income 65000: {prediction[0][0]}")
Here, we create a neural network with 2 neurons in the input layer (corresponding to our two features), 4 neurons in the hidden layer, and 1 neuron in the output layer (because this is a binary classification problem).
We train the network for 10000 epochs with a learning rate of 0.1. Then we test it on a new data point, taking care to standardize it with the same mean and standard deviation we computed from the training data.
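The network outputs a score between 0 and 1 rather than a hard yes/no. A common convention (an arbitrary choice on our part; pick whatever cutoff suits your problem) is to threshold at 0.5:

will_buy = prediction[0][0] > 0.5
print("Will buy" if will_buy else "Won't buy")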
Summary
Congratulations! You've just built a simple neural network from scratch. We implemented forward propagation, backward propagation, and gradient descent using NumPy, which are all fundamental concepts in deep learning.
Although this example is simple, it demonstrates the core principles of neural networks. In real applications, we usually use more complex architectures and more advanced optimization techniques, but the basic ideas are the same.
How did you find this process? Was it simpler (or more complex) than you imagined? Remember, even the most complex neural networks are built upon these basic concepts. Keep exploring, keep learning, and you'll find that the world of neural networks is full of endless possibilities!
Next, you can try adjusting the network's parameters, such as changing the size of the hidden layer or the learning rate, and see what differences it makes. Or, you can try using this network to solve other simple problems. Each attempt will give you a deeper understanding of neural networks.
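For instance, here's a quick sketch of one such experiment, comparing a few hidden-layer sizes on our tiny dataset (the particular sizes chosen are arbitrary):

for hidden_size in (2, 4, 8, 16):
    nn_exp = NeuralNetwork(2, hidden_size, 1)
    nn_exp.train(X, y, epochs=10000, learning_rate=0.1)
    print(hidden_size, np.round(nn_exp.forward(X).ravel(), 3))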
Happy coding, and keep your curiosity alive!