For the past week or so I have been trying to work through Assignment 2 in Stanford's cs231n course on Neural Networks. The assignment focuses on the following aspects:
- Understanding Neural Networks and how they are arranged.
- Understanding and being able to implement backpropagation.
- Implementing various update rules used to optimize Neural Networks.
- Implementing Batch Normalization and Layer Normalization for training deep networks.
- Implementing Dropout to regularize networks.
- Understanding the architecture of Convolutional Neural Networks and getting practice with training these models on data.
- Gaining experience with a major deep learning framework, such as TensorFlow or PyTorch.
I covered the last part in the previous blog post. This time, I explored how the individual layers of a Neural Network are implemented when the network is built from scratch. In short, I explored how to implement:
- a forward pass for a fully-connected (affine) layer
- a backward pass for a fully-connected (affine) layer
- a forward pass for a ReLU activation function
- a backward pass for a ReLU activation function
A pass is what moves data through the layers of the network. The forward pass of a fully-connected layer corresponds to one matrix multiplication followed by a bias offset, with an activation function applied afterwards. The backward pass moves back through the network and uses the chain rule to compute the gradient of the loss with respect to the weights (and the inputs to each layer), which is what the update rules later use to change the weights. A rough numpy sketch of both passes is shown below.
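Here is a minimal sketch of how the two passes for a fully-connected layer might look in numpy. I am assuming the usual shape conventions from the assignment: a batch of inputs `x` that gets flattened to shape (N, D), weights `w` of shape (D, M), and a bias `b` of shape (M,). The `affine_forward`/`affine_backward` names follow the assignment skeleton, but the body is just my own sketch, not the official solution.

```python
import numpy as np

def affine_forward(x, w, b):
    """Forward pass for a fully-connected layer.

    Flattens each example into a row vector, then computes a matrix
    multiplication followed by a bias offset. Returns the output of
    shape (N, M) and a cache of inputs for the backward pass.
    """
    N = x.shape[0]
    x_flat = x.reshape(N, -1)
    out = x_flat.dot(w) + b
    cache = (x, w, b)
    return out, cache

def affine_backward(dout, cache):
    """Backward pass for a fully-connected layer.

    dout is the upstream gradient of shape (N, M). Returns the gradients
    of the loss with respect to x, w, and b.
    """
    x, w, b = cache
    N = x.shape[0]
    x_flat = x.reshape(N, -1)
    dx = dout.dot(w.T).reshape(x.shape)  # gradient flows back through the matmul
    dw = x_flat.T.dot(dout)              # summed over the batch dimension
    db = dout.sum(axis=0)                # bias gradient is a column sum
    return dx, dw, db
```

In the assignment, hand-written backward passes like this are compared against a numeric gradient, which is a good sanity check for catching shape or sign mistakes.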
It is interesting that the instructors chose the ReLU activation function, considering how many alternatives exist. That said, I have seen ReLU used a lot, and it seems to produce better results (and train faster) than sigmoid or tanh. On the other hand, it can be quite fragile: ReLU units can "die" during training if their weights are pushed to a point where they never activate again. Thankfully, implementing ReLU simply involves computing f(x) = max(0, x), which means the activation is simply thresholded at zero. A short sketch of both ReLU passes follows.
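Here is a matching numpy sketch for the ReLU passes, again my own take rather than the official solution: the forward pass thresholds at zero, and the backward pass only lets the upstream gradient through where the input was positive.

```python
import numpy as np

def relu_forward(x):
    """Forward pass for ReLU: threshold the activations at zero."""
    out = np.maximum(0, x)
    cache = x                 # keep the input around for the backward pass
    return out, cache

def relu_backward(dout, cache):
    """Backward pass for ReLU: gradient flows only where the input was positive."""
    x = cache
    dx = dout * (x > 0)       # zero out the upstream gradient where x <= 0
    return dx
```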
I am still working my way through the assignment and will continue making posts as I make more progress.