An artificial neural network is a group of interconnected neurons that send information from inputs to outputs.
Architecture
Multilayer perceptrons
Example
Demand prediction
Face recognition
- Input layer: the image, broken down into pixels. These pixel values make up the input vector $\vec{x}$
- In the first hidden layer, each neuron might look for short line segments and pass its result down to the next layer
- In the second hidden layer, each neuron learns to group results from the previous layer, and looks for higher-level features (activations) like eyes, noses (their shapes, positions, etc)
- Neurons in the last hidden layer may look at face shape, angle, position, etc
- Finally, the artificial neural network outputs the probability of the image being person X
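A minimal Keras sketch of the pipeline above: flattened pixels go through hidden layers that learn increasingly high-level features, and the output is a probability. The image size (100×100) and the layer sizes (25, 15) are illustrative assumptions, not values from these notes.

```python
import numpy as np
import tensorflow as tf

# pixels -> low-level features -> higher-level features -> P(person X)
model = tf.keras.Sequential([
    tf.keras.Input(shape=(10_000,)),                 # e.g. a 100x100 image flattened to 10,000 pixels (assumed size)
    tf.keras.layers.Dense(25, activation="relu"),    # 1st hidden layer: line segments
    tf.keras.layers.Dense(15, activation="relu"),    # 2nd hidden layer: eyes, noses, ...
    tf.keras.layers.Dense(1, activation="sigmoid"),  # output: probability the image is person X
])

x = np.random.rand(1, 10_000).astype("float32")      # one fake flattened image
print(model(x).numpy())                              # e.g. [[0.5...]]
```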
Types of Layers (for 1 training example)
(Notation follows the ML Specialization; see it for more examples.)
Dense Layer
The neuron $j$ in layer $l$ computes its activation as
$$a^{[l]}_j = g\left(\vec{w}^{[l]}_j \cdot \vec{a}^{[l-1]} + b^{[l]}_j\right)$$
For layer $l = 1, \dots, L$ and for neuron $j = 1, \dots, n^{[l]}$:
- The 2nd neuron in layer 1 is denoted $a^{[1]}_2$
- $n^{[l]}$: number of neurons in the current layer $l$
- $a^{[l]}_j$: activation value of neuron $j$ in layer $l$
- $\vec{a}^{[l]}$: a vector of activation values for layer $l$
- $W^{[l]}$: matrix of weights (strength given to each activation value from the previous layer)
- $\vec{b}^{[l]}$: vector of biases, deciding when the weighted sum activates meaningfully. But actually, each bias is a scalar-valued term (one per neuron); the scalars are stacked to match the dimension of the weighted sum $W^{[l]}\vec{a}^{[l-1]}$
- $g^{[l]}$: activation function for layer $l$ (could be any nonlinear function, but some popular ones are ReLU, sigmoid, tanh)
Work out the dimensions of the dot product for one training example: $W^{[l]}$ is $n^{[l]} \times n^{[l-1]}$ and $\vec{a}^{[l-1]}$ has $n^{[l-1]}$ entries, so $\vec{z}^{[l]} = W^{[l]}\vec{a}^{[l-1]} + \vec{b}^{[l]}$ has $n^{[l]}$ entries.
$\vec{z}^{[l]}$ is the weighted sum of the inputs $\vec{a}^{[l-1]}$, before passing through an activation function
Credit: 3Blue1Brown
It is called a dense layer because the input features are now densely connected into the neurons: every neuron receives every activation from the previous layer, so we can no longer single out the contribution of any one original neuron.
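As a sketch of the formulas above, here is one dense layer for a single training example in plain numpy. The layer sizes (4 inputs, 3 neurons) and the choice of ReLU are arbitrary assumptions for illustration.

```python
import numpy as np

def relu(z):
    return np.maximum(0, z)

def dense(a_prev, W, b, g=relu):
    """One dense layer: a_prev has n_prev entries, W is (n, n_prev), b has n entries."""
    z = W @ a_prev + b          # weighted sum of the inputs, before the activation
    return g(z)                 # a = g(z), one activation value per neuron

a0 = np.array([0.5, -1.2, 3.0, 0.7])   # previous layer's activations (n_prev = 4)
W1 = np.random.randn(3, 4) * 0.1       # layer 1: 3 neurons, each connected to all 4 inputs
b1 = np.zeros(3)                        # one scalar bias per neuron
a1 = dense(a0, W1, b1)
print(a1.shape)                          # (3,) -- one activation per neuron in layer 1
```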
Convolutional Layer
Each neuron looks at only PART of the inputs from the previous layer (see the sketch after the list below). See convolutional neural network
Why?
- More efficient computation
- Less prone to overfitting the training set
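A hand-rolled 1-D convolution sketch (an assumption for illustration, not code from the source) showing that each output depends on only a small window of the previous layer's outputs and that the same few weights are reused at every position, which is why it is cheaper to compute and less prone to overfitting.

```python
import numpy as np

def conv1d(a_prev, w, b, g=lambda z: np.maximum(0, z)):
    """Slide a small filter w over a_prev; each output depends on only len(w) inputs."""
    k = len(w)
    return np.array([g(np.dot(a_prev[i:i + k], w) + b)
                     for i in range(len(a_prev) - k + 1)])

a_prev = np.array([0.0, 1.0, 2.0, 3.0, 2.0, 1.0, 0.0])
w = np.array([-1.0, 0.0, 1.0])    # only 3 shared weights, instead of one weight per input
print(conv1d(a_prev, w, b=0.0))   # 5 outputs, each computed from a 3-element window
```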
Recurrent Layer
Usage
- Works well on all types of data, including structured and unstructured data (although there are faster learning algorithms for structured data)
- Works with transfer learning
- Easier to string together multiple neural networks, so we can train them all at the same time using gradient descent
Types of Neural Network
Based on number of hidden layers
- shallow neural network: very few hidden layers (usually < 3). A network with no hidden layers (inputs connected directly to the output) is the single-layer perceptron
- deep neural network: an important building block of deep learning. Deep networks can learn more diverse and complex patterns from data.
Based on structure
- feedforward neural network - structured / tabular data
- convolutional neural network - image (mostly), text, audio
- recurrent neural network - sequential data such as text, time series
- graph neural network - graphs
Process
For reference, see the machine learning development process
- Get data
- Build a neural network
- Types of layers
- Number of neurons in different layers
- Activation function
- Train a neural network
- Specify loss and, subsequently, cost function
- Select optimization techniques, such as the Adam optimizer (see the sketch after this list)
- Fit the model with training data
- Minimize the average cost across the data using gradient descent, with backpropagation computing the gradients: we take small steps in the direction of steepest descent
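A minimal Keras sketch of this process: build a network, specify the loss (and hence the cost, the average loss over the data), pick the Adam optimizer, and fit. The architecture, loss choice, learning rate, and toy data are placeholder assumptions.

```python
import numpy as np
import tensorflow as tf

# 1-2. Get data and build a neural network (layer types, number of neurons, activations)
X = np.random.rand(200, 10).astype("float32")               # toy data (assumed shapes)
y = np.random.randint(0, 2, size=(200, 1)).astype("float32")

model = tf.keras.Sequential([
    tf.keras.Input(shape=(10,)),
    tf.keras.layers.Dense(16, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])

# 3. Train: specify the loss (cost = average loss) and select an optimizer
model.compile(loss="binary_crossentropy",
              optimizer=tf.keras.optimizers.Adam(learning_rate=0.01))

# Fit: backpropagation computes gradients, and each update takes a small step
# in the direction of steepest descent of the cost
model.fit(X, y, epochs=5, verbose=0)
print(model.evaluate(X, y, verbose=0))   # final average cost on the toy training data
```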