In an artificial neural network, an activation function is the function applied to a neuron's weighted input (the linear combination of the previous layer's outputs, plus a bias) to produce that neuron's output, which is then passed to the next layer.
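A minimal sketch of this, using ReLU as the activation (the weights and inputs below are made-up illustrative values):

```python
import numpy as np

def relu(z):
    # ReLU activation: max(0, z), applied element-wise
    return np.maximum(0.0, z)

x = np.array([1.0, -2.0, 0.5])        # outputs from the previous layer
W = np.array([[0.2, -0.5, 1.0],
              [0.7,  0.1, -0.3]])     # weights: 2 neurons, 3 inputs each
b = np.array([0.1, -0.2])             # biases

z = W @ x + b     # pre-activation: the linear part
a = relu(z)       # activation maps z to the layer's output
print(a)          # this becomes the input to the next layer
```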

Why do we need non-linear activation functions?

  • introduce non-linearity into the model, which allows neural networks to approximate complex, non-linear relationships in data. Without non-linear activation functions, a neural network can model only linear transformations, making it ineffective on a wide range of real-world problems. Note that a network with a linear activation function reduces to simple linear regression.
  • enable gradient-based learning via backpropagation: activation functions are differentiable (almost everywhere, in ReLU's case), so gradients can be propagated through every layer during training.
  • some non-linear activation functions, like ReLU, can help create sparse representations, since neurons with negative pre-activations output exactly zero.
  • act as an implicit constraint on the network's outputs (e.g., saturating activations bound each neuron's output), which can help reduce overfitting.
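The first two bullets can be demonstrated directly: with no activation between layers, stacking two linear layers is exactly equivalent to a single linear layer with combined weights (the dimensions below are arbitrary illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=3)

# Two "layers" with no activation function between them
W1, b1 = rng.normal(size=(4, 3)), rng.normal(size=4)
W2, b2 = rng.normal(size=(2, 4)), rng.normal(size=2)
deep = W2 @ (W1 @ x + b1) + b2

# ...collapse into one linear layer with combined weights and bias
W = W2 @ W1
b = W2 @ b1 + b2
shallow = W @ x + b

print(np.allclose(deep, shallow))  # the "deep" network is just linear regression
```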

How to choose an activation function?

(ML Specialization)

The choice depends on the layer. For the output layer, pick based on the task: sigmoid for binary classification, linear (identity) for regression, softmax for multiclass classification. For hidden layers, ReLU is the common default.