(ML Specialization) (Code)

Regularization is a technique to penalize the complexity of a model to prevent overfitting, by reducing the value/impact of its parameters. This means it introduces more bias and lowers variance.

Overfitting often comes from having too many parameters. There are generally two approaches to regularization: shrinking the impact of parameters (Ridge) or zeroing parameters out entirely (Lasso), as the sketch below illustrates.
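A quick sketch of that difference, assuming scikit-learn and a synthetic dataset where only two of five features actually matter:

```python
import numpy as np
from sklearn.linear_model import Ridge, Lasso

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
# Only the first two features carry signal; the rest are noise.
y = 3 * X[:, 0] - 2 * X[:, 1] + rng.normal(scale=0.1, size=100)

ridge = Ridge(alpha=1.0).fit(X, y)  # shrinks all weights toward zero
lasso = Lasso(alpha=0.1).fit(X, y)  # drives irrelevant weights exactly to zero

print("Ridge:", ridge.coef_.round(3))  # small but nonzero weights everywhere
print("Lasso:", lasso.coef_.round(3))  # exact zeros on the noise features
```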

Types

(Regularization on Regression) (Interactive Visualizer)

Cost function with regularization

Regularization is added directly to the cost function.

For example, mean squared error (the cost of linear regression) with Ridge regularization:

$$J(\vec{w}, b) = \frac{1}{2m} \sum_{i=1}^{m} \left( f_{\vec{w},b}(\vec{x}^{(i)}) - y^{(i)} \right)^2 + \frac{\lambda}{2m} \sum_{j=1}^{n} w_j^2$$

  • $\vec{w}$: a vector of weights; $b$: a scalar-valued bias term
  • $m$: number of training examples
  • $f_{\vec{w},b}$: the learned model applied to a vector of features
  • $(\vec{x}^{(i)}, y^{(i)})$: the $i$-th training example in the dataset
  • $\lambda$: regularization parameter; how much we want to shrink the impact of some predictors. The larger $\lambda$, the stronger the penalty.
  • $n$: number of features
  • $w_j$: weight parameter (to be penalized)

In practice, we might or might not penalize the bias parameter $b$, because it makes little difference. It's just a single number.
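A minimal NumPy sketch of this cost, mirroring the equation above (the function name and signature are my own; note that $b$ is left out of the penalty):

```python
import numpy as np

def compute_cost_ridge(X, y, w, b, lambda_=1.0):
    """Mean squared error with a Ridge (L2) penalty on the weights.

    X: (m, n) feature matrix, y: (m,) targets,
    w: (n,) weight vector, b: scalar bias, lambda_: regularization strength.
    """
    m = X.shape[0]
    predictions = X @ w + b  # f_{w,b}(x^(i)) for every example at once
    mse = np.sum((predictions - y) ** 2) / (2 * m)
    penalty = (lambda_ / (2 * m)) * np.sum(w ** 2)  # b is not penalized
    return mse + penalty
```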

How to choose $\lambda$

(Source)

  1. Try different values for $\lambda$, each doubling the previous:
    1. Minimize the cost function with regularization, just as we do the normal cost function.
    2. Evaluate the resulting parameters on the validation set.
  2. Pick the parameters with the lowest validation error.
  3. Report the test error (or the cross-validation error as an estimate), as in the sketch below.
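A runnable sketch of these steps, assuming scikit-learn and synthetic data (sklearn's `alpha` plays the role of $\lambda$, up to a scaling constant):

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Synthetic data, split into train / validation / test sets.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 10))
y = X @ rng.normal(size=10) + rng.normal(scale=0.5, size=300)
X_train, X_tmp, y_train, y_tmp = train_test_split(X, y, test_size=0.4, random_state=0)
X_val, X_test, y_val, y_test = train_test_split(X_tmp, y_tmp, test_size=0.5, random_state=0)

# Step 1: try values of lambda, each doubling the previous.
best_lambda, best_val_err = None, float("inf")
for lam in [0.01 * 2 ** k for k in range(12)]:      # 0.01, 0.02, 0.04, ...
    model = Ridge(alpha=lam).fit(X_train, y_train)  # minimize regularized cost
    val_err = mean_squared_error(y_val, model.predict(X_val))
    if val_err < best_val_err:                      # step 2: lowest validation error
        best_lambda, best_val_err = lam, val_err

# Step 3: report error on the held-out test set.
final = Ridge(alpha=best_lambda).fit(X_train, y_train)
print(best_lambda, mean_squared_error(y_test, final.predict(X_test)))
```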