Sources: StatQuest, ML Specialization
The bias-variance tradeoff describes how a model's complexity trades off its two main sources of error. As we increase the complexity of a model, its variance tends to increase and its bias tends to decrease. Conversely, if we simplify the model, its variance decreases but its bias increases.
It is also called a dilemma because there is no single objective function that tells us the right balance between the two.
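To make this concrete, here is a minimal sketch on synthetic data (the sine target, noise level, and polynomial degrees are all illustrative choices): as the polynomial degree grows, training error keeps falling while validation error eventually rises again.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(40, 1))
y = np.sin(X).ravel() + rng.normal(scale=0.3, size=40)  # noisy nonlinear target

X_train, X_val, y_train, y_val = train_test_split(X, y, random_state=0)

for degree in (1, 4, 15):  # low, moderate, high model complexity
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    model.fit(X_train, y_train)
    train_mse = mean_squared_error(y_train, model.predict(X_train))
    val_mse = mean_squared_error(y_val, model.predict(X_val))
    print(f"degree={degree:2d}  train MSE={train_mse:.3f}  val MSE={val_mse:.3f}")
```

Typically the degree-1 model has high error everywhere (bias), the degree-15 model has near-zero training error but worse validation error (variance), and the moderate degree sits near the sweet spot.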
Summary
- high bias → underfitting (bias ~ residual)
    - high cost function on the training set (high training error)
- high variance → overfitting (variance ~ model complexity)
    - low cost function on the training set (low training error)
    - high cost function on the cross-validation set (high CV error)
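These diagnostics can be captured as a rough rule of thumb. In the sketch below, the baseline error (e.g. human-level performance) and the gap tolerance are hypothetical, problem-dependent values:

```python
def diagnose(train_error: float, cv_error: float,
             baseline: float, tol: float = 0.05) -> str:
    """Rough bias/variance diagnosis from training and cross-validation
    errors, relative to a baseline such as human-level performance.
    The tolerance is illustrative, not a universal threshold."""
    high_bias = (train_error - baseline) > tol      # poor fit even on training data
    high_variance = (cv_error - train_error) > tol  # much worse on unseen data
    if high_bias and high_variance:
        return "high bias and high variance"
    if high_bias:
        return "high bias (underfitting)"
    if high_variance:
        return "high variance (overfitting)"
    return "about right"

print(diagnose(train_error=0.30, cv_error=0.32, baseline=0.10))  # high bias
print(diagnose(train_error=0.11, cv_error=0.25, baseline=0.10))  # high variance
```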
Foundational concepts
- Bias refers to the error introduced by approximating a real-world problem (which may be complex) by a simplified model. High bias causes a learning algorithm to miss the relevant relationships between features and target outputs (underfitting).
- Variance refers to the error introduced by the model’s sensitivity to small fluctuations or noise in the training data. A model with high variance fits the training data closely (overfitting) but may make poor predictions on unseen test data.
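These two error sources can be estimated empirically at a fixed test point by refitting the same model class on many resampled training sets: the squared gap between the average prediction and the truth approximates bias², and the spread of the predictions approximates variance. Everything in this sketch (target function, noise, degrees) is synthetic and illustrative:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(0)

def true_f(x):
    return np.sin(x)

x0 = np.array([[1.5]])  # fixed test point

def predictions_at_x0(degree, n_repeats=200, n_samples=30):
    """Fit the same model class on many fresh training sets and record
    its prediction at x0 each time."""
    preds = []
    for _ in range(n_repeats):
        X = rng.uniform(-3, 3, size=(n_samples, 1))
        y = true_f(X).ravel() + rng.normal(scale=0.3, size=n_samples)
        model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
        preds.append(model.fit(X, y).predict(x0)[0])
    return np.array(preds)

for degree in (1, 10):
    p = predictions_at_x0(degree)
    bias_sq = (p.mean() - true_f(x0)[0, 0]) ** 2  # systematic error
    variance = p.var()                            # sensitivity to the sample
    print(f"degree={degree:2d}  bias^2={bias_sq:.4f}  variance={variance:.4f}")
```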
The goal is to find the level of model complexity that minimizes overall prediction error on unseen data; in other words, to find the sweet spot between bias and variance. This is achieved by choosing an appropriate model and tuning its hyperparameters, using techniques such as cross-validation and regularization.
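As one concrete way to hunt for that sweet spot, here is a sketch using scikit-learn's cross-validated grid search over Ridge regression's regularization strength alpha (the dataset, degree, and grid values are placeholders):

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(1)
X = rng.uniform(-3, 3, size=(60, 1))
y = np.sin(X).ravel() + rng.normal(scale=0.3, size=60)

# A deliberately flexible model whose effective complexity is tamed by alpha.
pipe = make_pipeline(PolynomialFeatures(degree=10), Ridge())
search = GridSearchCV(
    pipe,
    param_grid={"ridge__alpha": [1e-3, 1e-2, 1e-1, 1.0, 10.0]},
    cv=5,  # 5-fold cross-validation estimates error on unseen data
    scoring="neg_mean_squared_error",
)
search.fit(X, y)
print("best alpha:", search.best_params_["ridge__alpha"])
```

Too small an alpha leaves the degree-10 model free to overfit (high variance); too large an alpha shrinks it toward a constant (high bias); cross-validation picks the compromise.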
Fixes
Fix high bias
Approach
Make the learning algorithm more powerful/complex so it can fit the training data better
- Try getting additional features
- Try adding polynomial features (e.g. x², x³, or interaction terms such as x₁x₂)
- Try decreasing the regularization parameter λ (less penalty for model complexity)
- Use a bigger artificial neural network with proper regularization

Note that these approaches usually come with a higher computational cost.
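A minimal sketch of the first three fixes above combined, assuming scikit-learn and the synthetic data from the earlier sketch; Ridge's alpha plays the role of the regularization parameter λ here:

```python
from sklearn.linear_model import Ridge
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Underfitting baseline: linear features only, heavy regularization.
weak = make_pipeline(PolynomialFeatures(degree=1), Ridge(alpha=100.0))

# High-bias fixes applied: richer polynomial features, much smaller penalty.
stronger = make_pipeline(PolynomialFeatures(degree=4), Ridge(alpha=0.01))

# With X_train, y_train from the earlier sketch:
#   weak.fit(X_train, y_train)
#   stronger.fit(X_train, y_train)
# the stronger model's training error should drop noticeably.
```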
Fix high variance
Approach
Make the model simpler and less sensitive to noise in the training data
- Collect more data. However, getting more data is often hard (e.g. inaccessible, limited, expensive) or impossible
- Simplify the model
    - Select a smaller set of relevant features
    - Reduce the size of the parameters by applying regularization
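Correspondingly for high variance, a sketch combining the last two fixes above, feature selection plus stronger regularization (SelectKBest, the value of k, and the alpha values are illustrative choices):

```python
from sklearn.feature_selection import SelectKBest, f_regression
from sklearn.linear_model import Ridge
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Overfitting baseline: many polynomial features, almost no penalty.
overfit = make_pipeline(PolynomialFeatures(degree=10), Ridge(alpha=1e-6))

# High-variance fixes applied: keep only the k most relevant features,
# and shrink the remaining parameters with a larger penalty.
regularized = make_pipeline(
    # drop the constant column so the f_regression scores are well-defined
    PolynomialFeatures(degree=10, include_bias=False),
    SelectKBest(score_func=f_regression, k=4),  # smaller set of relevant features
    Ridge(alpha=10.0),                          # stronger regularization
)
# Fit on X_train, y_train from the earlier sketch; the train/CV error gap
# should narrow for the regularized model relative to the overfit one.
```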