A confusion matrix is a tabular representation of how accurate a classifier is at predicting the labels for a categorical variable: each row corresponds to an actual class and each column to a predicted class (or vice versa).
- True Positive: the model predicts positive and the actual label is positive
- True Negative: the model predicts negative and the actual label is negative
- False Positive: the model predicts positive but the actual label is negative = Type I error
- False Negative: the model predicts negative but the actual label is positive = Type II error
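As a minimal sketch of where these four cells live, assuming scikit-learn is available (the label vectors are made up for illustration):

```python
from sklearn.metrics import confusion_matrix

# Toy labels for illustration: 1 = positive class, 0 = negative class
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 1, 0, 0, 0]

# For binary labels, sklearn lays the matrix out as [[TN, FP], [FN, TP]]
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print(tn, fp, fn, tp)  # 3 1 2 2
```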
Usage
- Compare the metrics below across different types of ML algorithms
- The dimensions of a confusion matrix depend on the number of categories we want to predict. For example, if we want to predict whether a student is sleep-deprived, sleeping enough, or oversleeping, the matrix would be 3x3 (see the sketch below).
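A sketch of that 3-class case; the label names and vectors here are hypothetical:

```python
from sklearn.metrics import confusion_matrix

labels = ["sleep-deprived", "enough", "oversleeping"]
y_true = ["sleep-deprived", "enough", "oversleeping", "enough", "sleep-deprived"]
y_pred = ["sleep-deprived", "enough", "enough", "enough", "oversleeping"]

# Rows are actual classes, columns are predicted classes, in the order of `labels`
print(confusion_matrix(y_true, y_pred, labels=labels))
# [[1 0 1]
#  [0 2 0]
#  [0 1 0]]
```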
Types of metrics
Accuracy
accuracy is the proportion of correctly predicted observations out of all observations: (TP + TN) / (TP + TN + FP + FN)
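Reusing the toy labels from the first sketch, accuracy falls straight out of the four cells; accuracy_score is just a cross-check:

```python
from sklearn.metrics import accuracy_score, confusion_matrix

y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 1, 0, 0, 0]

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print((tp + tn) / (tp + tn + fp + fn))  # 0.625
print(accuracy_score(y_true, y_pred))   # 0.625
```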
Precision
precision is the proportion of TRUE positives out of all PREDICTED positives: TP / (TP + FP)
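With the same toy labels, precision_score agrees with TP / (TP + FP):

```python
from sklearn.metrics import precision_score

y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 1, 0, 0, 0]

# TP = 2, FP = 1  ->  2 / (2 + 1)
print(precision_score(y_true, y_pred))  # 0.666...
```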
Recall / Sensitivity
recall, or sensitivity, as a classification metric, is the proportion of correctly predicted observations in one class out of all observations in that class, i.e. the ratio of TRUE positives out of all ACTUAL positives: TP / (TP + FN)
Its formula mirrors that of specificity, which measures the same proportion for the negative class.
Usage
recall is important when we believe False Negatives are more costly than False Positives (e.g. the problem of cancer detection).
- Out of the passengers who survived, how many did we label correctly?
- Out of the sick patients, how many did we correctly diagnose as sick?
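Again on the toy labels, recall_score matches TP / (TP + FN):

```python
from sklearn.metrics import recall_score

y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 1, 0, 0, 0]

# TP = 2, FN = 2  ->  2 / (2 + 2)
print(recall_score(y_true, y_pred))  # 0.5
```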
Specificity
specificity is the proportion of TRUE negatives out of all ACTUAL negatives: TN / (TN + FP)
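scikit-learn has no dedicated specificity function, but since specificity is just recall computed on the negative class, recall_score with pos_label=0 yields it (same toy labels as above):

```python
from sklearn.metrics import recall_score

y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 1, 0, 0, 0]

# TN = 3, FP = 1  ->  3 / (3 + 1)
print(recall_score(y_true, y_pred, pos_label=0))  # 0.75
```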
F-beta
F-beta score is the weighted harmonic mean of precision and recall. It's a more generalized version of the F1 score:

$$F_\beta = (1 + \beta^2) \cdot \frac{\text{precision} \cdot \text{recall}}{\beta^2 \cdot \text{precision} + \text{recall}}$$

$\beta$ is a factor that determines how many times more important recall is than precision in the score.
- If $\beta$ is close to 0, the score skews towards precision
- If $\beta = 1$, it is the F1 score
- If $\beta$ is large ($\beta \to \infty$), the score skews towards sensitivity
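A sketch of how $\beta$ shifts the score, using scikit-learn's fbeta_score on the toy labels from above (precision = 0.67 and recall = 0.5 here, so smaller $\beta$ gives a higher score):

```python
from sklearn.metrics import fbeta_score

y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 1, 0, 0, 0]

print(fbeta_score(y_true, y_pred, beta=0.5))  # 0.625    (skews towards precision)
print(fbeta_score(y_true, y_pred, beta=1.0))  # 0.571... (plain F1)
print(fbeta_score(y_true, y_pred, beta=2.0))  # 0.526... (skews towards recall)
```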
Precision-Recall Tradeoff
precision = the proportion of true positives out of all positive predictions = TP / (TP + FP)
recall = the proportion of true positives out of all actual positives = TP / (TP + FN)
- If a False Negative is worse than a False Positive (we want to minimize Type II error), the model requires a high recall so that it catches as many positive cases as possible. We should tailor our F-beta to lean towards recall ($\beta > 1$; see the sketch after this list).
    - Diagnosing sick patients
    - Detecting malfunctioning parts in a spaceship
- If a False Positive is worse than a False Negative (we want to minimize Type I error), the model requires a high precision. We should tailor our F-beta to lean towards precision ($\beta < 1$).
    - Sending promotional material in the mail to potential clients (we don't want to send it to too many people who won't be interested)
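As a sketch of this tradeoff, consider two hypothetical models on the same ground truth: one that flags every case as positive (perfect recall, poor precision) and one that only predicts positive when very sure (perfect precision, poor recall). $F_2$ ranks the first model higher, $F_{0.5}$ the second:

```python
from sklearn.metrics import fbeta_score

y_true    = [1, 1, 1, 1, 0, 0, 0, 0]
catch_all = [1, 1, 1, 1, 1, 1, 1, 1]  # hypothetical model: flags everything, no False Negatives
cautious  = [1, 0, 0, 0, 0, 0, 0, 0]  # hypothetical model: flags only sure cases, no False Positives

for name, y_pred in [("catch-all", catch_all), ("cautious", cautious)]:
    f2 = fbeta_score(y_true, y_pred, beta=2.0)   # FN-averse, e.g. diagnosing sick patients
    f05 = fbeta_score(y_true, y_pred, beta=0.5)  # FP-averse, e.g. promotional mail campaigns
    print(f"{name}: F2={f2:.3f}, F0.5={f05:.3f}")
# catch-all: F2=0.833, F0.5=0.556
# cautious: F2=0.294, F0.5=0.625
```

With $\beta = 2$ the recall-hungry model wins; with $\beta = 0.5$ the ranking flips, which is exactly the behavior we want when False Positives are the costlier mistake.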