Clustering is an unsupervised learning technique that automatically groups similar data points into homogeneous clusters.

Applications

  • Group similar news
  • Market segmentation
  • Analyze DNA into groups
  • Group astronomical data

Characteristics of an effective clustering model

  • The clusters are clearly identifiable.
  • Between clusters (inter-cluster), there is lots of empty space.
  • Within each cluster (intra-cluster), the points are close to each other.

Metrics

For an effective model, we want low inertia (compact clusters) and a high silhouette score (well-separated clusters).

Inertia

Transclude of inertia
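
Inertia is commonly defined as the sum of squared distances from each point to its nearest cluster centroid, so lower values indicate tighter clusters. The following is a minimal sketch under that definition, assuming Euclidean distance. The function name `inertia` and the toy data are illustrative and not taken from these notes.

```python
def inertia(points, centroids):
    """Sum of squared Euclidean distances from each point to its nearest centroid."""
    total = 0.0
    for p in points:
        # Squared distance from p to every centroid; keep the smallest.
        total += min(
            sum((a - b) ** 2 for a, b in zip(p, c))
            for c in centroids
        )
    return total


# Toy data (assumed for illustration): two tight groups far apart.
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]

two_centroids = [(0.0, 1.0), (10.0, 1.0)]  # one centroid per group
one_centroid = [(5.0, 1.0)]                # a single centroid for everything

print(inertia(points, two_centroids))  # 4.0
print(inertia(points, one_centroid))   # 104.0
```

Inertia never increases when more centroids are added, so it is mostly useful for comparing models with the same number of clusters or for spotting where the improvement levels off. In scikit-learn, a fitted `KMeans` model exposes this quantity as its `inertia_` attribute.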

Silhouette score

Transclude of silhouette-score
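
For a single point, the silhouette coefficient is usually written as (b - a) / max(a, b), where a is the mean distance to the other points in its own cluster and b is the smallest mean distance to the points of any other cluster. The score ranges from -1 to 1, and values near 1 mean the point sits well inside its cluster. The sketch below averages that coefficient over all points, assuming Euclidean distance and the common convention that a point alone in its cluster scores 0. The function name and toy data are illustrative, not from these notes.

```python
import math


def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points (Euclidean distance)."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    scores = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)  # assumed convention for singleton clusters
            continue
        # a: mean distance to the other members of the same cluster
        # (the distance from p to itself is 0, so it does not affect the sum).
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(math.dist(p, q) for q in other) / len(other)
            for key, other in clusters.items()
            if key != lab
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)


# Toy data (assumed for illustration): two tight groups far apart.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]

good = silhouette_score(points, [0, 0, 1, 1])  # labels follow the groups
bad = silhouette_score(points, [0, 1, 0, 1])   # labels cut across the groups
print(good > bad)  # True
```

This version compares every pair of points, which is fine for small data sets. scikit-learn provides the same metric as `sklearn.metrics.silhouette_score(X, labels)`.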