From 50aeba2e675d930769850f247aa9110426b9d834 Mon Sep 17 00:00:00 2001
From: kaksat
Date: Mon, 6 Aug 2018 20:21:35 +0200
Subject: [PATCH] Correction of a formula for silhouette coefficient

Source: http://scikit-learn.org/stable/modules/generated/sklearn.metrics.silhouette_score.html
---
 08_dimensionality_reduction.ipynb | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/08_dimensionality_reduction.ipynb b/08_dimensionality_reduction.ipynb
index fc665c9..975d83b 100644
--- a/08_dimensionality_reduction.ipynb
+++ b/08_dimensionality_reduction.ipynb
@@ -2606,7 +2606,7 @@
  "cell_type": "markdown",
  "metadata": {},
  "source": [
- "Another approach is to look at the _silhouette score_, which is the mean _silhouette coefficient_ over all the instances. An instance's silhouette coefficient is equal to $(a - b)/\\max(a, b)$ where $a$ is the mean distance to the other instances in the same cluster (it is the _mean intra-cluster distance_), and $b$ is the _mean nearest-cluster distance_, that is the mean distance to the instances of the next closest cluster (defined as the one that minimizes $b$, excluding the instance's own cluster). The silhouette coefficient can vary between -1 and +1: a coefficient close to +1 means that the instance is well inside its own cluster and far from other clusters, while a coefficient close to 0 means that it is close to a cluster boundary, and finally a coefficient close to -1 means that the instance may have been assigned to the wrong cluster."
+ "Another approach is to look at the _silhouette score_, which is the mean _silhouette coefficient_ over all the instances. An instance's silhouette coefficient is equal to $(b - a)/\\max(a, b)$ where $a$ is the mean distance to the other instances in the same cluster (it is the _mean intra-cluster distance_), and $b$ is the _mean nearest-cluster distance_, that is the mean distance to the instances of the next closest cluster (defined as the one that minimizes $b$, excluding the instance's own cluster). The silhouette coefficient can vary between -1 and +1: a coefficient close to +1 means that the instance is well inside its own cluster and far from other clusters, while a coefficient close to 0 means that it is close to a cluster boundary, and finally a coefficient close to -1 means that the instance may have been assigned to the wrong cluster."
  ]
 },
 {
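
A quick sanity check of the sign in the corrected formula, written as a minimal sketch rather than a reproduction of scikit-learn's implementation. The helper names `silhouette_coefficient` and `mean_silhouette`, the toy 1-D points, and the choice to return 0 for a single-instance cluster are assumptions made for illustration only. With $(b - a)/\max(a, b)$, a well-placed instance (small $a$, large $b$) scores close to +1 and a misassigned one scores negative, which matches the interpretation given in the cell.

```python
from math import dist  # Euclidean distance, available in Python 3.8+


def silhouette_coefficient(i, points, labels):
    """Silhouette coefficient of points[i], computed as (b - a) / max(a, b).

    Assumes at least two distinct cluster labels are present.
    """
    own = labels[i]
    # Distances from instance i to every other instance, grouped by cluster.
    by_cluster = {}
    for j, (point, label) in enumerate(zip(points, labels)):
        if j != i:
            by_cluster.setdefault(label, []).append(dist(points[i], point))
    if own not in by_cluster:
        # Assumed convention for a cluster containing only instance i.
        return 0.0
    # a: mean intra-cluster distance.
    a = sum(by_cluster[own]) / len(by_cluster[own])
    # b: mean distance to the nearest other cluster (the one minimizing b).
    b = min(sum(d) / len(d) for label, d in by_cluster.items() if label != own)
    return (b - a) / max(a, b)


def mean_silhouette(points, labels):
    """Silhouette score: the mean silhouette coefficient over all instances."""
    n = len(points)
    return sum(silhouette_coefficient(i, points, labels) for i in range(n)) / n


# Hypothetical toy data: two tight, well-separated 1-D clusters.
points = [(0.0,), (1.0,), (10.0,), (11.0,)]
good_labels = [0, 0, 1, 1]
bad_labels = [0, 1, 0, 1]  # instances 1 and 2 swapped into the wrong clusters

print(round(silhouette_coefficient(0, points, good_labels), 3))  # 0.905
print(round(silhouette_coefficient(1, points, bad_labels), 3))   # -0.5
```

For instance 0 under `good_labels`, $a = 1$ and $b = 10.5$, so the coefficient is $9.5/10.5 \approx 0.905$. The previous formula $(a - b)/\max(a, b)$ would have given the same magnitude with the opposite sign, contradicting the stated interpretation.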