Fix typos and remove unused args in plot.grid()
parent
b2103e953a
commit
f32ce273d2
|
@ -11,7 +11,7 @@
|
||||||
"cell_type": "markdown",
|
"cell_type": "markdown",
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"source": [
|
"source": [
|
||||||
"_This notebook is an extra chapter on Support Vector Machines. It also includes exercises and their solutions at the end._"
|
"_This notebook contains all the sample code and solutions to the exercises in chapter 5._"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
|
@ -540,7 +540,7 @@
|
||||||
"plt.figure(figsize=(10, 3))\n",
|
"plt.figure(figsize=(10, 3))\n",
|
||||||
"\n",
|
"\n",
|
||||||
"plt.subplot(121)\n",
|
"plt.subplot(121)\n",
|
||||||
"plt.grid(True, which='both')\n",
|
"plt.grid(True)\n",
|
||||||
"plt.axhline(y=0, color='k')\n",
|
"plt.axhline(y=0, color='k')\n",
|
||||||
"plt.plot(X1D[:, 0][y==0], np.zeros(4), \"bs\")\n",
|
"plt.plot(X1D[:, 0][y==0], np.zeros(4), \"bs\")\n",
|
||||||
"plt.plot(X1D[:, 0][y==1], np.zeros(5), \"g^\")\n",
|
"plt.plot(X1D[:, 0][y==1], np.zeros(5), \"g^\")\n",
|
||||||
|
@ -549,7 +549,7 @@
|
||||||
"plt.axis([-4.5, 4.5, -0.2, 0.2])\n",
|
"plt.axis([-4.5, 4.5, -0.2, 0.2])\n",
|
||||||
"\n",
|
"\n",
|
||||||
"plt.subplot(122)\n",
|
"plt.subplot(122)\n",
|
||||||
"plt.grid(True, which='both')\n",
|
"plt.grid(True)\n",
|
||||||
"plt.axhline(y=0, color='k')\n",
|
"plt.axhline(y=0, color='k')\n",
|
||||||
"plt.axvline(x=0, color='k')\n",
|
"plt.axvline(x=0, color='k')\n",
|
||||||
"plt.plot(X2D[:, 0][y==0], X2D[:, 1][y==0], \"bs\")\n",
|
"plt.plot(X2D[:, 0][y==0], X2D[:, 1][y==0], \"bs\")\n",
|
||||||
|
@ -624,7 +624,7 @@
|
||||||
" plt.plot(X[:, 0][y==0], X[:, 1][y==0], \"bs\")\n",
|
" plt.plot(X[:, 0][y==0], X[:, 1][y==0], \"bs\")\n",
|
||||||
" plt.plot(X[:, 0][y==1], X[:, 1][y==1], \"g^\")\n",
|
" plt.plot(X[:, 0][y==1], X[:, 1][y==1], \"g^\")\n",
|
||||||
" plt.axis(axes)\n",
|
" plt.axis(axes)\n",
|
||||||
" plt.grid(True, which='both')\n",
|
" plt.grid(True)\n",
|
||||||
" plt.xlabel(\"$x_1$\")\n",
|
" plt.xlabel(\"$x_1$\")\n",
|
||||||
" plt.ylabel(\"$x_2$\", rotation=0)\n",
|
" plt.ylabel(\"$x_2$\", rotation=0)\n",
|
||||||
"\n",
|
"\n",
|
||||||
|
@ -766,7 +766,7 @@
|
||||||
"plt.figure(figsize=(10.5, 4))\n",
|
"plt.figure(figsize=(10.5, 4))\n",
|
||||||
"\n",
|
"\n",
|
||||||
"plt.subplot(121)\n",
|
"plt.subplot(121)\n",
|
||||||
"plt.grid(True, which='both')\n",
|
"plt.grid(True)\n",
|
||||||
"plt.axhline(y=0, color='k')\n",
|
"plt.axhline(y=0, color='k')\n",
|
||||||
"plt.scatter(x=[-2, 1], y=[0, 0], s=150, alpha=0.5, c=\"red\")\n",
|
"plt.scatter(x=[-2, 1], y=[0, 0], s=150, alpha=0.5, c=\"red\")\n",
|
||||||
"plt.plot(X1D[:, 0][yk==0], np.zeros(4), \"bs\")\n",
|
"plt.plot(X1D[:, 0][yk==0], np.zeros(4), \"bs\")\n",
|
||||||
|
@ -789,7 +789,7 @@
|
||||||
"plt.axis([-4.5, 4.5, -0.1, 1.1])\n",
|
"plt.axis([-4.5, 4.5, -0.1, 1.1])\n",
|
||||||
"\n",
|
"\n",
|
||||||
"plt.subplot(122)\n",
|
"plt.subplot(122)\n",
|
||||||
"plt.grid(True, which='both')\n",
|
"plt.grid(True)\n",
|
||||||
"plt.axhline(y=0, color='k')\n",
|
"plt.axhline(y=0, color='k')\n",
|
||||||
"plt.axvline(x=0, color='k')\n",
|
"plt.axvline(x=0, color='k')\n",
|
||||||
"plt.plot(XK[:, 0][yk==0], XK[:, 1][yk==0], \"bs\")\n",
|
"plt.plot(XK[:, 0][yk==0], XK[:, 1][yk==0], \"bs\")\n",
|
||||||
|
@ -1185,7 +1185,7 @@
|
||||||
" axs, (hinge_pos, hinge_pos ** 2), (hinge_neg, hinge_neg ** 2), titles):\n",
|
" axs, (hinge_pos, hinge_pos ** 2), (hinge_neg, hinge_neg ** 2), titles):\n",
|
||||||
" ax.plot(s, loss_pos, \"g-\", linewidth=2, zorder=10, label=\"$t=1$\")\n",
|
" ax.plot(s, loss_pos, \"g-\", linewidth=2, zorder=10, label=\"$t=1$\")\n",
|
||||||
" ax.plot(s, loss_neg, \"r--\", linewidth=2, zorder=10, label=\"$t=-1$\")\n",
|
" ax.plot(s, loss_neg, \"r--\", linewidth=2, zorder=10, label=\"$t=-1$\")\n",
|
||||||
" ax.grid(True, which='both')\n",
|
" ax.grid(True)\n",
|
||||||
" ax.axhline(y=0, color='k')\n",
|
" ax.axhline(y=0, color='k')\n",
|
||||||
" ax.axvline(x=0, color='k')\n",
|
" ax.axvline(x=0, color='k')\n",
|
||||||
" ax.set_xlabel(r\"$s = \\mathbf{w}^\\intercal \\mathbf{x} + b$\")\n",
|
" ax.set_xlabel(r\"$s = \\mathbf{w}^\\intercal \\mathbf{x} + b$\")\n",
|
||||||
|
@ -1250,7 +1250,6 @@
|
||||||
" w = np.random.randn(X.shape[1], 1) # n feature weights\n",
|
" w = np.random.randn(X.shape[1], 1) # n feature weights\n",
|
||||||
" b = 0\n",
|
" b = 0\n",
|
||||||
"\n",
|
"\n",
|
||||||
" m = len(X)\n",
|
|
||||||
" t = np.array(y, dtype=np.float64).reshape(-1, 1) * 2 - 1\n",
|
" t = np.array(y, dtype=np.float64).reshape(-1, 1) * 2 - 1\n",
|
||||||
" X_t = X * t\n",
|
" X_t = X * t\n",
|
||||||
" self.Js = []\n",
|
" self.Js = []\n",
|
||||||
|
@ -1492,7 +1491,7 @@
|
||||||
"cell_type": "markdown",
|
"cell_type": "markdown",
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"source": [
|
"source": [
|
||||||
"1. The fundamental idea behind Support Vector Machines is to fit the widest possible \"street\" between the classes. In other words, the goal is to have the largest possible margin between the decision boundary that separates the two classes and the training instances. When performing soft margin classification, the SVM searches for a compromise between perfectly separating the two classes and having the widest possible street (i.e., a few instances may end up on the street). Another key idea is to use kernels when training on nonlinear datasets. SVMs can also be tweaked to perform linear and nonlinear regression, as well as novelty detection.\n",
|
"1. The fundamental idea behind Support Vector Machines is to fit the widest possible \"street\" between the classes. In other words, the goal is to have the largest possible margin between the decision boundary that separates the two classes of the training instances. When performing soft margin classification, the SVM searches for a compromise between perfectly separating the two classes and having the widest possible street (i.e., a few instances may end up on the street). Another key idea is to use kernels when training on nonlinear datasets. SVMs can also be tweaked to perform linear and nonlinear regression, as well as novelty detection.\n",
|
||||||
"2. After training an SVM, a _support vector_ is any instance located on the \"street\" (see the previous answer), including its border. The decision boundary is entirely determined by the support vectors. Any instance that is _not_ a support vector (i.e., is off the street) has no influence whatsoever; you could remove them, add more instances, or move them around, and as long as they stay off the street they won't affect the decision boundary. Computing the predictions with a kernelized SVM only involves the support vectors, not the whole training set.\n",
|
"2. After training an SVM, a _support vector_ is any instance located on the \"street\" (see the previous answer), including its border. The decision boundary is entirely determined by the support vectors. Any instance that is _not_ a support vector (i.e., is off the street) has no influence whatsoever; you could remove them, add more instances, or move them around, and as long as they stay off the street they won't affect the decision boundary. Computing the predictions with a kernelized SVM only involves the support vectors, not the whole training set.\n",
|
||||||
"3. SVMs try to fit the largest possible \"street\" between the classes (see the first answer), so if the training set is not scaled, the SVM will tend to neglect small features (see Figure 5–2).\n",
|
"3. SVMs try to fit the largest possible \"street\" between the classes (see the first answer), so if the training set is not scaled, the SVM will tend to neglect small features (see Figure 5–2).\n",
|
||||||
"4. You can use the `decision_function()` method to get confidence scores. These scores represent the distance between the instance and the decision boundary. However, they cannot be directly converted into an estimation of the class probability. If you set `probability=True` when creating an `SVC`, then at the end of training it will use 5-fold cross-validation to generate out-of-sample scores for the training samples, and it will train a `LogisticRegression` model to map these scores to estimated probabilities. The `predict_proba()` and `predict_log_proba()` methods will then be available.\n",
|
"4. You can use the `decision_function()` method to get confidence scores. These scores represent the distance between the instance and the decision boundary. However, they cannot be directly converted into an estimation of the class probability. If you set `probability=True` when creating an `SVC`, then at the end of training it will use 5-fold cross-validation to generate out-of-sample scores for the training samples, and it will train a `LogisticRegression` model to map these scores to estimated probabilities. The `predict_proba()` and `predict_log_proba()` methods will then be available.\n",
|
||||||
|
@ -2249,7 +2248,7 @@
|
||||||
"cell_type": "markdown",
|
"cell_type": "markdown",
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"source": [
|
"source": [
|
||||||
"This tuned kernelized SVM performs better than the `LinearSVC` model, but we get a lower score on the test set than we measured using cross-validation. This is quite common: since we did so much hyperparameter tuning, we ended up slightly overfitting the cross-validation test sets. It's tempting to tweak the hyperparameters a bit more until we get a better result on the test set, but we this would probably not help, as we would just start overfitting the test set. Anyway, this score is not bad at all, so let's stop here."
|
"This tuned kernelized SVM performs better than the `LinearSVC` model, but we get a lower score on the test set than we measured using cross-validation. This is quite common: since we did so much hyperparameter tuning, we ended up slightly overfitting the cross-validation test sets. It's tempting to tweak the hyperparameters a bit more until we get a better result on the test set, but this would probably not help, as we would just start overfitting the test set. Anyway, this score is not bad at all, so let's stop here."
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
|
@ -2309,7 +2308,7 @@
|
||||||
"cell_type": "markdown",
|
"cell_type": "markdown",
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"source": [
|
"source": [
|
||||||
"Don't forget to scale the data:"
|
"Don't forget to scale the data!"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
|
|