Fix typos and deprecated shift() call in ch 3

main
Victor Khaustov 2022-05-11 18:39:45 +09:00
parent ec67af216b
commit a3bbf0716b
1 changed file with 22 additions and 12 deletions

@@ -498,7 +498,7 @@
"from sklearn.model_selection import StratifiedKFold\n", "from sklearn.model_selection import StratifiedKFold\n",
"from sklearn.base import clone\n", "from sklearn.base import clone\n",
"\n", "\n",
"skfolds = StratifiedKFold(n_splits=3) # add shuffle=True is the dataset is not\n", "skfolds = StratifiedKFold(n_splits=3) # add shuffle=True if the dataset is not\n",
" # already shuffled\n", " # already shuffled\n",
"for train_index, test_index in skfolds.split(X_train, y_train_5):\n", "for train_index, test_index in skfolds.split(X_train, y_train_5):\n",
" clone_clf = clone(sgd_clf)\n", " clone_clf = clone(sgd_clf)\n",
@@ -1608,7 +1608,7 @@
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
"source": [ "source": [
"**Warning:** the following two cells make take a few minutes each to run:" "**Warning:** the following two cells may take a few minutes each to run:"
] ]
}, },
{ {
@@ -1950,7 +1950,7 @@
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
"source": [ "source": [
"**Warning**: the following cell may take a few minutes:" "**Warning**: the following cell may take a few minutes to run:"
] ]
}, },
{ {
@@ -2177,7 +2177,7 @@
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
"source": [ "source": [
"Let's see if we tuning the hyperparameters can help. To speed up the search, let's train only on the first 10,000 images:" "Let's see if tuning the hyperparameters can help. To speed up the search, let's train only on the first 10,000 images:"
] ]
}, },
{ {
@@ -2295,7 +2295,7 @@
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
"source": [ "source": [
"Exercise: _Write a function that can shift an MNIST image in any direction (left, right, up, or down) by one pixel. You can use the `shift()` function from the `scipy.ndimage.interpolation` module. For example, `shift(image, [2, 1], cval=0)` shifts the image two pixels down and one pixel to the right. Then, for each image in the training set, create four shifted copies (one per direction) and add them to the training set. Finally, train your best model on this expanded training set and measure its accuracy on the test set. You should observe that your model performs even better now! This technique of artificially growing the training set is called _data augmentation_ or _training set expansion_._" "Exercise: _Write a function that can shift an MNIST image in any direction (left, right, up, or down) by one pixel. You can use the `shift()` function from the `scipy.ndimage` module. For example, `shift(image, [2, 1], cval=0)` shifts the image two pixels down and one pixel to the right. Then, for each image in the training set, create four shifted copies (one per direction) and add them to the training set. Finally, train your best model on this expanded training set and measure its accuracy on the test set. You should observe that your model performs even better now! This technique of artificially growing the training set is called _data augmentation_ or _training set expansion_._"
] ]
}, },
{ {
@@ -2311,7 +2311,7 @@
"metadata": {}, "metadata": {},
"outputs": [], "outputs": [],
"source": [ "source": [
"from scipy.ndimage.interpolation import shift" "from scipy.ndimage import shift"
] ]
}, },
{ {
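
Review note on the hunk above: a minimal sketch of how the updated `from scipy.ndimage import shift` import might be used for the exercise. The helper name `shift_image`, its `dx`/`dy` parameters, and the synthetic 28x28 demo image are assumptions for illustration and do not appear in this diff; only the `shift(image, [rows, cols], cval=0)` call form comes from the exercise text.

```python
import numpy as np
from scipy.ndimage import shift  # new import path used by this commit


def shift_image(image, dx, dy):
    """Shift a flattened 28x28 image by dx pixels right and dy pixels down.

    Hypothetical helper: the name and signature are illustrative only.
    """
    image = np.asarray(image, dtype=float).reshape(28, 28)
    # shift() takes offsets in (row, column) order, so dy comes first.
    shifted = shift(image, [dy, dx], cval=0, mode="constant")
    return shifted.reshape(-1)


# Synthetic stand-in for an MNIST digit: a single bright pixel at row 10, column 10.
demo = np.zeros((28, 28))
demo[10, 10] = 255.0

# Two pixels down and one pixel right, matching the exercise's [2, 1] example.
moved = shift_image(demo.reshape(-1), dx=1, dy=2).reshape(28, 28)
brightest = tuple(int(i) for i in np.unravel_index(np.argmax(moved), moved.shape))
print(brightest)
```

Calling the helper four times per image, with `(dx, dy)` set to `(1, 0)`, `(-1, 0)`, `(0, 1)`, and `(0, -1)`, would give the four one-pixel shifted copies the exercise asks for.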
@@ -2455,23 +2455,33 @@
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
"source": [ "source": [
"**Warning**: the following cell may take a few minutes to run." "**Warning**: the following cell may take a few minutes to run:"
] ]
}, },
{ {
"cell_type": "code", "cell_type": "code",
"execution_count": 101, "execution_count": 101,
"metadata": {}, "metadata": {},
"outputs": [], "outputs": [
{
"data": {
"text/plain": "0.9763"
},
"execution_count": 101,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [ "source": [
"augmented_accuracy = knn_clf.score(X_test, y_test)" "augmented_accuracy = knn_clf.score(X_test, y_test)\n",
"augmented_accuracy"
] ]
}, },
{ {
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
"source": [ "source": [
"By simply augmenting the data, we got a 0.5% accuracy boost. Perhaps this does not sound so impressive, but this actually means that the error rate dropped significantly:" "By simply augmenting the data, we've got a 0.5% accuracy boost. Perhaps it does not sound so impressive, but it actually means that the error rate dropped significantly:"
] ]
}, },
{ {
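
Review note on the hunk above: a rough check of the "error rate dropped significantly" sentence, using only the two figures visible in this diff (the `0.9763` cell output and the stated 0.5% boost). The baseline accuracy is not shown in the diff, so the value below is an assumption derived by subtracting the boost; the notebook's actual variable may differ slightly.

```python
augmented_accuracy = 0.9763   # cell output shown in the diff
accuracy_boost = 0.005        # "0.5% accuracy boost" from the markdown cell

# Assumption: baseline accuracy reconstructed from the two numbers above.
baseline_accuracy = augmented_accuracy - accuracy_boost

baseline_error = 1 - baseline_accuracy
augmented_error = 1 - augmented_accuracy

# Relative change in error rate (negative means fewer errors).
relative_change = augmented_error / baseline_error - 1
print(f"error rate change: {relative_change:.0%}")
```

Under that assumption, a half-point accuracy gain corresponds to roughly a one-sixth reduction in errors, which is why the markdown cell calls the drop significant.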
@@ -2558,7 +2568,7 @@
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
"source": [ "source": [
"The data is already split into a training set and a test set. However, the test data does *not* contain the labels: your goal is to train the best model you can using the training data, then make your predictions on the test data and upload them to Kaggle to see your final score." "The data is already split into a training set and a test set. However, the test data does *not* contain the labels: your goal is to train the best model you can on the training data, then make your predictions on the test data and upload them to Kaggle to see your final score."
] ]
}, },
{ {
@@ -3275,7 +3285,7 @@
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
"source": [ "source": [
"And now we could just build a CSV file with these predictions (respecting the format excepted by Kaggle), then upload it and hope for the best. But wait! We can do better than hope. Why don't we use cross-validation to have an idea of how good our model is?" "And now we could just build a CSV file with these predictions (respecting the format expected by Kaggle), then upload it and hope for the best. But wait! We can do better than hope. Why don't we use cross-validation to have an idea of how good our model is?"
] ]
}, },
{ {