Add some section headers

2021-10-03 00:14:44 +13:00 · 6b821335c0
parent 2bd68d6348
commit 6b821335c0
3 changed files with 239 additions and 26 deletions
--- a/02_end_to_end_machine_learning_project.ipynb
+++ b/02_end_to_end_machine_learning_project.ipynb
@@ -83,7 +83,14 @@
   "cell_type": "markdown",
   "metadata": {},
   "source": [
-    "# Get the data"
+    "# Get the Data"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Download the Data"
   ]
  },
  {
@@ -132,6 +139,13 @@
    "    return pd.read_csv(csv_path)"
   ]
  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Take a Quick Look at the Data Structure"
+   ]
+  },
  {
   "cell_type": "code",
   "execution_count": 5,
@@ -182,6 +196,13 @@
    "plt.show()"
   ]
  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Create a Test Set"
+   ]
+  },
  {
   "cell_type": "code",
   "execution_count": 10,
@@ -443,7 +464,7 @@
   "cell_type": "markdown",
   "metadata": {},
   "source": [
-    "# Discover and visualize the data to gain insights"
+    "# Discover and Visualize the Data to Gain Insights"
   ]
  },
  {
@@ -455,6 +476,13 @@
    "housing = strat_train_set.copy()"
   ]
  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Visualizing Geographical Data"
+   ]
+  },
  {
   "cell_type": "code",
   "execution_count": 33,
@@ -540,6 +568,13 @@
    "plt.show()"
   ]
  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Looking for Correlations"
+   ]
+  },
  {
   "cell_type": "code",
   "execution_count": 38,
@@ -585,6 +620,13 @@
    "save_fig(\"income_vs_house_value_scatterplot\")"
   ]
  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Experimenting with Attribute Combinations"
+   ]
+  },
  {
   "cell_type": "code",
   "execution_count": 42,
@@ -631,7 +673,7 @@
   "cell_type": "markdown",
   "metadata": {},
   "source": [
-    "# Prepare the data for Machine Learning algorithms"
+    "# Prepare the Data for Machine Learning Algorithms"
   ]
  },
  {
@@ -644,6 +686,29 @@
    "housing_labels = strat_train_set[\"median_house_value\"].copy()"
   ]
  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Data Cleaning"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "In the book, three options are listed:\n",
+    "\n",
+    "```python\n",
+    "housing.dropna(subset=[\"total_bedrooms\"])    # option 1\n",
+    "housing.drop(\"total_bedrooms\", axis=1)       # option 2\n",
+    "median = housing[\"total_bedrooms\"].median()  # option 3\n",
+    "housing[\"total_bedrooms\"].fillna(median, inplace=True)\n",
+    "```\n",
+    "\n",
+    "To demonstrate each of them, let's create a copy of the housing dataset that keeps only the rows containing at least one null value. This makes it easier to see exactly what each option does:"
+   ]
+  },
  {
   "cell_type": "code",
   "execution_count": 47,
@@ -815,6 +880,13 @@
    "housing_tr.head()"
   ]
  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Handling Text and Categorical Attributes"
+   ]
+  },
  {
   "cell_type": "markdown",
   "metadata": {},
@@ -910,6 +982,13 @@
    "cat_encoder.categories_"
   ]
  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Custom Transformers"
+   ]
+  },
  {
   "cell_type": "markdown",
   "metadata": {},
@@ -985,6 +1064,13 @@
    "housing_extra_attribs.head()"
   ]
  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Transformation Pipelines"
+   ]
+  },
  {
   "cell_type": "markdown",
   "metadata": {},
@@ -1154,7 +1240,14 @@
   "cell_type": "markdown",
   "metadata": {},
   "source": [
-    "# Select and train a model "
+    "# Select and Train a Model"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Training and Evaluating on the Training Set"
   ]
  },
  {
@@ -1269,7 +1362,7 @@
   "cell_type": "markdown",
   "metadata": {},
   "source": [
-    "# Fine-tune your model"
+    "## Better Evaluation Using Cross-Validation"
   ]
  },
  {
@@ -1382,6 +1475,20 @@
    "svm_rmse"
   ]
  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# Fine-Tune Your Model"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Grid Search"
+   ]
+  },
  {
   "cell_type": "code",
   "execution_count": 99,
@@ -1457,6 +1564,13 @@
    "pd.DataFrame(grid_search.cv_results_)"
   ]
  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Randomized Search"
+   ]
+  },
  {
   "cell_type": "code",
   "execution_count": 104,
@@ -1488,6 +1602,13 @@
    "    print(np.sqrt(-mean_score), params)"
   ]
  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Analyze the Best Models and Their Errors"
+   ]
+  },
  {
   "cell_type": "code",
   "execution_count": 106,
@@ -1512,6 +1633,13 @@
    "sorted(zip(feature_importances, attributes), reverse=True)"
   ]
  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Evaluate Your System on the Test Set"
+   ]
+  },
  {
   "cell_type": "code",
   "execution_count": 108,
--- a/03_classification.ipynb
+++ b/03_classification.ipynb
@@ -245,7 +245,7 @@
   "cell_type": "markdown",
   "metadata": {},
   "source": [
-    "# Binary classifier"
+    "# Training a Binary Classifier"
   ]
  },
  {
@@ -296,6 +296,20 @@
    "cross_val_score(sgd_clf, X_train, y_train_5, cv=3, scoring=\"accuracy\")"
   ]
  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# Performance Measures"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Measuring Accuracy Using Cross-Validation"
+   ]
+  },
  {
   "cell_type": "code",
   "execution_count": 18,
@@ -362,6 +376,13 @@
    "* lastly, other things may prevent perfect reproducibility, such as Python dicts and sets whose order is not guaranteed to be stable across sessions, or the order of files in a directory which is also not guaranteed."
   ]
  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Confusion Matrix"
+   ]
+  },
  {
   "cell_type": "code",
   "execution_count": 21,
@@ -394,6 +415,13 @@
    "confusion_matrix(y_train_5, y_train_perfect_predictions)"
   ]
  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Precision and Recall"
+   ]
+  },
  {
   "cell_type": "code",
   "execution_count": 24,
@@ -453,6 +481,13 @@
    "cm[1, 1] / (cm[1, 1] + (cm[1, 0] + cm[0, 1]) / 2)"
   ]
  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Precision/Recall Trade-off"
+   ]
+  },
  {
   "cell_type": "code",
   "execution_count": 30,
@@ -625,7 +660,7 @@
   "cell_type": "markdown",
   "metadata": {},
   "source": [
-    "# ROC curves"
+    "## The ROC Curve"
   ]
  },
  {
@@ -757,7 +792,7 @@
   "cell_type": "markdown",
   "metadata": {},
   "source": [
-    "# Multiclass classification"
+    "# Multiclass Classification"
   ]
  },
  {
@@ -882,7 +917,7 @@
   "cell_type": "markdown",
   "metadata": {},
   "source": [
-    "# Error analysis"
+    "# Error Analysis"
   ]
  },
  {
@@ -969,7 +1004,7 @@
   "cell_type": "markdown",
   "metadata": {},
   "source": [
-    "# Multilabel classification"
+    "# Multilabel Classification"
   ]
  },
  {
@@ -1018,7 +1053,7 @@
   "cell_type": "markdown",
   "metadata": {},
   "source": [
-    "# Multioutput classification"
+    "# Multioutput Classification"
   ]
  },
  {
--- a/04_training_linear_models.ipynb
+++ b/04_training_linear_models.ipynb
@@ -4,7 +4,7 @@
   "cell_type": "markdown",
   "metadata": {},
   "source": [
-    "**Chapter 4 – Training Linear Models**"
+    "**Chapter 4 – Training Models**"
   ]
  },
  {
@@ -89,7 +89,14 @@
   "cell_type": "markdown",
   "metadata": {},
   "source": [
-    "# Linear regression using the Normal Equation"
+    "# Linear Regression"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## The Normal Equation"
   ]
  },
  {
@@ -243,7 +250,8 @@
   "cell_type": "markdown",
   "metadata": {},
   "source": [
-    "# Linear regression using batch gradient descent"
+    "# Gradient Descent\n",
+    "## Batch Gradient Descent"
   ]
  },
  {
@@ -330,7 +338,7 @@
   "cell_type": "markdown",
   "metadata": {},
   "source": [
-    "# Stochastic Gradient Descent"
+    "## Stochastic Gradient Descent"
   ]
  },
  {
@@ -416,7 +424,7 @@
   "cell_type": "markdown",
   "metadata": {},
   "source": [
-    "# Mini-batch gradient descent"
+    "## Mini-batch Gradient Descent"
   ]
  },
  {
@@ -494,7 +502,7 @@
   "cell_type": "markdown",
   "metadata": {},
   "source": [
-    "# Polynomial regression"
+    "# Polynomial Regression"
   ]
  },
  {
@@ -616,6 +624,13 @@
    "plt.show()"
   ]
  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# Learning Curves"
+   ]
+  },
  {
   "cell_type": "code",
   "execution_count": 35,
@@ -678,7 +693,14 @@
   "cell_type": "markdown",
   "metadata": {},
   "source": [
-    "# Regularized models"
+    "# Regularized Linear Models"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Ridge Regression"
   ]
  },
  {
@@ -772,6 +794,13 @@
    "sgd_reg.predict([[1.5]])"
   ]
  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Lasso Regression"
+   ]
+  },
  {
   "cell_type": "code",
   "execution_count": 43,
@@ -803,6 +832,13 @@
    "lasso_reg.predict([[1.5]])"
   ]
  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Elastic Net"
+   ]
+  },
  {
   "cell_type": "code",
   "execution_count": 45,
@@ -815,6 +851,13 @@
    "elastic_net.predict([[1.5]])"
   ]
  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Early Stopping"
+   ]
+  },
  {
   "cell_type": "code",
   "execution_count": 46,
@@ -829,13 +872,6 @@
    "X_train, X_val, y_train, y_val = train_test_split(X[:50], y[:50].ravel(), test_size=0.5, random_state=10)"
   ]
  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "Early stopping example:"
-   ]
-  },
  {
   "cell_type": "code",
   "execution_count": 47,
@@ -1029,7 +1065,14 @@
   "cell_type": "markdown",
   "metadata": {},
   "source": [
-    "# Logistic regression"
+    "# Logistic Regression"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Decision Boundaries"
   ]
  },
  {
@@ -1166,6 +1209,13 @@
    "log_reg.predict([[1.7], [1.5]])"
   ]
  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Softmax Regression"
+   ]
+  },
  {
   "cell_type": "code",
   "execution_count": 62,