Add explanations for the first convolutional layer example
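
The new markdown cells explain why calling the layer on the two test images yields a 4D output whose height and width match the input. As a rough sketch of that shape arithmetic (not code from the notebook): the helper name `conv2d_output_shape` is hypothetical, the 3 input channels are an assumption, and the formulas are the usual ones for channels-last `"SAME"` and `"VALID"` padding.

```python
import math


def conv2d_output_shape(input_shape, filters, kernel_size, strides=1, padding="SAME"):
    """Return (batch, height, width, channels) for a channels-last 2D convolution.

    Hypothetical helper for illustration only; it is not part of Keras.
    """
    batch, height, width, _ = input_shape
    if padding.upper() == "SAME":
        # SAME pads the input, so only the stride can shrink the output.
        out_height = math.ceil(height / strides)
        out_width = math.ceil(width / strides)
    elif padding.upper() == "VALID":
        # VALID adds no padding, so the kernel must fit entirely inside the input.
        out_height = math.ceil((height - kernel_size + 1) / strides)
        out_width = math.ceil((width - kernel_size + 1) / strides)
    else:
        raise ValueError(f"unknown padding: {padding!r}")
    # One output feature map per filter.
    return (batch, out_height, out_width, filters)


# Two 427x640 images (3 channels assumed), 2 filters, 7x7 kernel, stride 1
same_shape = conv2d_output_shape((2, 427, 640, 3), filters=2, kernel_size=7)
valid_shape = conv2d_output_shape((2, 427, 640, 3), filters=2, kernel_size=7,
                                  padding="VALID")
print(same_shape)   # (2, 427, 640, 2)
print(valid_shape)  # (2, 421, 634, 2)
```

With `"SAME"` padding and a stride of 1 the spatial size is preserved, which is the case the notebook describes. The `"VALID"` line is only there for contrast.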

2021-07-01 13:20:00 +12:00 · 2021-07-01 13:20:00 +12:00 · 341d8fe792
parent 3dd82863d1
commit 341d8fe792
1 changed files with 149 additions and 67 deletions
--- a/14_deep_computer_vision_with_cnns.ipynb
+++ b/14_deep_computer_vision_with_cnns.ipynb
@ -221,36 +221,118 @@
   "cell_type": "markdown",
   "metadata": {},
   "source": [
-    "Using `keras.layers.Conv2D()`:"
+    "Let's create a 2D convolutional layer, using `keras.layers.Conv2D()`:"
   ]
  },
  {
   "cell_type": "code",
-   "execution_count": 1,
+   "execution_count": 8,
   "metadata": {},
   "outputs": [],
   "source": [
-    "conv = keras.layers.Conv2D(filters=32, kernel_size=3, strides=1,\n",
+    "np.random.seed(42)\n",
+    "tf.random.set_seed(42)\n",
+    "\n",
+    "conv = keras.layers.Conv2D(filters=2, kernel_size=7, strides=1,\n",
    "                           padding=\"SAME\", activation=\"relu\", input_shape=outputs.shape)"
   ]
  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Let's call this layer, passing it the two test images:"
+   ]
+  },
  {
   "cell_type": "code",
-   "execution_count": 2,
+   "execution_count": 9,
   "metadata": {},
   "outputs": [],
   "source": [
-    "conv_outputs = conv(outputs)\n",
+    "conv_outputs = conv(images)\n",
+    "conv_outputs.shape "
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "The output is a 4D tensor. The dimensions are: batch size, height, width, channels. The first dimension (batch size) is 2 since there are 2 input images. The next two dimensions are the height and width of the output feature maps: since `padding=\"SAME\"` and `strides=1`, the output feature maps have the same height and width as the input images (in this case, 427×640). Lastly, this convolutional layer has 2 filters, so the last dimension is 2: there are 2 output feature maps per input image."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Since the filters are initialized randomly, they'll initially detect random patterns. Let's take a look at the 2 output feature maps for each image:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 10,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "plt.figure(figsize=(10,6))\n",
+    "for image_index in (0, 1):\n",
+    "    for feature_map_index in (0, 1):\n",
+    "        plt.subplot(2, 2, image_index * 2 + feature_map_index + 1)\n",
+    "        plot_image(crop(conv_outputs[image_index, :, :, feature_map_index]))\n",
+    "plt.show()"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Although the filters were initialized randomly, the second filter happens to act like an edge detector. Randomly initialized filters often act this way, which is fortunate since detecting edges is quite useful in image processing."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "If we want, we can set the filters to be the ones we manually defined earlier, and set the biases to zeros (in real life we will almost never need to set filters or biases manually, as the convolutional layer will just learn the appropriate filters and biases during training):"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 11,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "conv.set_weights([filters, np.zeros(2)])"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Now let's call this layer again on the same two images and check that the output feature maps highlight vertical lines and horizontal lines, respectively (as earlier):"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 12,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "conv_outputs = conv(images)\n",
    "conv_outputs.shape "
   ]
  },
  {
   "cell_type": "code",
-   "execution_count": 3,
+   "execution_count": 13,
   "metadata": {},
   "outputs": [],
   "source": [
-    "plot_image(conv_outputs[0, :, :, 0])\n",
+    "plt.figure(figsize=(10,6))\n",
+    "for image_index in (0, 1):\n",
+    "    for feature_map_index in (0, 1):\n",
+    "        plt.subplot(2, 2, image_index * 2 + feature_map_index + 1)\n",
+    "        plot_image(crop(conv_outputs[image_index, :, :, feature_map_index]))\n",
    "plt.show()"
   ]
  },
@ -263,7 +345,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 10,
+   "execution_count": 14,
   "metadata": {},
   "outputs": [],
   "source": [
@ -276,7 +358,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 11,
+   "execution_count": 15,
   "metadata": {},
   "outputs": [],
   "source": [
@ -289,7 +371,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 12,
+   "execution_count": 16,
   "metadata": {},
   "outputs": [],
   "source": [
@ -314,7 +396,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 13,
+   "execution_count": 17,
   "metadata": {},
   "outputs": [],
   "source": [
@ -353,7 +435,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 14,
+   "execution_count": 18,
   "metadata": {},
   "outputs": [],
   "source": [
@ -362,7 +444,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 15,
+   "execution_count": 19,
   "metadata": {},
   "outputs": [],
   "source": [
@ -372,7 +454,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 16,
+   "execution_count": 20,
   "metadata": {},
   "outputs": [],
   "source": [
@ -400,7 +482,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 17,
+   "execution_count": 21,
   "metadata": {},
   "outputs": [],
   "source": [
@ -421,7 +503,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 18,
+   "execution_count": 22,
   "metadata": {},
   "outputs": [],
   "source": [
@ -440,7 +522,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 19,
+   "execution_count": 23,
   "metadata": {},
   "outputs": [],
   "source": [
@ -453,7 +535,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 20,
+   "execution_count": 24,
   "metadata": {},
   "outputs": [],
   "source": [
@ -477,7 +559,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 21,
+   "execution_count": 25,
   "metadata": {},
   "outputs": [],
   "source": [
@ -486,7 +568,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 22,
+   "execution_count": 26,
   "metadata": {},
   "outputs": [],
   "source": [
@ -495,7 +577,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 23,
+   "execution_count": 27,
   "metadata": {},
   "outputs": [],
   "source": [
@ -522,7 +604,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 24,
+   "execution_count": 28,
   "metadata": {},
   "outputs": [],
   "source": [
@ -532,7 +614,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 25,
+   "execution_count": 29,
   "metadata": {},
   "outputs": [],
   "source": [
@ -549,7 +631,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 26,
+   "execution_count": 30,
   "metadata": {},
   "outputs": [],
   "source": [
@ -570,7 +652,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 27,
+   "execution_count": 31,
   "metadata": {},
   "outputs": [],
   "source": [
@ -599,7 +681,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 28,
+   "execution_count": 32,
   "metadata": {},
   "outputs": [],
   "source": [
@ -619,7 +701,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 29,
+   "execution_count": 33,
   "metadata": {},
   "outputs": [],
   "source": [
@ -654,7 +736,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 30,
+   "execution_count": 34,
   "metadata": {},
   "outputs": [],
   "source": [
@ -676,7 +758,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 31,
+   "execution_count": 35,
   "metadata": {},
   "outputs": [],
   "source": [
@ -692,7 +774,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 32,
+   "execution_count": 36,
   "metadata": {},
   "outputs": [],
   "source": [
@ -701,7 +783,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 33,
+   "execution_count": 37,
   "metadata": {},
   "outputs": [],
   "source": [
@ -712,7 +794,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 34,
+   "execution_count": 38,
   "metadata": {},
   "outputs": [],
   "source": [
@ -722,7 +804,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 35,
+   "execution_count": 39,
   "metadata": {},
   "outputs": [],
   "source": [
@ -733,7 +815,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 36,
+   "execution_count": 40,
   "metadata": {},
   "outputs": [],
   "source": [
@ -748,7 +830,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 37,
+   "execution_count": 41,
   "metadata": {},
   "outputs": [],
   "source": [
@ -758,7 +840,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 38,
+   "execution_count": 42,
   "metadata": {},
   "outputs": [],
   "source": [
@ -767,7 +849,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 39,
+   "execution_count": 43,
   "metadata": {},
   "outputs": [],
   "source": [
@ -788,7 +870,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 40,
+   "execution_count": 44,
   "metadata": {},
   "outputs": [],
   "source": [
@ -799,7 +881,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 41,
+   "execution_count": 45,
   "metadata": {},
   "outputs": [],
   "source": [
@ -808,7 +890,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 42,
+   "execution_count": 46,
   "metadata": {},
   "outputs": [],
   "source": [
@ -817,7 +899,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 43,
+   "execution_count": 47,
   "metadata": {},
   "outputs": [],
   "source": [
@ -827,7 +909,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 44,
+   "execution_count": 48,
   "metadata": {},
   "outputs": [],
   "source": [
@ -836,7 +918,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 45,
+   "execution_count": 49,
   "metadata": {},
   "outputs": [],
   "source": [
@ -853,7 +935,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 46,
+   "execution_count": 50,
   "metadata": {},
   "outputs": [],
   "source": [
@ -865,7 +947,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 47,
+   "execution_count": 51,
   "metadata": {},
   "outputs": [],
   "source": [
@ -890,7 +972,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 48,
+   "execution_count": 52,
   "metadata": {},
   "outputs": [],
   "source": [
@ -909,7 +991,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 49,
+   "execution_count": 53,
   "metadata": {},
   "outputs": [],
   "source": [
@ -946,7 +1028,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 50,
+   "execution_count": 54,
   "metadata": {},
   "outputs": [],
   "source": [
@ -963,7 +1045,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 51,
+   "execution_count": 55,
   "metadata": {},
   "outputs": [],
   "source": [
@ -980,7 +1062,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 52,
+   "execution_count": 56,
   "metadata": {},
   "outputs": [],
   "source": [
@ -993,7 +1075,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 53,
+   "execution_count": 57,
   "metadata": {},
   "outputs": [],
   "source": [
@ -1003,7 +1085,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 54,
+   "execution_count": 58,
   "metadata": {},
   "outputs": [],
   "source": [
@ -1022,7 +1104,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 55,
+   "execution_count": 59,
   "metadata": {},
   "outputs": [],
   "source": [
@ -1049,7 +1131,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 56,
+   "execution_count": 60,
   "metadata": {},
   "outputs": [],
   "source": [
@ -1067,7 +1149,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 57,
+   "execution_count": 61,
   "metadata": {},
   "outputs": [],
   "source": [
@ -1080,7 +1162,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 58,
+   "execution_count": 62,
   "metadata": {},
   "outputs": [],
   "source": [
@ -1096,7 +1178,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 59,
+   "execution_count": 63,
   "metadata": {},
   "outputs": [],
   "source": [
@ -1106,7 +1188,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 60,
+   "execution_count": 64,
   "metadata": {},
   "outputs": [],
   "source": [
@ -1135,7 +1217,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 61,
+   "execution_count": 65,
   "metadata": {},
   "outputs": [],
   "source": [
@ -1149,7 +1231,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 62,
+   "execution_count": 66,
   "metadata": {},
   "outputs": [],
   "source": [
@ -1172,7 +1254,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 63,
+   "execution_count": 67,
   "metadata": {},
   "outputs": [],
   "source": [
@ -1191,7 +1273,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 64,
+   "execution_count": 68,
   "metadata": {},
   "outputs": [],
   "source": [
@ -1204,7 +1286,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 65,
+   "execution_count": 69,
   "metadata": {
    "scrolled": true
   },
@ -1233,7 +1315,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 66,
+   "execution_count": 70,
   "metadata": {},
   "outputs": [],
   "source": [
@ -1278,7 +1360,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 67,
+   "execution_count": 71,
   "metadata": {},
   "outputs": [],
   "source": [
@ -1295,7 +1377,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 68,
+   "execution_count": 72,
   "metadata": {},
   "outputs": [],
   "source": [
@ -1378,7 +1460,7 @@
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
-   "version": "3.8.10"
+   "version": "3.7.10"
  },
  "nav_menu": {},
  "toc": {