{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Chapter 17 Autoencoders and GANs**"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"_This notebook contains all the sample code in chapter 17._"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<table align=\"left\">\n",
" <td>\n",
" <a target=\"_blank\" href=\"https://colab.research.google.com/github/ageron/handson-ml2/blob/master/17_autoencoders_and_gans.ipynb\"><img src=\"https://www.tensorflow.org/images/colab_logo_32px.png\" />Run in Google Colab</a>\n",
" </td>\n",
"</table>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Setup"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"First, let's import a few common modules, ensure Matplotlib plots figures inline, and prepare a function to save the figures. We also check that Python 3.5 or later is installed (although Python 2.x may work, it is deprecated so we strongly recommend you use Python 3 instead), as well as Scikit-Learn ≥0.20 and TensorFlow ≥2.0."
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"# Python ≥3.5 is required\n",
"import sys\n",
"assert sys.version_info >= (3, 5)\n",
"\n",
"# Scikit-Learn ≥0.20 is required\n",
"import sklearn\n",
"assert sklearn.__version__ >= \"0.20\"\n",
"\n",
"try:\n",
"    # %tensorflow_version only exists in Colab.\n",
"    %tensorflow_version 2.x\n",
"    IS_COLAB = True\n",
"except Exception:\n",
"    IS_COLAB = False\n",
"\n",
"# TensorFlow ≥2.0 is required\n",
"import tensorflow as tf\n",
"from tensorflow import keras\n",
"assert tf.__version__ >= \"2.0\"\n",
"\n",
"if not tf.config.list_physical_devices('GPU'):\n",
"    print(\"No GPU was detected. LSTMs and CNNs can be very slow without a GPU.\")\n",
"    if IS_COLAB:\n",
"        print(\"Go to Runtime > Change runtime and select a GPU hardware accelerator.\")\n",
"\n",
"# Common imports\n",
"import numpy as np\n",
"import os\n",
"\n",
"# to make this notebook's output stable across runs\n",
"np.random.seed(42)\n",
"tf.random.set_seed(42)\n",
"\n",
"# To plot pretty figures\n",
"%matplotlib inline\n",
"import matplotlib as mpl\n",
"import matplotlib.pyplot as plt\n",
"mpl.rc('axes', labelsize=14)\n",
"mpl.rc('xtick', labelsize=12)\n",
"mpl.rc('ytick', labelsize=12)\n",
"\n",
"# Where to save the figures\n",
"PROJECT_ROOT_DIR = \".\"\n",
"CHAPTER_ID = \"autoencoders\"\n",
"IMAGES_PATH = os.path.join(PROJECT_ROOT_DIR, \"images\", CHAPTER_ID)\n",
"os.makedirs(IMAGES_PATH, exist_ok=True)\n",
"\n",
"def save_fig(fig_id, tight_layout=True, fig_extension=\"png\", resolution=300):\n",
"    path = os.path.join(IMAGES_PATH, fig_id + \".\" + fig_extension)\n",
"    print(\"Saving figure\", fig_id)\n",
"    if tight_layout:\n",
"        plt.tight_layout()\n",
"    plt.savefig(path, format=fig_extension, dpi=resolution)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"A utility function to plot a grayscale 28x28 image:"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"def plot_image(image):\n",
"    plt.imshow(image, cmap=\"binary\")\n",
"    plt.axis(\"off\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# PCA with a linear Autoencoder"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Build 3D dataset:"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [],
"source": [
"np.random.seed(4)\n",
"\n",
"def generate_3d_data(m, w1=0.1, w2=0.3, noise=0.1):\n",
"    angles = np.random.rand(m) * 3 * np.pi / 2 - 0.5\n",
"    data = np.empty((m, 3))\n",
"    data[:, 0] = np.cos(angles) + np.sin(angles)/2 + noise * np.random.randn(m) / 2\n",
"    data[:, 1] = np.sin(angles) * 0.7 + noise * np.random.randn(m) / 2\n",
"    data[:, 2] = data[:, 0] * w1 + data[:, 1] * w2 + noise * np.random.randn(m)\n",
"    return data\n",
"\n",
"X_train = generate_3d_data(60)\n",
"X_train = X_train - X_train.mean(axis=0, keepdims=False)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now let's build the Autoencoder..."
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [],
"source": [
"np.random.seed(42)\n",
"tf.random.set_seed(42)\n",
"\n",
"encoder = keras.models.Sequential([keras.layers.Dense(2, input_shape=[3])])\n",
"decoder = keras.models.Sequential([keras.layers.Dense(3, input_shape=[2])])\n",
"autoencoder = keras.models.Sequential([encoder, decoder])\n",
"\n",
"autoencoder.compile(loss=\"mse\", optimizer=keras.optimizers.SGD(learning_rate=1.5))"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [],
"source": [
"history = autoencoder.fit(X_train, X_train, epochs=20)"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [],
"source": [
"codings = encoder.predict(X_train)"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [],
"source": [
"fig = plt.figure(figsize=(4,3))\n",
"plt.plot(codings[:, 0], codings[:, 1], \"b.\")\n",
"plt.xlabel(\"$z_1$\", fontsize=18)\n",
"plt.ylabel(\"$z_2$\", fontsize=18, rotation=0)\n",
"plt.grid(True)\n",
"save_fig(\"linear_autoencoder_pca_plot\")\n",
"plt.show()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Stacked Autoencoders"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Let's use Fashion MNIST:"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [],
"source": [
"(X_train_full, y_train_full), (X_test, y_test) = keras.datasets.fashion_mnist.load_data()\n",
"X_train_full = X_train_full.astype(np.float32) / 255\n",
"X_test = X_test.astype(np.float32) / 255\n",
"X_train, X_valid = X_train_full[:-5000], X_train_full[-5000:]\n",
"y_train, y_valid = y_train_full[:-5000], y_train_full[-5000:]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Train all layers at once"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Let's build a stacked Autoencoder with 3 hidden layers and 1 output layer (i.e., 2 stacked Autoencoders)."
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [],
"source": [
"def rounded_accuracy(y_true, y_pred):\n",
"    return keras.metrics.binary_accuracy(tf.round(y_true), tf.round(y_pred))"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [],
"source": [
"tf.random.set_seed(42)\n",
"np.random.seed(42)\n",
"\n",
"stacked_encoder = keras.models.Sequential([\n",
"    keras.layers.Flatten(input_shape=[28, 28]),\n",
"    keras.layers.Dense(100, activation=\"selu\"),\n",
"    keras.layers.Dense(30, activation=\"selu\"),\n",
"])\n",
"stacked_decoder = keras.models.Sequential([\n",
"    keras.layers.Dense(100, activation=\"selu\", input_shape=[30]),\n",
"    keras.layers.Dense(28 * 28, activation=\"sigmoid\"),\n",
"    keras.layers.Reshape([28, 28])\n",
"])\n",
"stacked_ae = keras.models.Sequential([stacked_encoder, stacked_decoder])\n",
"stacked_ae.compile(loss=\"binary_crossentropy\",\n",
"                   optimizer=keras.optimizers.SGD(learning_rate=1.5), metrics=[rounded_accuracy])\n",
"history = stacked_ae.fit(X_train, X_train, epochs=20,\n",
"                         validation_data=(X_valid, X_valid))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This function processes a few test images through the autoencoder and displays the original images and their reconstructions:"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [],
"source": [
"def show_reconstructions(model, images=X_valid, n_images=5):\n",
"    reconstructions = model.predict(images[:n_images])\n",
"    fig = plt.figure(figsize=(n_images * 1.5, 3))\n",
"    for image_index in range(n_images):\n",
"        plt.subplot(2, n_images, 1 + image_index)\n",
"        plot_image(images[image_index])\n",
"        plt.subplot(2, n_images, 1 + n_images + image_index)\n",
"        plot_image(reconstructions[image_index])"
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {},
"outputs": [],
"source": [
"show_reconstructions(stacked_ae)\n",
"save_fig(\"reconstruction_plot\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Visualizing Fashion MNIST"
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {},
"outputs": [],
"source": [
"np.random.seed(42)\n",
"\n",
"from sklearn.manifold import TSNE\n",
"\n",
"X_valid_compressed = stacked_encoder.predict(X_valid)\n",
"tsne = TSNE()\n",
"X_valid_2D = tsne.fit_transform(X_valid_compressed)\n",
"X_valid_2D = (X_valid_2D - X_valid_2D.min()) / (X_valid_2D.max() - X_valid_2D.min())"
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {},
"outputs": [],
"source": [
"plt.scatter(X_valid_2D[:, 0], X_valid_2D[:, 1], c=y_valid, s=10, cmap=\"tab10\")\n",
"plt.axis(\"off\")\n",
"plt.show()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Let's make this diagram a bit prettier:"
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {},
"outputs": [],
"source": [
"# adapted from https://scikit-learn.org/stable/auto_examples/manifold/plot_lle_digits.html\n",
"plt.figure(figsize=(10, 8))\n",
"cmap = plt.cm.tab10\n",
"plt.scatter(X_valid_2D[:, 0], X_valid_2D[:, 1], c=y_valid, s=10, cmap=cmap)\n",
"image_positions = np.array([[1., 1.]])\n",
"for index, position in enumerate(X_valid_2D):\n",
" dist = np.sum((position - image_positions) ** 2, axis=1)\n",
" if np.min(dist) > 0.02: # if far enough from other images\n",
" image_positions = np.r_[image_positions, [position]]\n",
" imagebox = mpl.offsetbox.AnnotationBbox(\n",
" mpl.offsetbox.OffsetImage(X_valid[index], cmap=\"binary\"),\n",
" position, bboxprops={\"edgecolor\": cmap(y_valid[index]), \"lw\": 2})\n",
" plt.gca().add_artist(imagebox)\n",
"plt.axis(\"off\")\n",
"save_fig(\"fashion_mnist_visualization_plot\")\n",
"plt.show()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Tying weights"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"It is common to tie the weights of the encoder and the decoder by using the transpose of the encoder's weights as the decoder's weights. Each tied decoder layer still has its own bias vector. For this, we need a custom layer."
]
},
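{
"cell_type": "markdown",
"metadata": {},
"source": [
"To make the idea concrete, here is a minimal NumPy sketch of a single tied decoder layer, assuming a hypothetical encoder kernel `W` of shape `[784, 100]`: the decoder multiplies by the transpose of `W` and only owns a separate bias vector. The names `W`, `b_dec` and `codings` are illustrative assumptions and are not used elsewhere in this notebook."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"\n",
"rng = np.random.default_rng(42)\n",
"W = rng.normal(size=(784, 100))          # hypothetical encoder kernel: 784 inputs -> 100 units\n",
"b_dec = np.zeros(784)                    # the tied decoder layer only owns its bias vector\n",
"\n",
"codings = rng.normal(size=(5, 100))      # a batch of 5 hypothetical codings\n",
"reconstructions = codings @ W.T + b_dec  # decoder output before the activation function\n",
"print(reconstructions.shape)             # (5, 784)"
]
},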
{
"cell_type": "code",
"execution_count": 16,
"metadata": {},
"outputs": [],
"source": [
"class DenseTranspose(keras.layers.Layer):\n",
"    def __init__(self, dense, activation=None, **kwargs):\n",
"        self.dense = dense\n",
"        self.activation = keras.activations.get(activation)\n",
"        super().__init__(**kwargs)\n",
"    def build(self, batch_input_shape):\n",
"        self.biases = self.add_weight(name=\"bias\",\n",
"                                      shape=[self.dense.input_shape[-1]],\n",
"                                      initializer=\"zeros\")\n",
"        super().build(batch_input_shape)\n",
"    def call(self, inputs):\n",
"        z = tf.matmul(inputs, self.dense.weights[0], transpose_b=True)\n",
"        return self.activation(z + self.biases)"
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {},
"outputs": [],
"source": [
"keras.backend.clear_session()\n",
"tf.random.set_seed(42)\n",
"np.random.seed(42)\n",
"\n",
"dense_1 = keras.layers.Dense(100, activation=\"selu\")\n",
"dense_2 = keras.layers.Dense(30, activation=\"selu\")\n",
"\n",
"tied_encoder = keras.models.Sequential([\n",
"    keras.layers.Flatten(input_shape=[28, 28]),\n",
"    dense_1,\n",
"    dense_2\n",
"])\n",
"\n",
"tied_decoder = keras.models.Sequential([\n",
"    DenseTranspose(dense_2, activation=\"selu\"),\n",
"    DenseTranspose(dense_1, activation=\"sigmoid\"),\n",
"    keras.layers.Reshape([28, 28])\n",
"])\n",
"\n",
"tied_ae = keras.models.Sequential([tied_encoder, tied_decoder])\n",
"\n",
"tied_ae.compile(loss=\"binary_crossentropy\",\n",
"                optimizer=keras.optimizers.SGD(learning_rate=1.5), metrics=[rounded_accuracy])\n",
"history = tied_ae.fit(X_train, X_train, epochs=10,\n",
"                      validation_data=(X_valid, X_valid))"
]
},
{
"cell_type": "code",
"execution_count": 18,
"metadata": {
"scrolled": true
},
"outputs": [],
"source": [
"show_reconstructions(tied_ae)\n",
"plt.show()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Training one Autoencoder at a Time"
]
},
{
"cell_type": "code",
"execution_count": 19,
"metadata": {},
"outputs": [],
"source": [
"def train_autoencoder(n_neurons, X_train, X_valid, loss, optimizer,\n",
"                      n_epochs=10, output_activation=None, metrics=None):\n",
"    n_inputs = X_train.shape[-1]\n",
"    encoder = keras.models.Sequential([\n",
"        keras.layers.Dense(n_neurons, activation=\"selu\", input_shape=[n_inputs])\n",
"    ])\n",
"    decoder = keras.models.Sequential([\n",
"        keras.layers.Dense(n_inputs, activation=output_activation),\n",
"    ])\n",
"    autoencoder = keras.models.Sequential([encoder, decoder])\n",
"    autoencoder.compile(optimizer, loss, metrics=metrics)\n",
"    autoencoder.fit(X_train, X_train, epochs=n_epochs,\n",
"                    validation_data=(X_valid, X_valid))\n",
"    return encoder, decoder, encoder(X_train), encoder(X_valid)"
]
},
{
"cell_type": "code",
"execution_count": 20,
"metadata": {},
"outputs": [],
"source": [
"tf.random.set_seed(42)\n",
"np.random.seed(42)\n",
"\n",
"K = keras.backend\n",
"X_train_flat = K.batch_flatten(X_train) # equivalent to .reshape(-1, 28 * 28)\n",
"X_valid_flat = K.batch_flatten(X_valid)\n",
"enc1, dec1, X_train_enc1, X_valid_enc1 = train_autoencoder(\n",
"    100, X_train_flat, X_valid_flat, \"binary_crossentropy\",\n",
"    keras.optimizers.SGD(learning_rate=1.5), output_activation=\"sigmoid\",\n",
"    metrics=[rounded_accuracy])\n",
"enc2, dec2, _, _ = train_autoencoder(\n",
"    30, X_train_enc1, X_valid_enc1, \"mse\", keras.optimizers.SGD(learning_rate=0.05),\n",
"    output_activation=\"selu\")"
]
},
{
"cell_type": "code",
"execution_count": 21,
"metadata": {},
"outputs": [],
"source": [
"stacked_ae_1_by_1 = keras.models.Sequential([\n",
"    keras.layers.Flatten(input_shape=[28, 28]),\n",
"    enc1, enc2, dec2, dec1,\n",
"    keras.layers.Reshape([28, 28])\n",
"])"
]
},
{
"cell_type": "code",
"execution_count": 22,
"metadata": {},
"outputs": [],
"source": [
"show_reconstructions(stacked_ae_1_by_1)\n",
"plt.show()"
]
},
{
"cell_type": "code",
"execution_count": 23,
"metadata": {},
"outputs": [],
"source": [
"stacked_ae_1_by_1.compile(loss=\"binary_crossentropy\",\n",
"                          optimizer=keras.optimizers.SGD(learning_rate=0.1), metrics=[rounded_accuracy])\n",
"history = stacked_ae_1_by_1.fit(X_train, X_train, epochs=10,\n",
"                                validation_data=(X_valid, X_valid))"
]
},
{
"cell_type": "code",
"execution_count": 24,
"metadata": {},
"outputs": [],
"source": [
"show_reconstructions(stacked_ae_1_by_1)\n",
"plt.show()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Using Convolutional Layers Instead of Dense Layers"
]
},
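{
"cell_type": "markdown",
"metadata": {},
"source": [
"Let's build a convolutional Autoencoder: the encoder stacks 3 convolutional layers, each followed by a max pooling layer (28x28 → 14x14 → 7x7 → 3x3), and the decoder uses 3 transposed convolutional layers to upsample the codings back to 28x28."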
]
},
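{
"cell_type": "markdown",
"metadata": {},
"source": [
"The shape bookkeeping of the convolutional Autoencoder in this section can be checked with a few lines of plain Python. This is only a sketch: the helper names `max_pool_size` and `conv_transpose_size` are hypothetical, and the formulas assume pooling strides equal to the pool size, no dilation and no `output_padding`. The `Conv2D` layers use `\"SAME\"` padding with a stride of 1, so they leave the height and width unchanged."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def max_pool_size(size, pool_size=2):\n",
"    # MaxPool2D with its default \"valid\" padding and strides equal to pool_size\n",
"    return size // pool_size\n",
"\n",
"def conv_transpose_size(size, kernel_size=3, strides=2, padding=\"SAME\"):\n",
"    # Output size of Conv2DTranspose without output_padding or dilation\n",
"    if padding == \"SAME\":\n",
"        return size * strides\n",
"    return size * strides + max(kernel_size - strides, 0)  # \"VALID\"\n",
"\n",
"size = 28\n",
"for _ in range(3):  # three MaxPool2D layers in the encoder\n",
"    size = max_pool_size(size)\n",
"print(\"codings height/width:\", size)  # 3\n",
"\n",
"for padding in [\"VALID\", \"SAME\", \"SAME\"]:  # three Conv2DTranspose layers in the decoder\n",
"    size = conv_transpose_size(size, padding=padding)\n",
"print(\"reconstruction height/width:\", size)  # 28"
]
},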
{
"cell_type": "code",
"execution_count": 25,
"metadata": {},
"outputs": [],
"source": [
"tf.random.set_seed(42)\n",
"np.random.seed(42)\n",
"\n",
"conv_encoder = keras.models.Sequential([\n",
"    keras.layers.Reshape([28, 28, 1], input_shape=[28, 28]),\n",
"    keras.layers.Conv2D(16, kernel_size=3, padding=\"SAME\", activation=\"selu\"),\n",
"    keras.layers.MaxPool2D(pool_size=2),\n",
"    keras.layers.Conv2D(32, kernel_size=3, padding=\"SAME\", activation=\"selu\"),\n",
"    keras.layers.MaxPool2D(pool_size=2),\n",
"    keras.layers.Conv2D(64, kernel_size=3, padding=\"SAME\", activation=\"selu\"),\n",
"    keras.layers.MaxPool2D(pool_size=2)\n",
"])\n",
"conv_decoder = keras.models.Sequential([\n",
"    keras.layers.Conv2DTranspose(32, kernel_size=3, strides=2, padding=\"VALID\", activation=\"selu\",\n",
"                                 input_shape=[3, 3, 64]),\n",
"    keras.layers.Conv2DTranspose(16, kernel_size=3, strides=2, padding=\"SAME\", activation=\"selu\"),\n",
"    keras.layers.Conv2DTranspose(1, kernel_size=3, strides=2, padding=\"SAME\", activation=\"sigmoid\"),\n",
"    keras.layers.Reshape([28, 28])\n",
"])\n",
"conv_ae = keras.models.Sequential([conv_encoder, conv_decoder])\n",
"\n",
"conv_ae.compile(loss=\"binary_crossentropy\", optimizer=keras.optimizers.SGD(learning_rate=1.0),\n",
"                metrics=[rounded_accuracy])\n",
"history = conv_ae.fit(X_train, X_train, epochs=5,\n",
"                      validation_data=(X_valid, X_valid))"
]
},
{
"cell_type": "code",
"execution_count": 26,
"metadata": {},
"outputs": [],
"source": [
"conv_encoder.summary()\n",
"conv_decoder.summary()"
]
},
{
"cell_type": "code",
"execution_count": 27,
"metadata": {},
"outputs": [],
"source": [
"show_reconstructions(conv_ae)\n",
"plt.show()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Recurrent Autoencoders"
]
},
{
"cell_type": "code",
"execution_count": 28,
"metadata": {},
"outputs": [],
"source": [
"recurrent_encoder = keras.models.Sequential([\n",
"    keras.layers.LSTM(100, return_sequences=True, input_shape=[28, 28]),\n",
"    keras.layers.LSTM(30)\n",
"])\n",
"recurrent_decoder = keras.models.Sequential([\n",
"    keras.layers.RepeatVector(28, input_shape=[30]),\n",
"    keras.layers.LSTM(100, return_sequences=True),\n",
"    keras.layers.TimeDistributed(keras.layers.Dense(28, activation=\"sigmoid\"))\n",
"])\n",
"recurrent_ae = keras.models.Sequential([recurrent_encoder, recurrent_decoder])\n",
"recurrent_ae.compile(loss=\"binary_crossentropy\", optimizer=keras.optimizers.SGD(0.1),\n",
"                     metrics=[rounded_accuracy])"
]
},
{
"cell_type": "code",
"execution_count": 29,
"metadata": {},
"outputs": [],
"source": [
"history = recurrent_ae.fit(X_train, X_train, epochs=10, validation_data=(X_valid, X_valid))"
]
},
{
"cell_type": "code",
"execution_count": 30,
"metadata": {},
"outputs": [],
"source": [
"show_reconstructions(recurrent_ae)\n",
"plt.show()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Stacked denoising Autoencoder"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Using Gaussian noise:"
]
},
{
"cell_type": "code",
"execution_count": 31,
"metadata": {},
"outputs": [],
"source": [
"tf.random.set_seed(42)\n",
"np.random.seed(42)\n",
"\n",
"denoising_encoder = keras.models.Sequential([\n",
"    keras.layers.Flatten(input_shape=[28, 28]),\n",
"    keras.layers.GaussianNoise(0.2),\n",
"    keras.layers.Dense(100, activation=\"selu\"),\n",
"    keras.layers.Dense(30, activation=\"selu\")\n",
"])\n",
"denoising_decoder = keras.models.Sequential([\n",
"    keras.layers.Dense(100, activation=\"selu\", input_shape=[30]),\n",
"    keras.layers.Dense(28 * 28, activation=\"sigmoid\"),\n",
"    keras.layers.Reshape([28, 28])\n",
"])\n",
"denoising_ae = keras.models.Sequential([denoising_encoder, denoising_decoder])\n",
"denoising_ae.compile(loss=\"binary_crossentropy\", optimizer=keras.optimizers.SGD(learning_rate=1.0),\n",
"                     metrics=[rounded_accuracy])\n",
"history = denoising_ae.fit(X_train, X_train, epochs=10,\n",
"                           validation_data=(X_valid, X_valid))"
]
},
{
"cell_type": "code",
"execution_count": 32,
"metadata": {},
"outputs": [],
"source": [
"tf.random.set_seed(42)\n",
"np.random.seed(42)\n",
"\n",
"noise = keras.layers.GaussianNoise(0.2)\n",
"show_reconstructions(denoising_ae, noise(X_valid, training=True))\n",
"plt.show()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Using dropout:"
]
},
{
"cell_type": "code",
"execution_count": 33,
"metadata": {},
"outputs": [],
"source": [
"tf.random.set_seed(42)\n",
"np.random.seed(42)\n",
"\n",
"dropout_encoder = keras.models.Sequential([\n",
"    keras.layers.Flatten(input_shape=[28, 28]),\n",
"    keras.layers.Dropout(0.5),\n",
"    keras.layers.Dense(100, activation=\"selu\"),\n",
"    keras.layers.Dense(30, activation=\"selu\")\n",
"])\n",
"dropout_decoder = keras.models.Sequential([\n",
"    keras.layers.Dense(100, activation=\"selu\", input_shape=[30]),\n",
"    keras.layers.Dense(28 * 28, activation=\"sigmoid\"),\n",
"    keras.layers.Reshape([28, 28])\n",
"])\n",
"dropout_ae = keras.models.Sequential([dropout_encoder, dropout_decoder])\n",
"dropout_ae.compile(loss=\"binary_crossentropy\", optimizer=keras.optimizers.SGD(learning_rate=1.0),\n",
"                   metrics=[rounded_accuracy])\n",
"history = dropout_ae.fit(X_train, X_train, epochs=10,\n",
"                         validation_data=(X_valid, X_valid))"
]
},
{
"cell_type": "code",
"execution_count": 34,
"metadata": {},
"outputs": [],
"source": [
"tf.random.set_seed(42)\n",
"np.random.seed(42)\n",
"\n",
"dropout = keras.layers.Dropout(0.5)\n",
"show_reconstructions(dropout_ae, dropout(X_valid, training=True))\n",
"save_fig(\"dropout_denoising_plot\", tight_layout=False)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Sparse Autoencoder"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Let's build a simple stacked autoencoder, so we can compare it to the sparse autoencoders we will build. This time we will use the sigmoid activation function for the coding layer, to ensure that the coding values range from 0 to 1:"
]
},
{
"cell_type": "code",
"execution_count": 35,
"metadata": {},
"outputs": [],
"source": [
"tf.random.set_seed(42)\n",
"np.random.seed(42)\n",
"\n",
"simple_encoder = keras.models.Sequential([\n",
"    keras.layers.Flatten(input_shape=[28, 28]),\n",
"    keras.layers.Dense(100, activation=\"selu\"),\n",
"    keras.layers.Dense(30, activation=\"sigmoid\"),\n",
"])\n",
"simple_decoder = keras.models.Sequential([\n",
"    keras.layers.Dense(100, activation=\"selu\", input_shape=[30]),\n",
"    keras.layers.Dense(28 * 28, activation=\"sigmoid\"),\n",
"    keras.layers.Reshape([28, 28])\n",
"])\n",
"simple_ae = keras.models.Sequential([simple_encoder, simple_decoder])\n",
"simple_ae.compile(loss=\"binary_crossentropy\", optimizer=keras.optimizers.SGD(learning_rate=1.0),\n",
"                  metrics=[rounded_accuracy])\n",
"history = simple_ae.fit(X_train, X_train, epochs=10,\n",
"                        validation_data=(X_valid, X_valid))"
]
},
{
"cell_type": "code",
"execution_count": 36,
"metadata": {},
"outputs": [],
"source": [
"show_reconstructions(simple_ae)\n",
"plt.show()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Let's create a couple of functions to plot nice activation histograms:"
]
},
{
"cell_type": "code",
"execution_count": 37,
"metadata": {},
"outputs": [],
"source": [
"def plot_percent_hist(ax, data, bins):\n",
"    counts, _ = np.histogram(data, bins=bins)\n",
"    widths = bins[1:] - bins[:-1]\n",
"    x = bins[:-1] + widths / 2\n",
"    ax.bar(x, counts / len(data), width=widths*0.8)\n",
"    ax.xaxis.set_ticks(bins)\n",
"    ax.yaxis.set_major_formatter(mpl.ticker.FuncFormatter(\n",
"        lambda y, position: \"{}%\".format(int(np.round(100 * y)))))\n",
"    ax.grid(True)"
]
},
{
"cell_type": "code",
"execution_count": 38,
"metadata": {},
"outputs": [],
"source": [
"def plot_activations_histogram(encoder, height=1, n_bins=10):\n",
"    X_valid_codings = encoder(X_valid).numpy()\n",
"    activation_means = X_valid_codings.mean(axis=0)\n",
"    mean = activation_means.mean()\n",
"    bins = np.linspace(0, 1, n_bins + 1)\n",
"\n",
"    fig, [ax1, ax2] = plt.subplots(figsize=(10, 3), nrows=1, ncols=2, sharey=True)\n",
"    plot_percent_hist(ax1, X_valid_codings.ravel(), bins)\n",
"    ax1.plot([mean, mean], [0, height], \"k--\", label=\"Overall Mean = {:.2f}\".format(mean))\n",
"    ax1.legend(loc=\"upper center\", fontsize=14)\n",
"    ax1.set_xlabel(\"Activation\")\n",
"    ax1.set_ylabel(\"% Activations\")\n",
"    ax1.axis([0, 1, 0, height])\n",
"    plot_percent_hist(ax2, activation_means, bins)\n",
"    ax2.plot([mean, mean], [0, height], \"k--\")\n",
"    ax2.set_xlabel(\"Neuron Mean Activation\")\n",
"    ax2.set_ylabel(\"% Neurons\")\n",
"    ax2.axis([0, 1, 0, height])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Let's use these functions to plot histograms of the activations of the encoding layer. The histogram on the left shows the distribution of all the activations. You can see that values close to 0 or 1 are more frequent overall, which is consistent with the saturating nature of the sigmoid function. The histogram on the right shows the distribution of mean neuron activations: you can see that most neurons have a mean activation close to 0.5. Both histograms tell us that each neuron tends to either fire close to 0 or 1, with about 50% probability each. However, some neurons fire almost all the time (right side of the right histogram)."
]
},
{
"cell_type": "code",
"execution_count": 39,
"metadata": {},
"outputs": [],
"source": [
"plot_activations_histogram(simple_encoder, height=0.35)\n",
"plt.show()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now let's add $\\ell_1$ regularization to the coding layer:"
]
},
{
"cell_type": "code",
"execution_count": 40,
"metadata": {},
"outputs": [],
"source": [
"tf.random.set_seed(42)\n",
"np.random.seed(42)\n",
"\n",
"sparse_l1_encoder = keras.models.Sequential([\n",
"    keras.layers.Flatten(input_shape=[28, 28]),\n",
"    keras.layers.Dense(100, activation=\"selu\"),\n",
"    keras.layers.Dense(300, activation=\"sigmoid\"),\n",
"    keras.layers.ActivityRegularization(l1=1e-3)  # Alternatively, you could add\n",
"                                                  # activity_regularizer=keras.regularizers.l1(1e-3)\n",
"                                                  # to the previous layer.\n",
"])\n",
"sparse_l1_decoder = keras.models.Sequential([\n",
"    keras.layers.Dense(100, activation=\"selu\", input_shape=[300]),\n",
"    keras.layers.Dense(28 * 28, activation=\"sigmoid\"),\n",
"    keras.layers.Reshape([28, 28])\n",
"])\n",
"sparse_l1_ae = keras.models.Sequential([sparse_l1_encoder, sparse_l1_decoder])\n",
"sparse_l1_ae.compile(loss=\"binary_crossentropy\", optimizer=keras.optimizers.SGD(learning_rate=1.0),\n",
"                     metrics=[rounded_accuracy])\n",
"history = sparse_l1_ae.fit(X_train, X_train, epochs=10,\n",
"                           validation_data=(X_valid, X_valid))"
]
},
{
"cell_type": "code",
"execution_count": 41,
"metadata": {},
"outputs": [],
"source": [
"show_reconstructions(sparse_l1_ae)"
]
},
{
"cell_type": "code",
"execution_count": 42,
"metadata": {},
"outputs": [],
"source": [
"plot_activations_histogram(sparse_l1_encoder, height=1.)\n",
"plt.show()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Let's use the KL Divergence loss instead to ensure sparsity, and target 10% sparsity rather than 0%:"
]
},
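{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a rough sketch of what this sparsity penalty computes, the cell below evaluates the KL divergence between a target sparsity `p` and a few measured mean activations `q`, treating both as Bernoulli parameters. The helper name `kl_sparsity_penalty`, the clipping constant and the sample values are illustrative assumptions. The penalty is zero when `q` equals `p` and grows as `q` moves away from the target."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"\n",
"def kl_sparsity_penalty(p, q):\n",
"    # KL divergence between Bernoulli(p) (target) and Bernoulli(q) (measured)\n",
"    q = np.clip(q, 1e-7, 1 - 1e-7)  # avoid log(0) and division by zero\n",
"    return p * np.log(p / q) + (1 - p) * np.log((1 - p) / (1 - q))\n",
"\n",
"target = 0.1\n",
"mean_activations = np.array([0.05, 0.1, 0.3, 0.9])  # hypothetical per-neuron means\n",
"print(kl_sparsity_penalty(target, mean_activations))"
]
},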
{
"cell_type": "code",
"execution_count": 43,
"metadata": {},
"outputs": [],
"source": [
"p = 0.1\n",
"q = np.linspace(0.001, 0.999, 500)\n",
"kl_div = p * np.log(p / q) + (1 - p) * np.log((1 - p) / (1 - q))\n",
"mse = (p - q)**2\n",
"mae = np.abs(p - q)\n",
"plt.plot([p, p], [0, 0.3], \"k:\")\n",
"plt.text(0.05, 0.32, \"Target\\nsparsity\", fontsize=14)\n",
"plt.plot(q, kl_div, \"b-\", label=\"KL divergence\")\n",
"plt.plot(q, mae, \"g--\", label=r\"MAE ($\\ell_1$)\")\n",
"plt.plot(q, mse, \"r--\", linewidth=1, label=r\"MSE ($\\ell_2$)\")\n",
"plt.legend(loc=\"upper left\", fontsize=14)\n",
"plt.xlabel(\"Actual sparsity\")\n",
"plt.ylabel(\"Cost\", rotation=0)\n",
"plt.axis([0, 1, 0, 0.95])\n",
"save_fig(\"sparsity_loss_plot\")"
]
},
{
"cell_type": "code",
"execution_count": 44,
"metadata": {},
"outputs": [],
"source": [
"K = keras.backend\n",
"kl_divergence = keras.losses.kullback_leibler_divergence\n",
"\n",
"class KLDivergenceRegularizer(keras.regularizers.Regularizer):\n",
"    def __init__(self, weight, target=0.1):\n",
"        self.weight = weight\n",
"        self.target = target\n",
"    def __call__(self, inputs):\n",
"        mean_activities = K.mean(inputs, axis=0)\n",
"        return self.weight * (\n",
"            kl_divergence(self.target, mean_activities) +\n",
"            kl_divergence(1. - self.target, 1. - mean_activities))"
]
},
{
"cell_type": "code",
"execution_count": 45,
"metadata": {},
"outputs": [],
"source": [
"tf.random.set_seed(42)\n",
"np.random.seed(42)\n",
"\n",
"kld_reg = KLDivergenceRegularizer(weight=0.05, target=0.1)\n",
"sparse_kl_encoder = keras.models.Sequential([\n",
"    keras.layers.Flatten(input_shape=[28, 28]),\n",
"    keras.layers.Dense(100, activation=\"selu\"),\n",
"    keras.layers.Dense(300, activation=\"sigmoid\", activity_regularizer=kld_reg)\n",
"])\n",
"sparse_kl_decoder = keras.models.Sequential([\n",
"    keras.layers.Dense(100, activation=\"selu\", input_shape=[300]),\n",
"    keras.layers.Dense(28 * 28, activation=\"sigmoid\"),\n",
"    keras.layers.Reshape([28, 28])\n",
"])\n",
"sparse_kl_ae = keras.models.Sequential([sparse_kl_encoder, sparse_kl_decoder])\n",
"sparse_kl_ae.compile(loss=\"binary_crossentropy\", optimizer=keras.optimizers.SGD(learning_rate=1.0),\n",
"                     metrics=[rounded_accuracy])\n",
"history = sparse_kl_ae.fit(X_train, X_train, epochs=10,\n",
"                           validation_data=(X_valid, X_valid))"
]
},
{
"cell_type": "code",
"execution_count": 46,
"metadata": {},
"outputs": [],
"source": [
"show_reconstructions(sparse_kl_ae)"
]
},
{
"cell_type": "code",
"execution_count": 47,
"metadata": {},
"outputs": [],
"source": [
"plot_activations_histogram(sparse_kl_encoder)\n",
"save_fig(\"sparse_autoencoder_plot\")\n",
"plt.show()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Variational Autoencoder"
]
},
{
"cell_type": "code",
"execution_count": 48,
"metadata": {},
"outputs": [],
"source": [
"class Sampling(keras.layers.Layer):\n",
"    def call(self, inputs):\n",
"        mean, log_var = inputs\n",
"        # reparameterization trick: mean + sigma * epsilon, with sigma = exp(log_var / 2)\n",
"        return K.random_normal(tf.shape(log_var)) * K.exp(log_var / 2) + mean"
]
},
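{
"cell_type": "markdown",
"metadata": {},
"source": [
"The `Sampling` layer draws codings from a Gaussian with the given mean and log-variance by scaling and shifting standard normal noise. Here is a minimal NumPy sketch of the same computation (not part of the book's code; the values of `mean_demo` and `log_var_demo` are arbitrary and chosen only for illustration). The empirical mean and standard deviation of the samples should come out close to the requested ones:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"\n",
"rng_demo = np.random.RandomState(42)\n",
"mean_demo = np.array([0., 2.])\n",
"log_var_demo = np.log(np.array([1., 0.25]))  # variances 1 and 0.25\n",
"eps_demo = rng_demo.randn(100000, 2)         # standard normal noise\n",
"samples_demo = eps_demo * np.exp(log_var_demo / 2) + mean_demo\n",
"print(samples_demo.mean(axis=0))  # should be close to mean_demo\n",
"print(samples_demo.std(axis=0))   # should be close to the square roots of the variances"
]
},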
{
"cell_type": "code",
"execution_count": 49,
"metadata": {},
"outputs": [],
"source": [
"tf.random.set_seed(42)\n",
"np.random.seed(42)\n",
"\n",
"codings_size = 10\n",
"\n",
"inputs = keras.layers.Input(shape=[28, 28])\n",
"z = keras.layers.Flatten()(inputs)\n",
"z = keras.layers.Dense(150, activation=\"selu\")(z)\n",
"z = keras.layers.Dense(100, activation=\"selu\")(z)\n",
"codings_mean = keras.layers.Dense(codings_size)(z)\n",
"codings_log_var = keras.layers.Dense(codings_size)(z)\n",
"codings = Sampling()([codings_mean, codings_log_var])\n",
"variational_encoder = keras.models.Model(\n",
"    inputs=[inputs], outputs=[codings_mean, codings_log_var, codings])\n",
"\n",
"decoder_inputs = keras.layers.Input(shape=[codings_size])\n",
"x = keras.layers.Dense(100, activation=\"selu\")(decoder_inputs)\n",
"x = keras.layers.Dense(150, activation=\"selu\")(x)\n",
"x = keras.layers.Dense(28 * 28, activation=\"sigmoid\")(x)\n",
"outputs = keras.layers.Reshape([28, 28])(x)\n",
"variational_decoder = keras.models.Model(inputs=[decoder_inputs], outputs=[outputs])\n",
"\n",
"_, _, codings = variational_encoder(inputs)\n",
"reconstructions = variational_decoder(codings)\n",
"variational_ae = keras.models.Model(inputs=[inputs], outputs=[reconstructions])\n",
"\n",
"# KL divergence between the codings distribution and a standard Gaussian\n",
"latent_loss = -0.5 * K.sum(\n",
"    1 + codings_log_var - K.exp(codings_log_var) - K.square(codings_mean),\n",
"    axis=-1)\n",
"# divide by 784 to match the scale of the per-pixel reconstruction loss\n",
"variational_ae.add_loss(K.mean(latent_loss) / 784.)\n",
"variational_ae.compile(loss=\"binary_crossentropy\", optimizer=\"rmsprop\", metrics=[rounded_accuracy])\n",
"history = variational_ae.fit(X_train, X_train, epochs=25, batch_size=128,\n",
"                             validation_data=(X_valid, X_valid))"
]
},
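{
"cell_type": "markdown",
"metadata": {},
"source": [
"The latent loss above is the KL divergence between the Gaussian defined by `codings_mean` and `codings_log_var` and a standard normal distribution, summed over the coding dimensions. As an illustrative NumPy check (not part of the book's code; the helper `latent_loss_np` is hypothetical and written only for this example), the loss vanishes when the mean is 0 and the log-variance is 0, and each dimension with unit mean and unit variance adds 0.5:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"\n",
"def latent_loss_np(mean, log_var):\n",
"    # same formula as the Keras latent loss, applied to a single instance\n",
"    return -0.5 * np.sum(1 + log_var - np.exp(log_var) - np.square(mean), axis=-1)\n",
"\n",
"# codings that already match N(0, 1): the loss is zero\n",
"print(np.isclose(latent_loss_np(np.zeros(10), np.zeros(10)), 0.))\n",
"# ten dimensions with mean 1 and variance 1: 10 * 0.5\n",
"print(latent_loss_np(np.ones(10), np.zeros(10)))"
]
},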
{
"cell_type": "code",
"execution_count": 50,
"metadata": {
"scrolled": true
},
"outputs": [],
"source": [
"show_reconstructions(variational_ae)\n",
"plt.show()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Generate Fashion Images"
]
},
{
"cell_type": "code",
"execution_count": 51,
"metadata": {},
"outputs": [],
"source": [
"def plot_multiple_images(images, n_cols=None):\n",
"    n_cols = n_cols or len(images)\n",
"    n_rows = (len(images) - 1) // n_cols + 1\n",
"    if images.shape[-1] == 1:\n",
"        images = np.squeeze(images, axis=-1)\n",
"    plt.figure(figsize=(n_cols, n_rows))\n",
"    for index, image in enumerate(images):\n",
"        plt.subplot(n_rows, n_cols, index + 1)\n",
"        plt.imshow(image, cmap=\"binary\")\n",
"        plt.axis(\"off\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Let's generate a few random codings, decode them and plot the resulting images:"
]
},
{
"cell_type": "code",
"execution_count": 52,
"metadata": {},
"outputs": [],
"source": [
"tf.random.set_seed(42)\n",
"\n",
"codings = tf.random.normal(shape=[12, codings_size])\n",
"images = variational_decoder(codings).numpy()\n",
"plot_multiple_images(images, 4)\n",
"save_fig(\"vae_generated_images_plot\", tight_layout=False)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now let's perform semantic interpolation between these images:"
]
},
{
"cell_type": "code",
"execution_count": 53,
"metadata": {},
"outputs": [],
"source": [
"tf.random.set_seed(42)\n",
"np.random.seed(42)\n",
"\n",
"codings_grid = tf.reshape(codings, [1, 3, 4, codings_size])\n",
"larger_grid = tf.image.resize(codings_grid, size=[5, 7])\n",
"interpolated_codings = tf.reshape(larger_grid, [-1, codings_size])\n",
"images = variational_decoder(interpolated_codings).numpy()\n",
"\n",
"plt.figure(figsize=(7, 5))\n",
"for index, image in enumerate(images):\n",
"    plt.subplot(5, 7, index + 1)\n",
"    # draw a frame around the cells at even rows and columns, hide the axes elsewhere\n",
"    if index % 7 % 2 == 0 and index // 7 % 2 == 0:\n",
"        plt.gca().get_xaxis().set_visible(False)\n",
"        plt.gca().get_yaxis().set_visible(False)\n",
"    else:\n",
"        plt.axis(\"off\")\n",
"    plt.imshow(image, cmap=\"binary\")\n",
"save_fig(\"semantic_interpolation_plot\", tight_layout=False)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Generative Adversarial Networks"
]
},
{
"cell_type": "code",
"execution_count": 54,
"metadata": {},
"outputs": [],
"source": [
"np.random.seed(42)\n",
"tf.random.set_seed(42)\n",
"\n",
"codings_size = 30\n",
"\n",
"generator = keras.models.Sequential([\n",
"    keras.layers.Dense(100, activation=\"selu\", input_shape=[codings_size]),\n",
"    keras.layers.Dense(150, activation=\"selu\"),\n",
"    keras.layers.Dense(28 * 28, activation=\"sigmoid\"),\n",
"    keras.layers.Reshape([28, 28])\n",
"])\n",
"discriminator = keras.models.Sequential([\n",
"    keras.layers.Flatten(input_shape=[28, 28]),\n",
"    keras.layers.Dense(150, activation=\"selu\"),\n",
"    keras.layers.Dense(100, activation=\"selu\"),\n",
"    keras.layers.Dense(1, activation=\"sigmoid\")\n",
"])\n",
"gan = keras.models.Sequential([generator, discriminator])"
]
},
{
"cell_type": "code",
"execution_count": 55,
"metadata": {},
"outputs": [],
"source": [
"discriminator.compile(loss=\"binary_crossentropy\", optimizer=\"rmsprop\")\n",
"discriminator.trainable = False\n",
"gan.compile(loss=\"binary_crossentropy\", optimizer=\"rmsprop\")"
]
},
{
"cell_type": "code",
"execution_count": 56,
"metadata": {},
"outputs": [],
"source": [
"batch_size = 32\n",
"dataset = tf.data.Dataset.from_tensor_slices(X_train).shuffle(1000)\n",
"dataset = dataset.batch(batch_size, drop_remainder=True).prefetch(1)"
]
},
{
"cell_type": "code",
"execution_count": 57,
"metadata": {},
"outputs": [],
"source": [
"def train_gan(gan, dataset, batch_size, codings_size, n_epochs=50):\n",
"    generator, discriminator = gan.layers\n",
"    for epoch in range(n_epochs):\n",
"        print(\"Epoch {}/{}\".format(epoch + 1, n_epochs))  # not shown in the book\n",
"        for X_batch in dataset:\n",
"            # phase 1 - training the discriminator\n",
"            noise = tf.random.normal(shape=[batch_size, codings_size])\n",
"            generated_images = generator(noise)\n",
"            X_fake_and_real = tf.concat([generated_images, X_batch], axis=0)\n",
"            y1 = tf.constant([[0.]] * batch_size + [[1.]] * batch_size)\n",
"            discriminator.trainable = True\n",
"            discriminator.train_on_batch(X_fake_and_real, y1)\n",
"            # phase 2 - training the generator\n",
"            noise = tf.random.normal(shape=[batch_size, codings_size])\n",
"            y2 = tf.constant([[1.]] * batch_size)\n",
"            discriminator.trainable = False\n",
"            gan.train_on_batch(noise, y2)\n",
"        plot_multiple_images(generated_images, 8)  # not shown\n",
"        plt.show()  # not shown"
]
},
{
"cell_type": "code",
"execution_count": 58,
"metadata": {},
"outputs": [],
"source": [
"train_gan(gan, dataset, batch_size, codings_size, n_epochs=1)"
]
},
{
"cell_type": "code",
"execution_count": 59,
"metadata": {},
"outputs": [],
"source": [
"tf.random.set_seed(42)\n",
"np.random.seed(42)\n",
"\n",
"noise = tf.random.normal(shape=[batch_size, codings_size])\n",
"generated_images = generator(noise)\n",
"plot_multiple_images(generated_images, 8)\n",
"save_fig(\"gan_generated_images_plot\", tight_layout=False)"
]
},
{
"cell_type": "code",
"execution_count": 60,
"metadata": {},
"outputs": [],
"source": [
"train_gan(gan, dataset, batch_size, codings_size)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Deep Convolutional GAN"
]
},
{
"cell_type": "code",
"execution_count": 61,
"metadata": {},
"outputs": [],
"source": [
"tf.random.set_seed(42)\n",
"np.random.seed(42)\n",
"\n",
"codings_size = 100\n",
"\n",
"generator = keras.models.Sequential([\n",
"    keras.layers.Dense(7 * 7 * 128, input_shape=[codings_size]),\n",
"    keras.layers.Reshape([7, 7, 128]),\n",
"    keras.layers.BatchNormalization(),\n",
"    keras.layers.Conv2DTranspose(64, kernel_size=5, strides=2, padding=\"SAME\",\n",
"                                 activation=\"selu\"),\n",
"    keras.layers.BatchNormalization(),\n",
"    keras.layers.Conv2DTranspose(1, kernel_size=5, strides=2, padding=\"SAME\",\n",
"                                 activation=\"tanh\"),\n",
"])\n",
"discriminator = keras.models.Sequential([\n",
"    keras.layers.Conv2D(64, kernel_size=5, strides=2, padding=\"SAME\",\n",
"                        activation=keras.layers.LeakyReLU(0.2),\n",
"                        input_shape=[28, 28, 1]),\n",
"    keras.layers.Dropout(0.4),\n",
"    keras.layers.Conv2D(128, kernel_size=5, strides=2, padding=\"SAME\",\n",
"                        activation=keras.layers.LeakyReLU(0.2)),\n",
"    keras.layers.Dropout(0.4),\n",
"    keras.layers.Flatten(),\n",
"    keras.layers.Dense(1, activation=\"sigmoid\")\n",
"])\n",
"gan = keras.models.Sequential([generator, discriminator])"
]
},
{
"cell_type": "code",
"execution_count": 62,
"metadata": {},
"outputs": [],
"source": [
"discriminator.compile(loss=\"binary_crossentropy\", optimizer=\"rmsprop\")\n",
"discriminator.trainable = False\n",
"gan.compile(loss=\"binary_crossentropy\", optimizer=\"rmsprop\")"
]
},
{
"cell_type": "code",
"execution_count": 63,
"metadata": {},
"outputs": [],
"source": [
"X_train_dcgan = X_train.reshape(-1, 28, 28, 1) * 2. - 1. # reshape and rescale"
]
},
{
"cell_type": "code",
"execution_count": 64,
"metadata": {},
"outputs": [],
"source": [
"batch_size = 32\n",
"dataset = tf.data.Dataset.from_tensor_slices(X_train_dcgan)\n",
"dataset = dataset.shuffle(1000)\n",
"dataset = dataset.batch(batch_size, drop_remainder=True).prefetch(1)"
]
},
{
"cell_type": "code",
"execution_count": 65,
"metadata": {},
"outputs": [],
"source": [
"train_gan(gan, dataset, batch_size, codings_size)"
]
},
{
"cell_type": "code",
"execution_count": 66,
"metadata": {},
"outputs": [],
"source": [
"tf.random.set_seed(42)\n",
"np.random.seed(42)\n",
"\n",
"noise = tf.random.normal(shape=[batch_size, codings_size])\n",
"generated_images = generator(noise)\n",
"plot_multiple_images(generated_images, 8)\n",
"save_fig(\"dcgan_generated_images_plot\", tight_layout=False)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Exercise Solutions"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Unsupervised pretraining"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Let's create a small neural network for Fashion MNIST classification:"
]
},
{
"cell_type": "code",
"execution_count": 67,
"metadata": {},
"outputs": [],
"source": [
"tf.random.set_seed(42)\n",
"np.random.seed(42)\n",
"\n",
"X_train_small = X_train[:500]\n",
"y_train_small = y_train[:500]\n",
"\n",
"classifier = keras.models.Sequential([\n",
"    keras.layers.Reshape([28, 28, 1], input_shape=[28, 28]),\n",
"    keras.layers.Conv2D(16, kernel_size=3, padding=\"SAME\", activation=\"selu\"),\n",
"    keras.layers.MaxPool2D(pool_size=2),\n",
"    keras.layers.Conv2D(32, kernel_size=3, padding=\"SAME\", activation=\"selu\"),\n",
"    keras.layers.MaxPool2D(pool_size=2),\n",
"    keras.layers.Conv2D(64, kernel_size=3, padding=\"SAME\", activation=\"selu\"),\n",
"    keras.layers.MaxPool2D(pool_size=2),\n",
"    keras.layers.Flatten(),\n",
"    keras.layers.Dense(20, activation=\"selu\"),\n",
"    keras.layers.Dense(10, activation=\"softmax\")\n",
"])\n",
"classifier.compile(loss=\"sparse_categorical_crossentropy\", optimizer=keras.optimizers.SGD(learning_rate=0.02),\n",
"                   metrics=[\"accuracy\"])\n",
"history = classifier.fit(X_train_small, y_train_small, epochs=20, validation_data=(X_valid, y_valid))"
]
},
{
"cell_type": "code",
"execution_count": 68,
"metadata": {},
"outputs": [],
"source": [
"import pandas as pd\n",
"pd.DataFrame(history.history).plot()\n",
"plt.show()"
]
},
{
"cell_type": "code",
"execution_count": 69,
"metadata": {},
"outputs": [],
"source": [
"tf.random.set_seed(42)\n",
"np.random.seed(42)\n",
"\n",
"conv_encoder_clone = keras.models.clone_model(conv_encoder)\n",
"# clone_model() creates freshly initialized weights, so copy the pretrained ones\n",
"conv_encoder_clone.set_weights(conv_encoder.get_weights())\n",
"\n",
"pretrained_clf = keras.models.Sequential([\n",
"    conv_encoder_clone,\n",
"    keras.layers.Flatten(),\n",
"    keras.layers.Dense(20, activation=\"selu\"),\n",
"    keras.layers.Dense(10, activation=\"softmax\")\n",
"])"
]
},
{
"cell_type": "code",
"execution_count": 70,
"metadata": {},
"outputs": [],
"source": [
"conv_encoder_clone.trainable = False\n",
"pretrained_clf.compile(loss=\"sparse_categorical_crossentropy\",\n",
"                       optimizer=keras.optimizers.SGD(learning_rate=0.02),\n",
"                       metrics=[\"accuracy\"])\n",
"history = pretrained_clf.fit(X_train_small, y_train_small, epochs=30,\n",
"                             validation_data=(X_valid, y_valid))"
]
},
{
"cell_type": "code",
"execution_count": 71,
"metadata": {
"scrolled": true
},
"outputs": [],
"source": [
"conv_encoder_clone.trainable = True\n",
"pretrained_clf.compile(loss=\"sparse_categorical_crossentropy\",\n",
"                       optimizer=keras.optimizers.SGD(learning_rate=0.02),\n",
"                       metrics=[\"accuracy\"])\n",
"history = pretrained_clf.fit(X_train_small, y_train_small, epochs=20,\n",
"                             validation_data=(X_valid, y_valid))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Hashing Using a Binary Autoencoder"
]
},
{
"cell_type": "code",
"execution_count": 72,
"metadata": {},
"outputs": [],
"source": [
"tf.random.set_seed(42)\n",
"np.random.seed(42)\n",
"\n",
"hashing_encoder = keras.models.Sequential([\n",
"    keras.layers.Flatten(input_shape=[28, 28]),\n",
"    keras.layers.Dense(100, activation=\"selu\"),\n",
"    # heavy noise pushes the previous layer to output large values,\n",
"    # so the sigmoid outputs below end up close to 0 or 1\n",
"    keras.layers.GaussianNoise(15.),\n",
"    keras.layers.Dense(16, activation=\"sigmoid\"),\n",
"])\n",
"hashing_decoder = keras.models.Sequential([\n",
"    keras.layers.Dense(100, activation=\"selu\", input_shape=[16]),\n",
"    keras.layers.Dense(28 * 28, activation=\"sigmoid\"),\n",
"    keras.layers.Reshape([28, 28])\n",
"])\n",
"hashing_ae = keras.models.Sequential([hashing_encoder, hashing_decoder])\n",
"hashing_ae.compile(loss=\"binary_crossentropy\", optimizer=keras.optimizers.SGD(learning_rate=1.0),\n",
"                   metrics=[rounded_accuracy])\n",
"history = hashing_ae.fit(X_train, X_train, epochs=10,\n",
"                         validation_data=(X_valid, X_valid))"
]
},
{
"cell_type": "code",
"execution_count": 73,
"metadata": {},
"outputs": [],
"source": [
"show_reconstructions(hashing_ae)\n",
"plt.show()"
]
},
{
"cell_type": "code",
"execution_count": 74,
"metadata": {},
"outputs": [],
"source": [
"plot_activations_histogram(hashing_encoder)\n",
"plt.show()"
]
},
{
"cell_type": "code",
"execution_count": 75,
"metadata": {},
"outputs": [],
"source": [
"# round each coding to 0 or 1\n",
"hashes = np.round(hashing_encoder.predict(X_valid)).astype(np.int32)\n",
"# weight bit i by 2**i, then sum to pack the 16 bits into a single integer\n",
"hashes *= np.array([[2**bit for bit in range(16)]])\n",
"hashes = hashes.sum(axis=1)\n",
"for h in hashes[:5]:\n",
"    print(\"{:016b}\".format(h))\n",
"print(\"...\")"
]
},
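{
"cell_type": "markdown",
"metadata": {},
"source": [
"The cell above rounds each 16-dimensional coding to bits and packs them into a single integer, with bit `i` weighted by `2**i`. Here is a tiny standalone sketch of that packing step on made-up 4-bit codings (not part of the book's code; the array `demo_codings` is invented for illustration). Note that the printed binary string starts with the most significant bit, so it reads in the reverse order of the coding vector:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"\n",
"demo_codings = np.array([[0.9, 0.1, 0.8, 0.2],\n",
"                         [0.2, 0.7, 0.1, 0.6]])\n",
"demo_bits = np.round(demo_codings).astype(np.int32)       # one row of 0/1 bits per coding\n",
"demo_hashes = (demo_bits * 2**np.arange(4)).sum(axis=1)   # bit i has weight 2**i\n",
"for h in demo_hashes:\n",
"    print('{:04b}'.format(int(h)))"
]
},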
{
"cell_type": "code",
"execution_count": 76,
"metadata": {
"scrolled": true
},
"outputs": [],
"source": [
"n_bits = 4\n",
"n_images = 8\n",
"plt.figure(figsize=(n_images, n_bits))\n",
"for bit_index in range(n_bits):\n",
"    # select the images whose hash has this bit set\n",
"    in_bucket = (hashes & 2**bit_index != 0)\n",
"    for index, image in zip(range(n_images), X_valid[in_bucket]):\n",
"        plt.subplot(n_bits, n_images, bit_index * n_images + index + 1)\n",
"        plt.imshow(image, cmap=\"binary\")\n",
"        plt.axis(\"off\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Exercise Solutions"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 1. to 8.\n",
"\n",
"See Appendix A."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 9.\n",
"_Exercise: Try using a denoising autoencoder to pretrain an image classifier. You can use MNIST (the simplest option), or a more complex image dataset such as [CIFAR10](https://homl.info/122) if you want a bigger challenge. Regardless of the dataset you're using, follow these steps:_\n",
"* Split the dataset into a training set and a test set. Train a deep denoising autoencoder on the full training set.\n",
"* Check that the images are fairly well reconstructed. Visualize the images that most activate each neuron in the coding layer.\n",
"* Build a classification DNN, reusing the lower layers of the autoencoder. Train it using only 500 images from the training set. Does it perform better with or without pretraining?"
]
},
{
"cell_type": "code",
"execution_count": 77,
"metadata": {},
"outputs": [],
"source": [
"[X_train, y_train], [X_test, y_test] = keras.datasets.cifar10.load_data()\n",
"X_train = X_train / 255\n",
"X_test = X_test / 255"
]
},
{
"cell_type": "code",
"execution_count": 78,
"metadata": {},
"outputs": [],
"source": [
"tf.random.set_seed(42)\n",
"np.random.seed(42)\n",
"\n",
"denoising_encoder = keras.models.Sequential([\n",
"    keras.layers.GaussianNoise(0.1, input_shape=[32, 32, 3]),\n",
"    keras.layers.Conv2D(32, kernel_size=3, padding=\"same\", activation=\"relu\"),\n",
"    keras.layers.MaxPool2D(),\n",
"    keras.layers.Flatten(),\n",
"    keras.layers.Dense(512, activation=\"relu\"),\n",
"])"
]
},
{
"cell_type": "code",
"execution_count": 79,
"metadata": {},
"outputs": [],
"source": [
"denoising_encoder.summary()"
]
},
{
"cell_type": "code",
"execution_count": 80,
"metadata": {},
"outputs": [],
"source": [
"denoising_decoder = keras.models.Sequential([\n",
"    keras.layers.Dense(16 * 16 * 32, activation=\"relu\", input_shape=[512]),\n",
"    keras.layers.Reshape([16, 16, 32]),\n",
"    keras.layers.Conv2DTranspose(filters=3, kernel_size=3, strides=2,\n",
"                                 padding=\"same\", activation=\"sigmoid\")\n",
"])"
]
},
{
"cell_type": "code",
"execution_count": 81,
"metadata": {},
"outputs": [],
"source": [
"denoising_decoder.summary()"
]
},
{
"cell_type": "code",
"execution_count": 82,
"metadata": {},
"outputs": [],
"source": [
"denoising_ae = keras.models.Sequential([denoising_encoder, denoising_decoder])\n",
"denoising_ae.compile(loss=\"binary_crossentropy\", optimizer=keras.optimizers.Nadam(),\n",
"                     metrics=[\"mse\"])\n",
"history = denoising_ae.fit(X_train, X_train, epochs=10,\n",
"                           validation_data=(X_test, X_test))"
]
},
{
"cell_type": "code",
"execution_count": 83,
"metadata": {},
"outputs": [],
"source": [
"n_images = 5\n",
"new_images = X_test[:n_images]\n",
"new_images_noisy = new_images + np.random.randn(n_images, 32, 32, 3) * 0.1\n",
"new_images_denoised = denoising_ae.predict(new_images_noisy)\n",
"\n",
"plt.figure(figsize=(6, n_images * 2))\n",
"for index in range(n_images):\n",
"    plt.subplot(n_images, 3, index * 3 + 1)\n",
"    plt.imshow(new_images[index])\n",
"    plt.axis('off')\n",
"    if index == 0:\n",
"        plt.title(\"Original\")\n",
"    plt.subplot(n_images, 3, index * 3 + 2)\n",
"    plt.imshow(np.clip(new_images_noisy[index], 0., 1.))\n",
"    plt.axis('off')\n",
"    if index == 0:\n",
"        plt.title(\"Noisy\")\n",
"    plt.subplot(n_images, 3, index * 3 + 3)\n",
"    plt.imshow(new_images_denoised[index])\n",
"    plt.axis('off')\n",
"    if index == 0:\n",
"        plt.title(\"Denoised\")\n",
"plt.show()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 10.\n",
"_Exercise: Train a variational autoencoder on the image dataset of your choice, and use it to generate images. Alternatively, you can try to find an unlabeled dataset that you are interested in and see if you can generate new samples._\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 11.\n",
"_Exercise: Train a DCGAN to tackle the image dataset of your choice, and use it to generate images. Add experience replay and see if this helps. Turn it into a conditional GAN where you can control the generated class._\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.9"
},
"nav_menu": {
"height": "381px",
"width": "453px"
},
"toc": {
"navigate_menu": true,
"number_sections": true,
"sideBar": true,
"threshold": 6,
"toc_cell": false,
"toc_section_display": "block",
"toc_window_display": false
}
},
"nbformat": 4,
"nbformat_minor": 4
}