diff --git a/16_reinforcement_learning.ipynb b/16_reinforcement_learning.ipynb
index 9459440..15c258e 100644
--- a/16_reinforcement_learning.ipynb
+++ b/16_reinforcement_learning.ipynb
@@ -1410,7 +1410,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "Note: the `preprocess_observation()` function is slightly different from the one in the book: instead of representing pixels as 64-bit floats from -1.0 to 1.0, it represents them as 8-bit integers from -128 to 127. The benefit is that the replay memory will take up about 6.5 GB of RAM instead of 52 GB. The reduced precision has no impact on training."
+    "Note: the `preprocess_observation()` function is slightly different from the one in the book: instead of representing pixels as 64-bit floats from -1.0 to 1.0, it represents them as signed bytes (from -128 to 127). The benefit is that the replay memory will take up roughly 8 times less RAM (about 6.5 GB instead of 52 GB). The reduced precision has no visible impact on training."
   ]
  },
  {
@@ -1475,7 +1475,7 @@
    "initializer = tf.contrib.layers.variance_scaling_initializer()\n",
    "\n",
    "def q_network(X_state, name):\n",
-    "    prev_layer = X_state\n",
+    "    prev_layer = X_state / 128.0 # scale pixel intensities to the [-1.0, 1.0] range.\n",
    "    with tf.variable_scope(name) as scope:\n",
    "        for n_maps, kernel_size, strides, padding, activation in zip(\n",
    "            conv_n_maps, conv_kernel_sizes, conv_strides,\n",
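
For context, here is a minimal sketch of what a signed-byte `preprocess_observation()` could look like. The diff does not show the function body, so the crop/downsample indices and the 88x80 output shape below are illustrative assumptions, not the notebook's exact code; the only grounded parts are the function name, the int8 storage, and the `/ 128.0` rescaling inside `q_network()`.

```python
import numpy as np

def preprocess_observation(obs):
    # obs: an Atari RGB frame, e.g. shape (210, 160, 3), uint8 values 0-255 (assumed)
    img = obs[1:176:2, ::2]            # crop the play area and downsample (assumed indices)
    img = img.mean(axis=2)             # convert to greyscale, float values in 0-255
    img = (img - 128).astype(np.int8)  # shift to -128..127 and store as signed bytes
    return img.reshape(88, 80, 1)      # assumed 88x80x1 output shape

# The Q-network then rescales the stored bytes back to floats, as in the diff:
#     prev_layer = X_state / 128.0  # values roughly in the [-1.0, 1.0] range
```

The 8x figure in the note follows directly from the dtype change: each pixel goes from 8 bytes (float64) to 1 byte (int8), so the same replay memory shrinks by a factor of 8 (about 6.5 GB instead of 52 GB), with the exact totals depending on the replay memory size and frame dimensions used in the notebook.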