Commit Graph

22 Commits (0e164dc5824b48f3806c2a27589943732ec04d03)

Author SHA1 Message Date
Aurélien Geron 90e53af92c Install gym[box2d] on Colab for LunarLander-v2 policy gradients exercise solution 2021-03-20 10:46:02 +13:00
Aurélien Geron 9af016e341 Remove redundant heading for LunarLander-v2 policy gradients exercise solution 2021-03-20 10:40:02 +13:00
Aurélien Geron cfd0837f5c Add LunarLander-v2 Policy Gradients exercise solution 2021-03-20 10:04:52 +13:00
Aurélien Geron e9b5dce122 Fix auto-fire, add exercises, explain Space Invaders delta 2021-03-18 22:16:38 +13:00
Aurélien Geron c98ee19363 Fix AtariPreprocessingWithAutoFire typo 2021-03-10 10:45:24 +13:00
Aurélien Geron dd94101c5d Speed up training: I tuned learning rate for DQN variants, and added auto-FIRE for Blockout. Fixes #117 2021-03-09 22:21:08 +13:00
Aurélien Geron 5c8843a53b Merge pull request #290 from 8bitmp3/patch-1: Update (small) the reinforcement learning chapter 2021-03-02 12:11:47 +13:00
B D 64f0e05a94 Minor change on greedy policy variable usage: Chap 18, why not using directly the 'n_outputs' variable defined earlier, instead of hardcoded '2' 2021-02-28 12:02:23 +01:00
Aurélien Geron 749817ccfa Update libraries to latest version, including TensorFlow 2.4.1 and Scikit-Learn 0.24.1 2021-02-18 11:59:02 +13:00
Aurélien Geron 8ebdcffc6b Work around TF Agents issue: env.step(1) => env.step(np.array(1)) 2020-11-23 16:52:37 +13:00
8bitmp3 80f6cb27c0 Update (small) the reinforcement learning chapter 2020-10-17 15:04:51 +01:00
Aurélien Geron 7b3d280a86 Fix error in commented out code, fixes #89 2020-03-31 21:39:51 +13:00
Aurélien Geron cd4e2e1313 Add comment about the reshape operation from the training_step function 2020-03-12 22:51:36 +13:00
Aurélien Geron 49715d4b74 Fix bug in training_step: target_Q_values must be a column vector 2020-03-12 22:47:22 +13:00
Aurélien Geron d6fbc91cf2 Upgrade packages, and add environment-windows.yml 2019-12-14 18:58:01 +08:00
Aurélien Geron 88dccccd5f Make notebooks 14 to 19 runnable in Colab without changes 2019-11-06 21:06:55 +08:00
Aurélien Geron d8971f1767 Fix os.join() => os.path.join() 2019-10-12 18:05:41 +09:30
Aurélien Geron 4c3b7b9b06 Save agent's breakout performance to an animated gif 2019-05-28 09:30:16 +08:00
Aurélien Geron 3ef350ab4c Fix figure name and clarify a couple code examples 2019-05-27 20:35:00 +08:00
Aurélien Geron c5f4b41cf5 Fix breakout plot 2019-05-26 23:56:49 +08:00
Aurélien Geron 2edbb6e9d4 Add Reinforcement Learning notebook 2019-05-26 23:30:39 +08:00
Aurélien Geron 73a36f335f Add warning about TF issue regarding DenseFeatures and the Functional API, fixes #6 2019-05-15 20:23:24 +08:00