Update installation instructions to recommend Anaconda and point to Docker

2019-11-20 18:42:08 +08:00 · 2019-11-20 18:42:08 +08:00 · a4f41dd5fd
parent 00460cb555
commit a4f41dd5fd
2 changed files with 75 additions and 76 deletions
--- a/INSTALL.md
+++ b/INSTALL.md
@@ -1,5 +1,7 @@
 # Installation
-To install this repository and run the Jupyter notebooks on your machine, you will first need git, which you probably have already. If not, you can download it from [git-scm.com](https://git-scm.com/).
+
+## Download this repository
+To install this repository and run the Jupyter notebooks on your machine, you will first need git, which you probably already have. Open a terminal and type `git` to check. If you do not have git, you can download it from [git-scm.com](https://git-scm.com/).

 Next, clone this repository by opening a terminal and typing the following commands:

@@ -9,68 +11,79 @@ Next, clone this repository by opening a terminal and typing the following comma

 If you do not want to install git, you can instead download [master.zip](https://github.com/ageron/handson-ml2/archive/master.zip), unzip it, rename the resulting directory to `handson-ml2` and move it to your development directory.

-If you want to go through chapter 16 on Reinforcement Learning, you will need to [install OpenAI gym](https://gym.openai.com/docs) and its dependencies for Atari simulations.
+## Python 3 and the required libraries
+Next, you will need Python 3.6+ and a bunch of Python libraries. The simplest way to install these is to use Anaconda, which is a great cross-platform Python distribution for scientific computing. It comes bundled with many scientific libraries, including NumPy, Pandas, Matplotlib, Scikit-Learn and many more, so it's quite a large installation. If you choose to [download and install Anaconda](https://www.anaconda.com/distribution/), just make sure to install the Python 3 version. If you prefer a lighter-weight distribution, you can [install Miniconda](https://docs.conda.io/en/latest/miniconda.html), which contains the bare minimum to run the `conda` packaging tool.

-If you have a TensorFlow-compatible GPU card (NVidia card with Compute Capability ≥ 3.5), and you want TensorFlow to use it, then you should follow TensorFlow's [GPU installation instructions](https://tensorflow.org/install/gpu) to install the driver and libraries such as CUDA and CuDNN. Note that the installation instructions are still for TF 1.12, not TF 2.0, so you need to install CUDA 10.0 (not 9.2) with the corresponding NVidia driver (see NVidia's website for details) and CuDNN SDK 7.4 (not 7.2). Also edit `requirements.txt` to replace `tf-nightly-2.0-preview` with `tf-nightly-gpu-2.0-preview`.
+Once Anaconda or Miniconda is installed, open a terminal and run the following command. It will create a new `conda` environment containing every library you will need (by default, the environment will be named `tf2`, but you can choose another name using the `-n` option):

-If you are familiar with Python and you know how to install Python libraries, go ahead and install the libraries listed in `requirements.txt` and jump to the [Starting Jupyter](#starting-jupyter) section. If you need detailed instructions, please read on.
+    $ conda env create -f environment.yml

-## Python & Required Libraries
-Of course, you obviously need Python. Python 2 is already preinstalled on most systems nowadays, and sometimes even Python 3. You can check which version(s) you have by typing the following commands:
+Next, activate the new environment:

-    $ python --version   # for Python 2
-    $ python3 --version  # for Python 3
+    $ conda activate tf2

-Right now, only Python 3.6 is supported (TensorFlow support for Python 3.7 is [coming soon](https://github.com/tensorflow/tensorflow/issues/20517)). If you don't have Python 3, I strongly recommend installing it (Python ≥2.7 may work with minor adjustments, but it is deprecated so Python 3 is preferable). To do so, you have several options: on Windows or MacOSX, you can just download it from [python.org](https://www.python.org/downloads/). On MacOSX, you can alternatively use [MacPorts](https://www.macports.org/) or [Homebrew](https://brew.sh/). If you are using Python 3.6 on MacOSX, you need to run the following command to install the `certifi` package of certificates because Python 3.6 on MacOSX has no certificates to validate SSL connections (see this [StackOverflow question](https://stackoverflow.com/questions/27835619/urllib-and-ssl-certificate-verify-failed-error)):
+> **Note**: if you don't like Anaconda for some reason, then you can install Python 3 and the required libraries manually (this is not recommended, unless you know what you are doing). For this, go through the [Manual Python Installation](#manual-python-installation) section, then come back here and continue with the next sections.

-    $ /Applications/Python\ 3.6/Install\ Certificates.command
+## Using a GPU
+If you have a TensorFlow-compatible GPU card (NVidia card with Compute Capability ≥ 3.5), and you want TensorFlow to use it, then you should follow TensorFlow's [GPU installation instructions](https://tensorflow.org/install/gpu) to install the driver and libraries such as CUDA and CuDNN.

-On Linux, unless you know what you are doing, you should use your system's packaging system. For example, on Debian or Ubuntu, type:
+Also make sure to replace the `tensorflow` library with the `tensorflow-gpu` library (this will no longer be needed starting in TensorFlow 2.1):
+
+    $ pip uninstall tensorflow
+    $ pip install -U tensorflow-gpu
+
+## Reinforcement Learning Chapter Requirements
+If you want to go through chapter 18 on Reinforcement Learning, you will need to [install OpenAI gym](https://gym.openai.com/docs) and its dependencies for Atari simulations.
+
+## Starting Jupyter
+You're almost there! You just need to register the `tf2` conda environment with Jupyter. The notebooks in this project will default to the environment named `python3`, so it's best to register this environment using the name `python3` (if you prefer to use another name, you will have to select it in the "Kernel > Change kernel..." menu in Jupyter every time you open a notebook):
+
+    $ conda activate tf2
+    $ python3 -m ipykernel install --user --name=python3
+
+And that's it! You can now start Jupyter like this:
+
+    $ jupyter notebook
+
+This should open up your browser, and you should see Jupyter's tree view, with the contents of the current directory. If your browser does not open automatically, visit [localhost:8888](http://localhost:8888/tree). Click on `index.ipynb` to get started.
+
+Congrats! You are ready to learn Machine Learning, hands on!
+
+When you're done with Jupyter, you can close it by pressing Ctrl-C in the terminal window where you started it. Every time you want to work on this project, you will need to open a terminal and run:
+
+    $ cd $HOME # or whatever development directory you chose earlier
+    $ cd handson-ml2
+    $ conda activate tf2
+    $ jupyter notebook
+
+## Manual Python Installation
+**Not recommended**: use Anaconda or Miniconda instead, unless you know what you're doing.
+
+First, you will need Python 3.6 or 3.7. Some systems have it preinstalled. You can check by running the following command:
+
+    $ python3 --version
+
+If you have Python 3.6 or 3.7 already installed, that's great. If not, on Windows or MacOSX, you can just download it from [python.org](https://www.python.org/downloads/). On MacOSX, you can alternatively use [MacPorts](https://www.macports.org/) or [Homebrew](https://brew.sh/). If you are using Python 3.6+ on MacOSX, you need to run the following command to install the `certifi` package of certificates because Python 3.6+ on MacOSX has no certificates to validate SSL connections (see this [StackOverflow question](https://stackoverflow.com/questions/27835619/urllib-and-ssl-certificate-verify-failed-error)):
+
+    $ /Applications/Python\ 3.6/Install\ Certificates.command # or Python 3.7
+
+On Linux, unless you know what you are doing, you should probably use your system's packaging system. For example, on Debian or Ubuntu, type:

    $ sudo apt-get update
-    $ sudo apt-get install python3
+    $ sudo apt-get install python3 python3-pip

-Another option is to download and install [Anaconda](https://www.continuum.io/downloads). This is a package that includes both Python and many scientific libraries. You should prefer the Python 3 version.
+Next, you will need to install the Python libraries that are necessary for this project, in particular NumPy, Matplotlib, Pandas, Jupyter and TensorFlow (and a few others). For this, you can either use Python's integrated packaging system, pip, or you may prefer to use your system's own packaging system (if available, e.g. on Linux, or on MacOSX when using MacPorts or Homebrew). The advantage of using pip is that it is easy to create multiple isolated Python environments with different libraries and different library versions (e.g. one environment for each project). The advantage of using your system's packaging system is that there is less risk of having conflicts between your Python libraries and your system's other packages.

-If you choose to use Anaconda, read the next section, or else jump to the [Using pip](#using-pip) section.
+In this section, we will look at how to use pip. First, make sure you have the latest version of pip installed:

-## Using Anaconda
-
-**Warning**: this section does not work yet, since TensorFlow 2.0 is not yet available Anaconda repositories.
-
-When using Anaconda, you can optionally create an isolated Python environment dedicated to this project. This is recommended as it makes it possible to have a different environment for each project (e.g. one for this project), with potentially different libraries and library versions:
-
-    $ conda create -n mlbook python=3.6 anaconda
-    $ conda activate mlbook
-
-This creates a fresh Python 3.6 environment called `mlbook` (you can change the name if you want to), and it activates it. This environment contains all the scientific libraries that come with Anaconda. This includes all the libraries we will need (NumPy, Matplotlib, Pandas, Jupyter and a few others), except for TensorFlow, so let's install it:
-
-    $ conda install -n mlbook -c conda-forge tensorflow
-
-This installs the latest version of TensorFlow available for Anaconda (which is usually *not* the latest TensorFlow version) in the `mlbook` environment (fetching it from the `conda-forge` repository). If you chose not to create an `mlbook` environment, then just remove the `-n mlbook` option.
-
-Next, you can optionally install Jupyter extensions. These are useful to have nice tables of contents in the notebooks, but they are not required.
-
-    $ conda install -n mlbook -c conda-forge jupyter_contrib_nbextensions
-
-You are all set! Next, jump to the [Starting Jupyter](#starting-jupyter) section.
-
-## Using pip 
-
-If you are not using Anaconda, you need to install several scientific Python libraries that are necessary for this project, in particular NumPy, Matplotlib, Pandas, Jupyter and TensorFlow (and a few others). For this, you can either use Python's integrated packaging system, pip, or you may prefer to use your system's own packaging system (if available, e.g. on Linux, or on MacOSX when using MacPorts or Homebrew). The advantage of using pip is that it is easy to create multiple isolated Python environments with different libraries and different library versions (e.g. one environment for each project). The advantage of using your system's packaging system is that there is less risk of having conflicts between your Python libraries and your system's other packages. Since I have many projects with different library requirements, I prefer to use pip with isolated environments. Moreover, the pip packages are usually the most recent ones available, while Anaconda and system packages often lag behind a bit.
-
-These are the commands you need to type in a terminal if you want to use pip to install the required libraries. Note: in all the following commands, if you chose to use Python 2 rather than Python 3, you must replace `pip3` with `pip`, and `python3` with `python`.
-
-First you need to make sure you have the latest version of pip installed:
-
-    $ python3 -m pip install --user --upgrade pip setuptools
+    $ python3 -m pip install --user -U pip setuptools

 The `--user` option will install the latest version of pip only for the current user. If you prefer to install it system wide (i.e. for all users), you must have administrator rights (e.g. use `sudo python3 -m pip` instead of `python3 -m pip` on Linux), and you should remove the `--user` option. The same is true of the command below that uses the `--user` option.

-Next, you can optionally create an isolated environment. This is recommended as it makes it possible to have a different environment for each project (e.g. one for this project), with potentially very different libraries, and different versions:
+Next, you can optionally create an isolated environment. This is recommended as it makes it possible to have a different environment for each project (e.g., one for this project), with potentially very different libraries, and different versions:

-    $ python3 -m pip install --user --upgrade virtualenv
-    $ virtualenv -p `which python3` env
+    $ python3 -m pip install --user -U virtualenv
+    $ python3 -m virtualenv -p `which python3` env

 This creates a new directory called `env` in the current directory, containing an isolated Python environment based on Python 3. If you installed multiple versions of Python 3 on your system, you can replace `` `which python3` `` with the path to the Python executable you prefer to use.

@@ -82,17 +95,8 @@ On Windows, the command is slightly different:

    $ .\env\Scripts\activate

-Next, use pip to install the required python packages. If you are not using virtualenv, you should add the `--user` option (alternatively you could install the libraries system-wide, but this will probably require administrator rights, e.g. using `sudo pip3` instead of `pip3` on Linux).
+Next, use pip to install the required Python packages. If you are not using virtualenv, you should add the `--user` option (alternatively, you could install the libraries system-wide, but this will probably require administrator rights, e.g. using `sudo python3` instead of `python3` on Linux).

-    $ python3 -m pip install --upgrade -r requirements.txt
+    $ python3 -m pip install -U -r requirements.txt

-Great! You're all set, you just need to start Jupyter now.
-
-## Starting Jupyter
-Okay! You can now start Jupyter, simply type:
-
-    $ jupyter notebook
-
-This should open up your browser, and you should see Jupyter's tree view, with the contents of the current directory. If your browser does not open automatically, visit [localhost:8888](http://localhost:8888/tree). Click on `index.ipynb` to get started!
-
-Congrats! You are ready to learn Machine Learning, hands on!
+Great! You now have Python 3.6 or 3.7 installed with all the required libraries. You can now resume the installation instructions starting at the [Using a GPU](#using-a-gpu) section above. However, you will need to replace the `pip` command with `python3 -m pip`, and replace `conda activate tf2` with `source ./env/bin/activate` (or `.\env\Scripts\activate` on Windows).
--- a/README.md
+++ b/README.md
@@ -10,12 +10,12 @@ python. It contains the example code and solutions to the exercises in the secon

 ## Quick Start

-### Want to play with these notebooks without having to install anything?
+### Want to play with these notebooks online without having to install anything?
 Use any of the following services.

-**WARNING**: Please be aware that these services provide temporary environments: anything you do will be deleted after a while, so make sure you save anything you care about.
+**WARNING**: Please be aware that these services provide temporary environments: anything you do will be deleted after a while, so make sure you download any data you care about.

-* Open this repository in [Colaboratory](https://colab.research.google.com/github/ageron/handson-ml2/blob/master/):
+* **Recommended**: open this repository in [Colaboratory](https://colab.research.google.com/github/ageron/handson-ml2/blob/master/):
 <a href="https://colab.research.google.com/github/ageron/handson-ml2/blob/master/"><img src="https://colab.research.google.com/img/colab_favicon.ico" width="90" /></a>

 * Or open it in [Binder](https://mybinder.org/v2/gh/ageron/handson-ml2/master):
@@ -28,32 +28,27 @@ Use any of the following services.

 ### Just want to quickly look at some notebooks, without executing any code?

-Browse this repository using [jupyter.org's notebook viewer](http://nbviewer.jupyter.org/github/ageron/handson-ml2/blob/master/index.ipynb):
-<a href="http://nbviewer.jupyter.org/github/ageron/handson-ml2/blob/master/index.ipynb"><img src="https://jupyter.org/assets/nav_logo.svg" width="150" /></a>
+Browse this repository using [jupyter.org's notebook viewer](https://nbviewer.jupyter.org/github/ageron/handson-ml2/blob/master/index.ipynb):
+<a href="https://nbviewer.jupyter.org/github/ageron/handson-ml2/blob/master/index.ipynb"><img src="https://jupyter.org/assets/nav_logo.svg" width="150" /></a>

-_Note_: [github.com's notebook viewer](https://github.com/ageron/handson-ml2/blob/master/index.ipynb) also works but it is slower and the math equations are not always displayed correctly.
+_Note_: [github.com's notebook viewer](index.ipynb) also works but it is slower and the math equations are not always displayed correctly.
+
+### Want to run this project using a Docker image?
+Read the [Docker instructions](https://github.com/ageron/handson-ml2/tree/master/docker).

 ### Want to install this project on your own machine?

-If you have a working Python 3.5+ environment and git is installed, then this project and its dependencies can be installed with pip. Open a terminal and run the following commands (do not type the `$` signs, they just indicate that this is a terminal command):
-
-    $ git clone https://github.com/ageron/handson-ml2.git
-    $ cd handson-ml2
-    $ python3 -m pip install --user --upgrade pip setuptools
-    $ # Read `requirements.txt` if you want to use a GPU.
-    $ python3 -m pip install --user --upgrade -r requirements.txt
-    $ jupyter notebook
-
-Or using Anaconda:
+If you have [Anaconda](https://www.anaconda.com/distribution/) (or [Miniconda](https://docs.conda.io/en/latest/miniconda.html)) and git installed, then this project and its dependencies can be installed quite simply. Open a terminal and run the following commands (do not type the leading `$` sign on each line; it just indicates that the line is a terminal command):

    $ git clone https://github.com/ageron/handson-ml2.git
    $ cd handson-ml2
    $ # Read `environment.yml` if you want to use a GPU.
    $ conda env create -f environment.yml
    $ conda activate tf2
+    $ python3 -m ipykernel install --user --name=python3
    $ jupyter notebook

-If you need more detailed installation instructions, read the [detailed installation instructions](INSTALL.md).
+If you prefer to install Python and the required libraries manually, or if you need further instructions, read the [detailed installation instructions](INSTALL.md).

 ## Contributors
 I would like to thank everyone who contributed to this project, either by providing useful feedback, filing issues or submitting Pull Requests. Special thanks go to Haesun Park who helped on some of the exercise solutions, and to Steven Bunkley and Ziembla who created the `docker` directory.
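
Whichever installation route is followed (conda or manual), it can help to confirm that the activated environment has a supported Python and the main libraries before starting Jupyter. The following is only a minimal sketch, not part of this commit: the module list is an assumption made for illustration (the authoritative lists are `environment.yml` and `requirements.txt`), and the helper names `python_is_supported` and `missing_modules` are hypothetical.

```python
import importlib.util
import sys

# Assumed module names, for illustration only; the authoritative list
# lives in environment.yml and requirements.txt.
REQUIRED_MODULES = ["numpy", "pandas", "matplotlib", "sklearn", "tensorflow"]


def python_is_supported(version_info=None, minimum=(3, 6)):
    """Return True if the (major, minor) version is at least `minimum`."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) >= tuple(minimum)


def missing_modules(names):
    """Return the names that cannot be located on the current import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    print("Python 3.6+:", python_is_supported())
    missing = missing_modules(REQUIRED_MODULES)
    print("Missing modules:", ", ".join(missing) if missing else "none")
```

Saved under a name of your choice (for example a hypothetical `check_env.py`), it would be run with `python3 check_env.py` after `conda activate tf2` or after activating the virtualenv. It only checks that the modules can be found, not that their versions match the ones the notebooks expect.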