Notebooks for the reading material
 
Aurélien Geron 3d3b610634 Remove use_cudnn_on_gpu=False in notebook for chapter 13 2017-09-15 16:52:54 +02:00
images Upgrade chapter 2 to sklearn 0.18 and ensure python 2 and python 3 both work 2016-11-03 23:47:11 +01:00
.gitignore Migrate to TensorFlow 0.11.0 2016-11-23 09:26:19 +01:00
01_the_machine_learning_landscape.ipynb removed quotes around "True" 2017-06-27 21:30:57 -04:00
02_end_to_end_machine_learning_project.ipynb Provide workaround and explanations about the breakage of LabelBinarizer by Scikit-Learn 0.19.0 2017-09-15 14:40:13 +02:00
03_classification.ipynb Fix multilabel typo 2017-07-07 21:56:30 +02:00
04_training_linear_models.ipynb Use 'np.random' rather than 'import numpy.random as rnd', and add random_state to make notebook's output constant 2017-06-06 15:16:46 +02:00
05_support_vector_machines.ipynb Use np.random.set_seed(42) and random_state=42 make notebook's output constant 2017-06-06 23:13:43 +02:00
06_decision_trees.ipynb Use 'np.random' rather than 'import numpy.random as rnd', and add random_state to make notebook's output constant 2017-06-06 15:15:20 +02:00
07_ensemble_learning_and_random_forests.ipynb Fix titles in figure 7-8 (learning rates should be 1 and 0.5) 2017-09-15 16:41:15 +02:00
08_dimensionality_reduction.ipynb Add exercise solutions for chapter 08 2017-06-26 00:09:23 +02:00
09_up_and_running_with_tensorflow.ipynb Use default seed=42 2017-06-06 22:44:01 +02:00
10_introduction_to_artificial_neural_networks.ipynb Use tf.set_random_seed(42) and more to make notebook's output constant 2017-06-06 23:12:21 +02:00
11_deep_learning.ipynb Fixes #56, bug in DNNClassifier for batch normalization 2017-07-13 11:13:37 +02:00
12_distributed_tensorflow.ipynb Use np.random.set_seed(42) and tf.set_random_seed(42) to make notebook's output constant, and simplify code in notebook 15 2017-06-07 17:52:59 +02:00
13_convolutional_neural_networks.ipynb Remove use_cudnn_on_gpu=False in notebook for chapter 13 2017-09-15 16:52:54 +02:00
14_recurrent_neural_networks.ipynb Replace rnd by np.random in notebook 14 2017-08-16 10:44:18 +02:00
15_autoencoders.ipynb Fix typo where n_hidden3 was used instead of n_outputs 2017-09-15 14:48:09 +02:00
16_reinforcement_learning.ipynb Use np.random.set_seed(42) and tf.set_random_seed(42) to make notebook's output constant 2017-06-08 15:44:00 +02:00
LICENSE First notebook added: matplotlib 2016-02-16 21:40:20 +01:00
README.md Fix typos and clarify some details in README.md 2017-04-30 10:39:48 +02:00
book_equations.ipynb Fix error in Equation 13-1 2017-06-26 16:06:40 +02:00
index.ipynb Add list of equations in the book 2017-06-26 12:14:57 +02:00
math_linear_algebra.ipynb changed order of dot product 2017-07-07 15:29:11 -04:00
requirements.txt Make gym optional 2017-02-17 14:52:28 +01:00
tools_matplotlib.ipynb fixed typo in tools_matplotlib.ipynb 2016-03-04 08:49:56 +01:00
tools_numpy.ipynb Remove one level of headers 2016-03-03 18:40:31 +01:00
tools_pandas.ipynb Add datasets, fix a few math linear algebra issues 2016-05-03 11:35:17 +02:00

README.md

Machine Learning Notebooks

This project aims to teach you the fundamentals of Machine Learning in Python. It contains the example code and the solutions to the exercises in my O'Reilly book Hands-On Machine Learning with Scikit-Learn and TensorFlow.

Simply open the Jupyter notebooks you are interested in:

  • Using jupyter.org's notebook viewer
  • or by cloning this repository and running Jupyter locally. This option lets you play around with the code. In this case, follow the installation instructions below.

Installation

First, you will need to install git, if you don't have it already.

Next, clone this repository by opening a terminal and typing the following commands:

$ cd $HOME  # or any other development directory you prefer
$ git clone https://github.com/ageron/handson-ml.git
$ cd handson-ml

If you want to go through chapter 16 on Reinforcement Learning, you will need to install OpenAI Gym and its dependencies for Atari simulations.

If you are familiar with Python and you know how to install Python libraries, go ahead and install the libraries listed in requirements.txt and jump to the Starting Jupyter section. If you need detailed instructions, please read on.

Python & Required Libraries

You will of course need Python. Python 2 is preinstalled on most systems, and sometimes Python 3 as well. You can check which version(s) you have by typing the following commands:

$ python --version   # for Python 2
$ python3 --version  # for Python 3

Any Python 3 version should be fine, preferably ≥3.5. If you don't have Python 3, I recommend installing it (Python ≥2.6 should work, but Python 2 is deprecated, so Python 3 is preferable). To do so, you have several options. On Windows or MacOSX, you can just download it from python.org. On MacOSX, you can alternatively use MacPorts or Homebrew. On Linux, unless you know what you are doing, you should use your system's packaging system. For example, on Debian or Ubuntu, type:

$ sudo apt-get update
$ sudo apt-get install python3
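
To check the version from inside Python rather than from the shell, something like the following may help. It is only a sketch: the function name describe_python and the verdict strings are made up for illustration, and the thresholds simply mirror the text above (≥3.5 preferred, ≥2.6 workable).

```python
import sys

# Thresholds taken from the README text: ">=3.5 preferred", ">=2.6 should work".
MIN_RECOMMENDED = (3, 5)
MIN_PYTHON2 = (2, 6)


def describe_python(version_info=None):
    """Return a short verdict for a (major, minor, ...) version tuple.

    The verdict strings are illustrative labels, not an official scheme.
    """
    if version_info is None:
        version_info = sys.version_info
    major, minor = version_info[0], version_info[1]
    if (major, minor) >= MIN_RECOMMENDED:
        return "ok"
    if major == 3:
        return "python3-older"       # any Python 3 should be fine, newer is preferred
    if (major, minor) >= MIN_PYTHON2:
        return "python2-deprecated"  # should work, but Python 3 is preferable
    return "unsupported"


if __name__ == "__main__":
    # Print the running interpreter's version and its verdict.
    print(sys.version.split()[0], describe_python())
```

Running the file with the interpreter you plan to use prints its version followed by the verdict.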

Another option is to download and install Anaconda. This is a distribution that includes both Python and many scientific libraries. You should prefer the Python 3 version.

If you choose to use Anaconda, read the next section, or else jump to the Using pip section.

Using Anaconda

When using Anaconda, you can optionally create an isolated Python environment dedicated to this project. This is recommended as it makes it possible to have a different environment for each project (e.g. one for this project), with potentially different libraries and library versions:

$ conda create -n mlbook python=3.5 anaconda
$ source activate mlbook

This creates a fresh Python 3.5 environment called mlbook (you can change the name if you want to), and it activates it. This environment contains all the scientific libraries that come with Anaconda. This includes all the libraries we will need (NumPy, Matplotlib, Pandas, Jupyter and a few others), except for TensorFlow, so let's install it:

$ conda install -n mlbook -c conda-forge tensorflow=1.0.0

This installs TensorFlow 1.0.0 in the mlbook environment (fetching it from the conda-forge repository). If you chose not to create an mlbook environment, then just remove the -n mlbook option.
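
If you did create the mlbook environment, one way to confirm that a Python session is really running inside it is to look at the CONDA_DEFAULT_ENV variable, which conda's activation scripts export. A minimal sketch, assuming that variable is set by your conda version (the helper name active_conda_env is hypothetical):

```python
import os


def active_conda_env(environ=None):
    """Return the name of the activated conda environment, or None.

    Assumption: conda's activation exports CONDA_DEFAULT_ENV. If no
    environment is activated, the variable is normally absent.
    """
    if environ is None:
        environ = os.environ
    return environ.get("CONDA_DEFAULT_ENV") or None


if __name__ == "__main__":
    env = active_conda_env()
    if env == "mlbook":
        print("Running inside the mlbook environment")
    else:
        print("Active conda environment:", env)
```

The environ parameter exists only so the function can be tried with a plain dictionary instead of the real process environment.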

Next, you can optionally install Jupyter extensions. These are useful to have nice tables of contents in the notebooks, but they are not required.

$ conda install -n mlbook -c conda-forge jupyter_contrib_nbextensions

You are all set! Next, jump to the Starting Jupyter section.

Using pip

If you are not using Anaconda, you need to install several scientific Python libraries that are necessary for this project, in particular NumPy, Matplotlib, Pandas, Jupyter and TensorFlow (and a few others). For this, you can either use Python's integrated packaging system, pip, or you may prefer to use your system's own packaging system (if available, e.g. on Linux, or on MacOSX when using MacPorts or Homebrew). The advantage of using pip is that it is easy to create multiple isolated Python environments with different libraries and different library versions (e.g. one environment for each project). The advantage of using your system's packaging system is that there is less risk of having conflicts between your Python libraries and your system's other packages. Since I have many projects with different library requirements, I prefer to use pip with isolated environments.

These are the commands you need to type in a terminal if you want to use pip to install the required libraries. Note: in all the following commands, if you chose to use Python 2 rather than Python 3, you must replace pip3 with pip, and python3 with python.

First you need to make sure you have the latest version of pip installed:

$ pip3 install --user --upgrade pip

The --user option will install the latest version of pip only for the current user. If you prefer to install it system wide (i.e. for all users), you must have administrator rights (e.g. use sudo pip3 instead of pip3 on Linux), and you should remove the --user option. The same is true of the command below that uses the --user option.

Next, you can optionally create an isolated environment. This is recommended as it makes it possible to have a different environment for each project (e.g. one for this project), with potentially very different libraries, and different versions:

$ pip3 install --user --upgrade virtualenv
$ virtualenv -p `which python3` env

This creates a new directory called env in the current directory, containing an isolated Python environment based on Python 3. If you installed multiple versions of Python 3 on your system, you can replace `which python3` with the path to the Python executable you prefer to use.

Now you must activate this environment. You will need to run this command every time you want to use this environment.

$ source ./env/bin/activate
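
Since the environment has to be activated in every new terminal, it is easy to forget. The following sketch shows one way to check from Python whether the interpreter is running inside an isolated environment. The function name in_isolated_environment is made up for illustration, and the optional sys_module parameter is there only to make the logic easy to try out.

```python
import sys


def in_isolated_environment(sys_module=sys):
    """Return True when the interpreter runs inside a virtualenv or venv."""
    # Older virtualenv releases set sys.real_prefix inside the environment.
    if hasattr(sys_module, "real_prefix"):
        return True
    # venv and newer virtualenv releases keep the original installation in
    # sys.base_prefix, so it differs from sys.prefix inside an environment.
    base_prefix = getattr(sys_module, "base_prefix", sys_module.prefix)
    return base_prefix != sys_module.prefix


if __name__ == "__main__":
    print("isolated environment:", in_isolated_environment())
```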

Next, use pip to install the required Python packages. If you are not using virtualenv, you should add the --user option (alternatively you could install the libraries system-wide, but this will probably require administrator rights, e.g. using sudo pip3 instead of pip3 on Linux).

$ pip3 install --upgrade -r requirements.txt

Great! You're all set. You just need to start Jupyter now.
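
Whichever installation route you took, a quick sanity check is to ask Python whether the main libraries can be found. The sketch below is an illustration only: the list covers just the libraries named in this README (requirements.txt lists a few others), and missing_modules is a hypothetical helper name. It looks modules up without importing them, so it runs quickly even for TensorFlow.

```python
import importlib.util

# Top-level module names for the libraries named in the README.
# requirements.txt may list more; extend this tuple as needed.
REQUIRED = ("numpy", "matplotlib", "pandas", "jupyter", "tensorflow")


def missing_modules(names=REQUIRED):
    """Return the names that the current interpreter cannot locate."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Not found:", ", ".join(missing))
    else:
        print("All listed libraries were found")
```

Run it with the same interpreter (and the same activated environment) that you will use to start Jupyter, otherwise the result says nothing about the notebooks' environment.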

Starting Jupyter

If you want to use the Jupyter extensions (optional, they are mainly useful to have nice tables of contents), you first need to install them:

$ jupyter contrib nbextension install --user

Then you can activate an extension, such as the Table of Contents (2) extension:

$ jupyter nbextension enable toc2/main

Okay! You can now start Jupyter. Simply type:

$ jupyter notebook

This should open up your browser, and you should see Jupyter's tree view, with the contents of the current directory. If your browser does not open automatically, visit localhost:8888. Click on index.ipynb to get started!

Note: you can also visit http://localhost:8888/nbextensions to activate and configure Jupyter extensions.
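
If the page does not load, it can help to check whether anything is listening on the port at all. Here is a minimal sketch, assuming Jupyter uses port 8888 as described above (it may pick a different port if 8888 is taken, in which case the terminal output shows the actual URL). The helper name is_port_open is hypothetical.

```python
import socket


def is_port_open(host="localhost", port=8888, timeout=1.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        # create_connection raises OSError (e.g. connection refused, timeout)
        # when nothing is reachable at the given address.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    if is_port_open():
        print("Something is listening on localhost:8888")
    else:
        print("Nothing answered on localhost:8888")
```

This only tells you that some server answers on that port, not that it is Jupyter.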

Congrats! You are ready to learn Machine Learning, hands on!