Docker environment enhancements

- rearranged a Dockerfile to allow for incremental build
- switched running from root to "default new user"
- added the (easy to opt-out) configuration to use blank password
- added python-graphviz which enables DT visualization in notebooks
- added nbdime for "sensible notebook comparison"
- added custom command to "nbdiff" a notebook with its checkpointed version
- added simple README.md
main
ziembla 2017-11-27 17:16:51 +01:00
parent ca0f70a6b9
commit 4fa5beb93a
5 changed files with 117 additions and 17 deletions


@ -1,19 +1,66 @@
FROM continuumio/anaconda3
WORKDIR /usr/src/project
COPY . /usr/src/project
RUN apt-get update && apt-get upgrade -y \
&& apt-get install -y \
libpq-dev \
build-essential \
git \
libpq-dev \
build-essential \
git \
sudo \
&& rm -rf /var/lib/apt/lists/*
&& rm -rf /var/lib/apt/lists/* \
RUN conda install -y -c conda-forge \
tensorflow=1.0.0 \
jupyter_contrib_nbextensions
&& conda install -y -c conda-forge tensorflow=1.0.0 \
&& conda install -y -c conda-forge jupyter_contrib_nbextensions \
ARG username
&& jupyter contrib nbextension install --user \
&& jupyter nbextension enable toc2/main
RUN adduser ${username} --gecos '' --disabled-password && \
echo "${username} ALL=(root) NOPASSWD:ALL" > /etc/sudoers.d/${username} && \
chmod 0440 /etc/sudoers.d/${username}
ENV HOME /home/${username}
WORKDIR ${HOME}/handson-ml
RUN chown ${username}:${username} ${HOME}/handson-ml
USER ${username}
RUN jupyter contrib nbextension install --user
RUN jupyter nbextension enable toc2/main
# INFO: Keep the RUN command below uncommented for an easy, constant URL (just localhost:8888)
# (by setting empty password instead of using a token)
# To avoid making a security hole the best would be to regenerate a hash for
# your own non-empty password and to replace the hash below.
# You can compute a password hash in the notebook, just run the code:
# from notebook.auth import passwd
# passwd()
RUN mkdir -p ${HOME}/.jupyter && \
echo 'c.NotebookApp.password = u"sha1:c6bbcba2d04b:f969e403db876dcfbe26f47affe41909bd53392e"' \
>> ${HOME}/.jupyter/jupyter_notebook_config.py
# INFO: Below - work in progress, nbdime not totally integrated, still:
# 1. enables diffing notebooks via nbdiff after connecting to container by "make exec" (docker exec)
# Use:
# nbd NOTEBOOK_NAME.ipynb
# to get nbdiff between checkpointed version and current version of the given notebook
# 2. allows decision tree visualization in notebook
# Use:
# from sklearn import tree
# from graphviz import Source
# Source(tree.export_graphviz(tree_clf, out_file=None, feature_names=iris.feature_names[2:]))
USER root
WORKDIR /
RUN conda install -y -c conda-forge nbdime
RUN conda install -y -c conda-forge python-graphviz
USER ${username}
WORKDIR ${HOME}/handson-ml
COPY docker/bashrc /tmp/bashrc
RUN cat /tmp/bashrc >> ${HOME}/.bashrc
RUN sudo rm -rf /tmp/bashrc
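
The password comment in the Dockerfile above relies on `notebook.auth.passwd()` to produce the hash. As a rough illustration of what a `sha1:salt:digest` string of that shape contains, the standard-library sketch below builds and checks one. It assumes the classic Notebook layout (SHA-1 over the passphrase bytes followed by the salt). `make_hash` and `check_hash` are hypothetical helper names, not part of Jupyter, and the real `passwd()` remains the way to generate the value that goes into the config.

```python
import hashlib
import secrets


def make_hash(passphrase: str, salt_len: int = 12) -> str:
    """Return 'sha1:<salt>:<digest>' for a passphrase (illustrative helper)."""
    # Random hex salt of salt_len characters (assumed layout, see lead-in).
    salt = secrets.token_hex(salt_len // 2)
    digest = hashlib.sha1(
        passphrase.encode("utf-8") + salt.encode("ascii")
    ).hexdigest()
    return f"sha1:{salt}:{digest}"


def check_hash(hashed: str, passphrase: str) -> bool:
    """Recompute the digest from the stored salt and compare it."""
    algorithm, salt, digest = hashed.split(":", 2)
    h = hashlib.new(algorithm)
    h.update(passphrase.encode("utf-8") + salt.encode("ascii"))
    # Constant-time comparison of the two hex digests.
    return secrets.compare_digest(h.hexdigest(), digest)


if __name__ == "__main__":
    stored = make_hash("my-own-password")
    print(stored)
    print(check_hash(stored, "my-own-password"))
```

Because the salt is random, every run prints a different string for the same password, which is why the hash in the Dockerfile can be replaced by any freshly generated one.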


@ -4,8 +4,10 @@ help:
run:
docker-compose up
exec:
docker-compose exec -ti hondson-ml /bin/bash
docker-compose exec handson-ml /bin/bash
build: stop .FORCE
docker-compose build
rebuild: stop .FORCE
docker-compose build --force-rm
stop:
docker stop handson-ml || true; docker rm handson-ml || true;

37
docker/README.md Normal file

@ -0,0 +1,37 @@
# Hands-on Machine Learning in Docker :-)
This is the Docker configuration which allows you to run and tweak the book's notebooks without installing any dependencies on your machine!
OK, any except `docker`. With `docker-compose`. Well, you may also want `make` (but it is only used as a thin layer to call a few simple `docker-compose` commands).
## Prerequisites
As stated, the two things you need are `docker` and `docker-compose`.
Follow the instructions on [Install Docker](https://docs.docker.com/engine/installation/) and [Install Docker Compose](https://docs.docker.com/compose/install/) for your environment if you haven't got `docker` already.
Some general knowledge about `docker` infrastructure might be useful (that's an interesting topic on its own) but is not strictly *required* to just run the notebooks.
## Usage
### Prepare the image (once)
Switch to the `docker` directory here and run `make build` (or `docker-compose build`) to build your docker image. That may take some time but is only required once. Or perhaps a few times after you tweak something in the `Dockerfile`.
After the process is finished you have a `handson-ml` image, which will be the base for your experiments. You can confirm that by looking at the output of the `docker images` command.
### Run the notebooks
Run `make run` (or just `docker-compose up`) to start the Jupyter server inside the container (also named `handson-ml`, same as the image). Just point your browser to <http://localhost:8888> or the URL printed on the screen and you're ready to play with the book's code!
The server runs in the directory containing the notebooks, and the changes you make from the browser will be persisted there.
You can stop the server just by pressing `Ctrl-C` in the terminal window.
### Run additional commands in container
Run `make exec` (or `docker-compose exec handson-ml bash`) while the server is running to run an additional `bash` shell inside the `handson-ml` container. Now you're inside the environment prepared within the image.
One of the useful things you can do there is compare versions of the notebooks using the `nbdiff` command if you haven't got `nbdime` installed locally (it is **way** better than plain `diff` for notebooks). See [Tools for diffing and merging of Jupyter notebooks](https://github.com/jupyter/nbdime) for more details.
You may also try the `nbd NOTEBOOK_NAME.ipynb` command (custom, defined in the Dockerfile) to compare one of your notebooks with its `checkpointed` version. To be precise, the output will tell you "what modifications should be re-played on the *manually saved* version of the notebook (located in the `.ipynb_checkpoints` subdirectory) to update it to the *current* i.e. *auto-saved* version (given as the command's argument - located in the working directory)".
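
The path mapping that `nbd` performs can be sketched in Python as follows. This is only an illustration of the naming convention described above (`.ipynb_checkpoints/NAME-checkpoint.ipynb` next to the notebook). `checkpoint_path` is a hypothetical helper name, not something shipped with this image or with `nbdime`.

```python
from pathlib import Path

SUFFIX = ".ipynb"


def checkpoint_path(notebook: str) -> Path:
    """Return where the checkpointed copy of a notebook is expected to live."""
    nb = Path(notebook)
    # Strip the .ipynb suffix only when present, like `basename "$1" .ipynb`.
    base = nb.name[: -len(SUFFIX)] if nb.name.endswith(SUFFIX) else nb.name
    return nb.parent / ".ipynb_checkpoints" / f"{base}-checkpoint{SUFFIX}"


if __name__ == "__main__":
    # Hypothetical notebook name, used only for demonstration.
    print(checkpoint_path("notes/example.ipynb"))
```

The two paths can then be passed to `nbdiff` with the checkpoint first and the working copy second, which is the order the `nbd` shell function uses.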

12
docker/bashrc Normal file

@ -0,0 +1,12 @@
alias ll="ls -l"
nbd() {
DIRNAME=$(dirname "$1")
BASENAME=$(basename "$1" .ipynb)
WORKING_COPY=$DIRNAME/$BASENAME.ipynb
CHECKPOINT_COPY=$DIRNAME/.ipynb_checkpoints/$BASENAME-checkpoint.ipynb
# echo "How change $CHECKPOINT_COPY into $WORKING_COPY"
nbdiff "$CHECKPOINT_COPY" "$WORKING_COPY"
}


@ -4,6 +4,8 @@ services:
build:
context: ../
dockerfile: ./docker/Dockerfile
args:
- username=devel
container_name: handson-ml
image: handson-ml
logging:
@ -13,5 +15,5 @@ services:
ports:
- "8888:8888"
volumes:
- ../:/usr/src/project
command: /opt/conda/bin/jupyter notebook --ip='*' --port=8888 --no-browser --allow-root
- ../:/home/devel/handson-ml
command: /opt/conda/bin/jupyter notebook --ip='*' --port=8888 --no-browser