Environment Magic with Jupyter and Docker

This article is the third in our new series of articles for 2019: perspectives from our expert instructors. Keep your eye on this space to get the latest in what’s happening in code education, straight from our instructors. This month, Jonathan Frazier, Python/Data Engineering Instructor, covers some great tools you should know about.

If you enjoy using logic and creativity to solve problems, programming is not only a great career option, it can also be a lot of fun. Unfortunately, novice programmers often face a tricky chicken-and-egg scenario when getting started: in order to learn and practice programming, you need to know enough about how things work to get going. Setting up an environment for programming can be a frustrating experience, especially if all you really want to do is jump in and start programming. Luckily, Docker and Jupyter Notebooks are here to save the day.

Jupyter Notebooks

Jupyter (https://jupyter.org/) gives us a quick and easy way to get an interactive programming environment for our language of choice. Jupyter is a teaching tool that was originally developed for data science languages like Python, R, and Julia, but has been extended to run code in a number of other languages (Java, C#, Ruby, JavaScript, etc.).

Jupyter lets you create notebooks that can mix sections of interactive, editable code blocks with plain text or Markdown. This is great for creating interactive programming lessons with explanations of code and additional content, or as a programming sketchpad where you can test out small chunks of code.

The notebooks run in the browser, so they’re easy to work with even if you’re not an experienced coder. This makes it easy to test out your code or work through programming examples, then save your work to share with others. There is also a great learning community built around Jupyter, so you can find lots of lessons and example notebooks online.
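To make the idea of mixed text and code cells a bit more concrete, here is a minimal sketch that writes a tiny notebook file using only Python's standard library. It assumes the version 4 notebook JSON layout; the file name `hello.ipynb` and the function names are made up for illustration, and the kernel metadata is left empty, so Jupyter may ask which kernel to use when you open the file.

```python
import json
import tempfile
from pathlib import Path


def make_notebook() -> dict:
    """Build a minimal notebook with one Markdown cell and one code cell."""
    markdown_cell = {
        "cell_type": "markdown",
        "metadata": {},
        "source": ["# My first notebook\n", "Text cells explain the code below."],
    }
    code_cell = {
        "cell_type": "code",
        "execution_count": None,  # the cell has not been run yet
        "metadata": {},
        "outputs": [],
        "source": ["print('Hello World!')"],
    }
    return {
        "cells": [markdown_cell, code_cell],
        "metadata": {},  # assumption: no kernel information recorded
        "nbformat": 4,
        "nbformat_minor": 4,
    }


def save_notebook(notebook: dict, path: Path) -> Path:
    """Write the notebook dictionary to disk as JSON and return the path."""
    path.write_text(json.dumps(notebook, indent=1), encoding="utf-8")
    return path


if __name__ == "__main__":
    # hello.ipynb is a hypothetical file name, written to a temporary folder
    target = Path(tempfile.mkdtemp()) / "hello.ipynb"
    print(save_notebook(make_notebook(), target))
```

A notebook is just a JSON file, which is part of why notebooks are easy to save and share.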

I’ll only be giving a basic introduction to Jupyter in this blog post, but there is a wealth of online resources that can help you learn more and find cool notebooks to run. Here are a couple of resources that might be helpful if you want to dive deeper into Jupyter.

Learn more

Docker

Running code from your favorite programming language in the browser sounds like fun, but how do you set everything up? That’s where Docker comes in. Docker is a tool commonly used by development operations (DevOps) engineers to automate the process of building and deploying environments. It reads a special file called a ‘Dockerfile’, which describes all the steps needed to create the environment, and uses it to build an image of that environment. You can then run that image in a Docker ‘container’, which is like a virtual computer that runs on your machine.

Source: https://medium.com/platformer-blog/practical-guide-on-writing-a-dockerfile-for-your-application-89376f88b3b5

Anyone with the Dockerfile can build and access an exact copy of that environment, meaning environments can easily be shared, maintained, and version controlled. If you don’t know the exact steps for setting up the environment, there are a number of public Dockerfiles available for popular environments. Many of these are published as pre-built images on Docker Hub (the online registry for sharing Docker images), and can be downloaded directly using the docker command line tool.

To get access to Jupyter Notebooks (or any of the many programming environments, servers, or databases that have been ‘dockerized’), we just need a Docker image for it. Luckily, the good folks at Jupyter have put together a nice collection of images for a variety of Jupyter notebook environments and pushed them to Docker Hub. We can pick any language we want to try, then use Docker to run our environment and jump right into the fun part: writing code.

This blog post only covers the Docker basics needed to run a pre-built environment for Jupyter, but there’s a lot more that you can do with Docker. If you want to learn more about building your own Dockerfiles or deploying environments with docker-compose, here are a few helpful links to get you started.

Learn more

You will probably also want to get familiar with the Anaconda environment for Python (https://www.anaconda.com/). Jupyter works well with Anaconda, which also has a lot of convenient tools for installing libraries and maintaining your Python environment.

Docker Setup Guide

Tool Installation

First things first, we’ll need to get Docker set up to access all these powerful features. If you’re using Windows or Mac, you can install Docker Desktop, or if you’re using a Linux system you can install Docker using your package manager. Find the instructions for your operating system here: https://docs.docker.com/install

If you’re using Windows, you will also want to install Git Bash, because it gives you a Linux-like terminal that makes it easy to follow command line instructions, which are usually written in Linux style. There’s an easy-to-use installer here: https://gitforwindows.org/

Running a Notebook

Once you have Docker on your computer, you can pull the image for the Jupyter notebook environment you prefer using the command line tool. Open your terminal (Git Bash on Windows, Terminal on Mac, or bash on Linux). We’ll set up a data science environment that will allow us to run code in Python, Julia, or R.

First we’ll download the pre-built image with the following command:

docker pull jupyter/datascience-notebook:latest

This may take a few minutes (the image is a large download, at least several hundred megabytes). Then we can create a Docker container from the image and start it with:

docker run -p 8888:8888 --name jupyter jupyter/datascience-notebook:latest

We use the `-p` flag to tell Docker to connect port 8888 of the computer to port 8888 on the container, so we can connect to the Jupyter notebook server inside the container. We also use the `--name` flag to give us a convenient name for referring to this container in later commands. After you run this command, it will show you the URL for connecting to the notebook server, and give you a token for logging in. Open a new browser window and go to the URL to test it out:

http://localhost:8888/?token=<token_from_terminal_output>

We can create a new notebook file by clicking the new notebook dropdown button and selecting a notebook kernel to connect to. For example, selecting Python 3 will open a new Jupyter notebook for Python, where you can start typing and running Python commands.

In [1]: print('Hello World!')
Hello World!
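If you’re not sure what to type next, here are a few small snippets you could try, one per notebook cell. This is only a sketch of the kind of thing a first session might include; the variable and function names are purely illustrative.

```python
# Each section below could be pasted into its own notebook cell.

# Cell 1: variables and formatted strings
name = "Jupyter"
greeting = f"Hello, {name}!"
print(greeting)

# Cell 2: build a list with a list comprehension
squares = [n * n for n in range(1, 6)]
print(squares)


# Cell 3: define and call a small function
def average(values):
    """Return the arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


print(average(squares))
```

Variables defined in one cell stay available in later cells, so you can build up a program step by step.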

Take some time to test out some Python commands in your notebook, then we can take a few steps to make our environment easier to use. Close the Jupyter notebook server by going back to our terminal window and pressing `Ctrl-C`. This should return us to the command line prompt.

We can show any running docker containers with `docker ps`. If you see the Jupyter container in your list, or if you get an error saying /jupyter is in use, you can run this command to stop and remove the container:

docker stop jupyter && docker rm jupyter

If you didn’t name your container, you can also use the container id from `docker ps` to stop and remove the container.

Docker commands

There are a lot of different things we can do with the Docker command line tool, like building new containers, checking out what’s running on our machine, or tagging images and sharing them online.

Source: https://docs.docker.com/get-started/part2/#recap-and-cheat-sheet-optional

If you want to learn more about the different commands available for Docker and the different flags that can be used, check out the help pages on the command line with `docker --help` or `docker <command> --help` (e.g. `docker run --help`).

Advanced Setup

To make things a little more convenient, we can specify the token Jupyter notebooks should use for login by setting an environment variable with the `--env` flag. We can also mount a folder as a volume on our container to hold all our notebooks as we’re working on them. This lets us save our notebooks locally, while still being able to access them in our container.

First let’s make a new folder for our notebooks in our home directory (represented by `~`):

mkdir ~/notebooks

Then we can start our notebook server with that folder mounted on the container as the default notebook directory (which is set to the path `/home/jovyan/work` on the container):

docker run -d -p 8888:8888 --name jupyter \
--env JUPYTER_TOKEN=jupyter_notebook_token \
--volume "$HOME/notebooks":/home/jovyan/work \
jupyter/datascience-notebook:latest

The backslash at the end of each line lets the command continue on the next line; you could also run the same command on a single line without the backslashes. We use `$HOME` rather than `~` for the notebooks folder because the shell does not expand `~` inside double quotes. Also, because we’re specifying the token ourselves, we don’t need to see the console output, so we add the `-d` flag to run our container in detached mode in the background. If we want to see the output from the container, we can use `docker logs jupyter`.
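If you’d rather script this step, the command above can be assembled in Python. The sketch below only builds and prints the command; the function name `build_run_command` and its parameters are hypothetical, and it assumes Docker wants an absolute host path for the mounted folder, so the home directory is expanded in Python before the command is built.

```python
import shlex
from pathlib import Path


def build_run_command(image, token, notebook_dir,
                      container_dir="/home/jovyan/work",
                      name="jupyter", port=8888):
    """Return the docker run command as a list of arguments."""
    # Expand "~" ourselves so Docker receives an absolute host path.
    host_dir = Path(notebook_dir).expanduser().resolve()
    return [
        "docker", "run", "-d",                      # detached mode
        "-p", f"{port}:{port}",                     # host port : container port
        "--name", name,
        "--env", f"JUPYTER_TOKEN={token}",          # login token for the server
        "--volume", f"{host_dir}:{container_dir}",  # host folder : container folder
        image,
    ]


if __name__ == "__main__":
    command = build_run_command(
        "jupyter/datascience-notebook:latest",
        "jupyter_notebook_token",
        "~/notebooks",
    )
    # Print a copy-and-paste friendly version of the command.
    print(shlex.join(command))
    # To actually start the container, you could pass the list to
    # subprocess.run(command, check=True) on a machine with Docker installed.
```

Keeping the command as a list of arguments avoids shell quoting surprises if the notebooks folder has spaces in its name.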

Now we can log into Jupyter notebook with the token we specified in our environment variable:

http://localhost:8888/?token=jupyter_notebook_token
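The login address is just the server URL with the token attached as a query parameter. As a small sketch (the helper names `login_url` and `token_from_url` are made up for this example), Python’s standard library can build that address for any token, including tokens with characters that need escaping, and read the token back out of an address copied from the terminal.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def login_url(token, host="localhost", port=8888):
    """Build the notebook login URL for the given token."""
    return f"http://{host}:{port}/?{urlencode({'token': token})}"


def token_from_url(url):
    """Return the token query parameter from a URL, or None if it is missing."""
    values = parse_qs(urlsplit(url).query).get("token", [])
    return values[0] if values else None


if __name__ == "__main__":
    url = login_url("jupyter_notebook_token")
    print(url)
    print(token_from_url(url))
```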

If you’d like to try out notebooks you find online, you just need to put them in your notebooks folder and then reload your webpage.

More languages

Jupyter is great for data science, but many other programming languages can be used with Jupyter Notebooks and Docker. To help get you started, I have included a few modified Docker commands that can be used to set up each environment. Once you have Docker installed, any environment available on Docker Hub is only one command away.

Jupyter Docker (https://hub.docker.com/r/vhtec/jupyter-docker/)

If you want to learn web development, you’ll need to pick up some skills with JavaScript, and this environment also includes the popular PHP scripting language, which is widely used online. Or you could sharpen your shell scripting skills by practicing with Bash and start on your path towards being a command line guru. If you want a challenge, try out C++, which is a go-to language for video game developers and fans of fast-running code. All of these kernels are available in the Jupyter Docker notebook.

docker run -d -p 8888:8888 --name jupyter \
--env JUPYTER_TOKEN=jupyter_notebook_token \
--volume "$HOME/notebooks":/home/jovyan/work vhtec/jupyter-docker:latest

BeakerX (https://hub.docker.com/r/beakerx/beakerx, http://beakerx.com/)

BeakerX includes many of the languages that run on the Java Virtual Machine. This means you can try out Java, one of the most popular languages for object-oriented programming (OOP), as well as up-and-coming JVM languages like Clojure, Scala, and Groovy. It also includes Kotlin, making this a great learning environment for aspiring Android app developers.

docker run -d -p 8888:8888 --name jupyter \
--env JUPYTER_TOKEN=jupyter_notebook_token \
--volume "$HOME/notebooks":/home/beakerx/work beakerx/beakerx:latest

BeakerX is a great environment to implement XKCD’s “bonding” program in Java

CSharp Notebook (https://hub.docker.com/r/tlinnet/csharp-notebook)

C# is the go-to language for .NET developers, so this is a great place to start if you’re interested in development using the Microsoft stack.

docker run -d -p 8888:8888 --name jupyter \
--env JUPYTER_TOKEN=jupyter_notebook_token \
--volume "$HOME/notebooks":/home/jovyan/work \
tlinnet/csharp-notebook:notebooks

SciRuby iRuby Notebook (https://hub.docker.com/r/sciruby/iruby-notebook/)

Ruby is one of my personal favorite programming languages, probably due to its philosophy of being optimized for developer happiness. It’s a powerful and flexible language when used for scripting, and also features the first-class web development framework Rails.

docker run -d -p 8888:8888 --name jupyter \
--volume "$HOME/notebooks":/home/jovyan/work \
sciruby/iruby-notebook start-notebook.sh \
--NotebookApp.token=jupyter_notebook_token

This last command uses docker run to run the start-notebook.sh file inside our container and specifies the authentication token using an argument to that script.
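The four commands in this section differ only in the image name, the folder the notebooks are mounted to, and how the token is passed. As a rough summary, the sketch below collects those differences in one table and builds the matching command for each. The short keys (`web`, `jvm`, `csharp`, `ruby`) and the function name are invented for this example; the image names, paths, and flags are the ones used in the commands above.

```python
import shlex
from pathlib import Path

# Image, mount point, and token mechanism for each environment in this section.
ENVIRONMENTS = {
    "web": {"image": "vhtec/jupyter-docker:latest",
            "workdir": "/home/jovyan/work", "token_via": "env"},
    "jvm": {"image": "beakerx/beakerx:latest",
            "workdir": "/home/beakerx/work", "token_via": "env"},
    "csharp": {"image": "tlinnet/csharp-notebook:notebooks",
               "workdir": "/home/jovyan/work", "token_via": "env"},
    "ruby": {"image": "sciruby/iruby-notebook",
             "workdir": "/home/jovyan/work", "token_via": "arg"},
}


def run_command(key, token="jupyter_notebook_token", notebooks="~/notebooks"):
    """Return the docker run argument list for one of the environments."""
    env = ENVIRONMENTS[key]
    host_dir = Path(notebooks).expanduser()  # expand "~" before Docker sees it
    cmd = ["docker", "run", "-d", "-p", "8888:8888", "--name", "jupyter"]
    if env["token_via"] == "env":
        # Most images read the token from an environment variable.
        cmd += ["--env", f"JUPYTER_TOKEN={token}"]
    cmd += ["--volume", f"{host_dir}:{env['workdir']}", env["image"]]
    if env["token_via"] == "arg":
        # The Ruby image gets the token as an argument to start-notebook.sh.
        cmd += ["start-notebook.sh", f"--NotebookApp.token={token}"]
    return cmd


if __name__ == "__main__":
    for key in ENVIRONMENTS:
        print(shlex.join(run_command(key)))
```

Because every command uses the container name `jupyter`, only one of these environments can exist at a time unless you stop and remove the previous container first.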

Conclusion

Jupyter Notebooks and Docker are great tools for quickly getting things up and running to try out programming. Jupyter lets you try out code in pretty much any language you can think of. Pick an environment and get started with some of these programming challenge problems: https://adriann.github.io/programming_problems.html

If you enjoy programming and want to learn more and deepen your skills, check out some of the course offerings from The Software Guild. It’s a great way to learn from experienced programming professionals and can give you a leg up on starting a great new career in software development.

Learn Java: https://www.thesoftwareguild.com/coding-bootcamps/java-training/

Learn C# .Net: https://www.thesoftwareguild.com/coding-bootcamps/asp-net-c-sharp-training/