Assuming you have already downloaded the Google Cloud SDK Docker image and registered to use Google Compute Engine as outlined in the Week 13 README, we can now learn how to use Google Compute Engine to run our class Docker containers. There are two concurrent sets of processes to keep track of in the rest of this Notebook. The first set takes place in a Web browser, where you register to use Google Compute Engine and subsequently create, monitor, and delete virtual machine instances. The second set takes place locally in your detached Google Cloud SDK Docker container.
Before proceeding, start Boot2Docker and at the Boot2Docker prompt, enter
docker ps -a | grep gcloud
If this command returns nothing, you should try executing this command to create and initialize the Google Cloud SDK Docker container:
docker run -it --name gcloud-config google/cloud-sdk gcloud auth login
After running this command, you should be able to see an exited Docker container.
bash-3.2$ docker ps -a | grep gcloud
7db61cac9698 google/cloud-sdk:latest "gcloud auth login" 52 seconds ago
Exited (0) 17 seconds ago gcloud-config
If this previous step fails, review the steps listed in the Week 8 README.
On the other hand, if you have one or more containers listed (most of them may have Exited) you should be ready to execute Docker commands to interact with Google Compute Engine. First, however, you might want to update the Google Cloud SDK Docker container by issuing the following command:
docker run -it --volumes-from gcloud-config google/cloud-sdk \
gcloud components update
This command may return a message that All components are up to date,
or alternatively you may be presented with a list of components along
with a prompt asking if you wish to continue. Enter Y for yes and
allow the updates to be downloaded and installed. Once this process is
completed you should proceed to registering for access to Google Compute
Engine.
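The container check described earlier (list the containers, and create gcloud-config only if nothing is found) can be collected into a small shell function. This is a minimal sketch rather than part of the course material: the function name ensure_gcloud_config is hypothetical, and it assumes the container is named gcloud-config exactly as in the commands above.

```shell
# Hypothetical helper: create the gcloud-config container only if it is missing.
ensure_gcloud_config() {
  if docker ps -a | grep -q gcloud-config; then
    echo "gcloud-config container found"
  else
    # Same command as in the text; it prompts interactively for Google credentials.
    docker run -it --name gcloud-config google/cloud-sdk gcloud auth login
  fi
}
```

After defining the function at the Boot2Docker prompt, entering ensure_gcloud_config either reports the existing container or starts the login process.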
To start the first set of processes, you must register to use Google Compute Engine. Normally this step requires payment since you are actively using compute resources owned by Google. However, we will sign up for the free sixty-day trial period. The first step is to visit Google Compute Engine in a Web browser (click the link and you should be taken to the website). You will first need to provide your Google credentials, which might require two-step verification (for example, Google may send a verification code to your cell phone).
After you have authenticated, you will need to create a new project. You can do this by selecting the Create a new project item from the drop-down menu in the center of the page and entering a suitable name for your project (for example, RP Data Science 2015). Your Web page should resemble the following screenshot:
When you have completed this page, copy the Project ID (circled in red in the above screenshot, but your id will be different) to a text file for later use, select the checkbox stating that you agree to the terms of service, and click the Create button. A message window showing the Google Compute Engine Activities may appear as the new project is created.
Once the project has been created, we need to enter personal information to complete the project creation (note that this only needs to be done for the first project). To enter this information, select the Billing & settings item under the Projects tab on the left-hand side of the web page (shown in the following image).
Even though this is a free trial, Google will collect your personal information and a credit card number to (according to Google) verify that there is a person completing the form. Multiple assurances are given that no billing will be applied to your account without your express agreement. If you use all of your free resources, your account will be locked until you provide payment. After selecting Billing & settings you will be presented with a new Web page that provides this assurance, your project information, and a button entitled Sign up for a free trial; an example page is shown below.
Click on the free trial button, and enter your personal information. Your account type will most likely be Individual (for this class at least). Once you have completed the account sign-up phase, you will need to click on the Accept and start free trial button.
At this point, you should switch back to your Boot2Docker prompt and enter the following Docker command to assign your project id to your Google Cloud SDK Docker container. For example, my project-id is proven-reporter-88419 so I used the following Docker command:
docker run -it --volumes-from gcloud-config google/cloud-sdk \
gcloud config set project proven-reporter-88419
You will need to use your project-id as indicated by Google on your project Web page.
You should now switch back to the web-based Google Developers Console to finish setting up your Google Compute Engine. First, select APIs under the APIs & auth menu item on the left-hand side of the Web page. You will be presented with a list of APIs with sliders indicating the status of each API. Scroll down to the Google Compute Engine slider and click on it to change it from Off to On. This should open a new Web page as shown in the following screenshot.
Now that Google Compute Engine has been enabled, we can create a virtual machine to run our Docker container. On the left-hand menu, select VM instances under the Compute menu item. This will allow you to either Create instance or Take the quickstart as shown below.
Click on Create instance, which will open a long form in which you can customize your new Virtual Machine instance. You will want to enter an instance name (for example, rppds-15), select Allow HTTP Traffic, select the us-central1-a zone, a standard n1-standard-1 Machine Type, and ubuntu-1404-trusty Image, as shown in the following screenshot.
Once you have filled out the form, click the Create button. You may be presented with an Activities window similar to the one shown below. This simply provides an update to the status of creating your new virtual machine.
After your new virtual machine has been created, you will be presented with a Google Compute Engine dashboard that provides an overview of your new virtual machine. At the moment, as shown below, your dashboard is fairly empty, but this will change as you begin to use your new virtual machine to run a Docker container.
Circled in red on the dashboard image above are two key pieces of information you need to retain: the project name and the external IP address of your virtual machine (the IP address will change if you stop and restart your virtual machine).
At this point, you can return to your Google Cloud SDK Docker container and test the connection to your new Google Compute Engine by entering the following command (shown below with my output).
bash-3.2$ docker run --rm -ti --volumes-from gcloud-config google/cloud-sdk \
gcutil listinstances
+----------+---------------+---------+----------------+-----------------+
| name | zone | status | network-ip | external-ip |
+----------+---------------+---------+----------------+-----------------+
| rppds-15 | us-central1-a | RUNNING | 10.240.225.126 | 130.211.153.108 |
+----------+---------------+---------+----------------+-----------------+
You now have a virtual machine running in Google Compute Engine, and the
ability to connect to this virtual machine from your Google Cloud SDK
Docker container. The next step is to actually connect by using ssh
with the Google Docker container. This is done by issuing an ssh
command from the Google Cloud SDK Docker container:
docker run --rm -ti --volumes-from gcloud-config google/cloud-sdk \
gcloud compute ssh rppds-15 --zone us-central1-a
You will be prompted for a passphrase; you can safely hit enter twice without entering anything. After a period of time, once the connection is established, you will be given a prompt in your new virtual machine running in Google Compute Engine.
This virtual machine is running the Ubuntu operating system (since that
is what we specified during the creation of the virtual machine). Thus
the first step should be to update the package list, which you do by
entering sudo apt-get update at the VM shell prompt. After this you
should apply any available software updates, which is done by entering
sudo apt-get -y upgrade at the VM prompt. We specified the -y flag to
automatically answer yes to any queries since we will want to apply any
upgrades. Finally, we should apply any distribution upgrades, which is
done by entering sudo apt-get -y dist-upgrade at the VM prompt.
Once the update and upgrade process is completed you need to download
the Docker tools for Ubuntu by entering sudo apt-get -y install docker.io
at the VM prompt. During this process, the apt-get program would normally
ask if you want to continue and download the necessary files, but since
we used the -y flag, apt-get will automatically download all necessary
files. The next step is to enable tab completion, which is done by
entering source /etc/bash_completion.d/docker.io at the VM prompt.
Finally, we need to ensure we have the latest Docker components by
processing a web-accessible file on the docker.com website. The easiest
technique for doing this is to use the curl program, which is similar to
wget, except in this case we will download the file and pass it directly
into a Unix shell for immediate processing. This is done by entering
curl -sSL https://get.docker.com/ubuntu/ | sudo sh at the VM prompt.
To summarize, we need to execute the following commands in order:
sudo apt-get update
sudo apt-get -y upgrade
sudo apt-get -y dist-upgrade
sudo apt-get -y install docker.io
source /etc/bash_completion.d/docker.io
curl -sSL https://get.docker.com/ubuntu/ | sudo sh
After these steps have all been completed, you will now have an Ubuntu virtual machine that is ready to run Docker containers.
You now have a working Ubuntu virtual machine that is ready to run
Docker commands. Since you are no longer using Boot2Docker, you would
normally need to enter sudo before any docker commands. However, since
we are the root user in this Ubuntu VM (because we did not create any
user accounts) we can enter Docker commands in the same manner as we
have at the Boot2Docker shell.
The first step is to download our course Docker image (alternatively,
you could also download the Hadoop container) by entering docker pull
lcdm/info490 at the VM prompt. Next, we need to make a new directory
that will contain our course material. In keeping with tradition, we
will call the directory i2ds; thus you need to enter mkdir i2ds followed
by cd i2ds. Now we can clone the course git repository so that we can
easily run the IPython Notebooks for the course. We do this by entering
git clone https://github.com/INFO490/spring2015 at the VM prompt.
Now we are ready to run a Docker IPython Notebook server. We can now
change the default port mapping so that we can use the standard web
port 80 instead of the special port 8888. We also need to map our shared
folder into our Notebook server. In the end, our docker run command
takes the full form: docker run -d -p 80:8888 -e "PASSWORD=temp123" -v
/root/i2ds:/notebooks/i2ds lcdm/info490. At this point, you are running
our course Docker image inside the Google Compute Engine virtual machine
as shown in the following screenshot.
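Because this server is reachable from the public Internet, you may prefer a password that is harder to guess than temp123. The following is a minimal sketch, not part of the course instructions: the function names random_password and notebook_command are hypothetical, and it assumes /dev/urandom, head, od, and tr are available on the VM. It prints the docker run command with a chosen host port and password so you can review it before entering it.

```shell
# Hypothetical helper: build a 32-character hexadecimal password
# from 16 random bytes read from the kernel's random source.
random_password() {
  head -c 16 /dev/urandom | od -An -tx1 | tr -d ' \n'
}

# Hypothetical helper: print (not run) the docker run command from the text.
# Arguments: host port (defaults to 80) and the Notebook password.
notebook_command() {
  printf 'docker run -d -p %s:8888 -e "PASSWORD=%s" -v /root/i2ds:/notebooks/i2ds lcdm/info490\n' "${1:-80}" "$2"
}

# Show the command with the standard web port and a fresh random password.
notebook_command 80 "$(random_password)"
```

In the -p option the first number is the port on the virtual machine and the second is the port inside the container, so only the first value changes if you want to serve the Notebook on a different external port.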
To summarize, the following steps are required to successfully run our course Docker container to enable Web-accessible IPython Notebooks:
docker pull lcdm/info490
mkdir i2ds
cd i2ds
git clone https://github.com/INFO490/spring2015
docker run -d -p 80:8888 -e "PASSWORD=temp123" \
-v /root/i2ds:/notebooks/i2ds lcdm/info490
After these steps have been executed and the Docker container is running, you can open a Web browser to view the Google Compute Engine hosted course website. The URL for this server is the external IP address that was listed on your Google Compute Engine Dashboard; it is also available by using the Google Cloud SDK Docker container to list the instances you have running on Google Compute Engine. This second option can be done on your host machine by entering the following command at a Boot2Docker prompt:
docker run --rm -ti --volumes-from gcloud-config google/cloud-sdk \
gcutil listinstances
In the case outlined in this Notebook, the external IP address is
130.211.153.108. Entering this address in your web browser will allow
you to access the IPython Notebook server I have running on Google
Compute Engine, which can be seen in the following screenshot that was
available after entering the password used to start the server.
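If you save the instance listing to a file, the external IP address can be pulled out of the table automatically instead of being copied by hand. The sketch below is an illustration only: the function name external_ip is hypothetical, and it assumes the listing uses the same pipe-delimited table layout shown earlier in this Notebook, with the instance name in the first column and the external IP address in the fifth.

```shell
# Hypothetical helper: read a gcutil-style instance table on standard input and
# print the external-ip column for the instance whose name is given as $1.
external_ip() {
  awk -F'|' -v name="$1" '
    NF >= 6 {
      n = $2; ip = $6
      # Strip the padding (and any carriage returns) around each cell.
      gsub(/[ \t\r]/, "", n)
      gsub(/[ \t\r]/, "", ip)
      if (n == name) print ip
    }'
}
```

For example, if the table were saved in a file named instances.txt (a hypothetical file name), entering external_ip rppds-15 < instances.txt would print the address to type into your web browser.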
You can use this Notebook server in the same manner as you have been
using the course IPython Notebook server throughout class, only this
server is now running in the cloud. You can connect to the running
server via docker exec or, when needed, shut the server down with
docker stop. Finally, do note that leaving a virtual machine running
can consume resources; in the next section we review how to shut down
your virtual machine in order to reduce your resource consumption.
The last step should be to reclaim any resources you have running. If this virtual machine was running a server for a business, you likely would want to leave it running since this is now a publicly accessible site. However, for a personal server that is being used solely for development purposes it is best to shut the server down gracefully in order to prevent your limited resources from being exhausted (especially with a weak password that, in the case of this notebook, is also available online!).
To shut down the virtual machine, simply open the Google Compute Engine developer console, select your project from the list displayed, and click on VM instances in the left hand menu. Select the running rppds-15 virtual machine at the bottom of the page and next click on Stop as shown below:
You will be given a prompt asking you to confirm this decision to stop the virtual machine. After you select Stop, your virtual machine and the IPython Notebook server running on it will be shut down.
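The same stop request can probably also be made from your Google Cloud SDK Docker container instead of the web console. This is a hedged sketch rather than a step from the course: the function name stop_instance is hypothetical, and it assumes your SDK release includes the gcloud compute instances stop subcommand and that the instance name and zone match the ones chosen earlier.

```shell
# Hypothetical helper: stop a Compute Engine instance from the Google Cloud SDK
# container. Arguments: instance name and zone.
stop_instance() {
  docker run --rm -ti --volumes-from gcloud-config google/cloud-sdk \
    gcloud compute instances stop "$1" --zone "$2"
}
```

With the values used in this Notebook, you would enter stop_instance rppds-15 us-central1-a at the Boot2Docker prompt.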
Once a virtual machine has been stopped, you can restart it by clicking
the Start button (you may have to first select the appropriate virtual
machine if it is not already selected). The IPython Notebook server will
not restart automatically (we did not set it up to do so, although we
could). Thus you will need to ssh into the Google Compute Engine
virtual machine and issue the docker run command to start the IPython
Notebook server.
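As an alternative to issuing a new docker run command, the stopped container from your earlier session can be started again with docker start, which reuses the port mapping, password, and volume it was created with. The following is a minimal sketch under that assumption: the function name restart_notebook is hypothetical, and it simply picks the first container in the docker ps -a listing that mentions the lcdm/info490 image.

```shell
# Hypothetical helper, run on the virtual machine after reconnecting with ssh.
# It restarts a stopped course container if one exists; otherwise it prints a
# reminder to issue the full docker run command again.
restart_notebook() {
  id=$(docker ps -a | grep lcdm/info490 | awk '{print $1}' | head -n 1)
  if [ -n "$id" ]; then
    docker start "$id"
  else
    echo "no lcdm/info490 container found; use docker run as before"
  fi
}
```

Entering restart_notebook at the VM prompt either restarts the existing container or tells you that a fresh docker run is needed.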
You also can delete a virtual machine, but only do that when you know you will no longer need the virtual machine.