Using Google Cloud Datalab - Accessing Cloud Data

This notebook describes how Google Cloud Datalab integrates within your Google Cloud project, and how you can work with data, manage your notebooks, and invoke APIs that are part of Google Cloud Platform.

An Under-the-Covers Look

Datalab functionality is packaged into a Docker container. The container provides a ready-to-use environment: the Python runtime, a set of libraries picked for data analysis and visualization scenarios, Google Cloud Platform integration functionality, and the front-end server that makes this environment available.

You can deploy one or more Datalab instances within your Google Cloud Platform project. Access to these instances is based on the IAM policies for your project. Note, however, that each instance is a single-user environment, and trying to share it can cause conflicts.

Within the instance, the Datalab frontend manages notebooks, notebook sessions, and the corresponding IPython and Python runtime instances.

Google Cloud Integration


In [ ]:
from google.datalab import Context

context = Context.default()
print('The current project is %s' % context.project_id)

Datalab automatically detects the current project and obtains the OAuth token used to invoke APIs. In particular, it uses the OAuth token representing the project's service account, rather than an individual user's credentials.
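
As a rough illustration of what obtaining such a token can involve, the sketch below builds the request that a process on a Compute Engine VM could send to the metadata server to get an access token for the VM's default service account. Whether Datalab does exactly this internally is an assumption, and the helper names `build_token_request` and `fetch_access_token` are hypothetical. The fetch itself only works from inside a Google Cloud VM.

```python
import json
import urllib.request

# Metadata server endpoint for the default service account's access token.
# It is only reachable from inside a Google Cloud VM.
TOKEN_URL = ('http://metadata.google.internal/computeMetadata/v1/'
             'instance/service-accounts/default/token')


def build_token_request(url=TOKEN_URL):
    """Build the HTTP request; the Metadata-Flavor header is required."""
    return urllib.request.Request(url, headers={'Metadata-Flavor': 'Google'})


def fetch_access_token(timeout=5):
    """Return the access token string (hypothetical helper, needs a VM)."""
    request = build_token_request()
    with urllib.request.urlopen(request, timeout=timeout) as response:
        payload = json.loads(response.read().decode('utf-8'))
    return payload['access_token']
```

Calling `fetch_access_token()` outside Google Cloud fails with a network error, which is why the request construction is kept in a separate function.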

Service Accounts

The use of the project's service account, rather than a user's own credentials, is an important detail.

The code you author and the data you access are stored in notebooks that are shared across the project. As such, the authorization used to execute that code and retrieve that data is based on the project.

Also, any applications or data pipelines you produce within Datalab are deployed using the project's service account, not individual accounts; this use of the project's service account is generally considered good practice.

Consequently, to access resources contained within another project, you will need to authorize the service account of your Datalab project within that other project, rather than authorize a particular user.
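
One way to grant that access is with `gcloud projects add-iam-policy-binding`, run by someone who can administer IAM in the other project. The sketch below only assembles the command line. The helper name `build_grant_command`, the project ID, the service account email, and the choice of the `roles/bigquery.dataViewer` role are all placeholders for illustration, not values taken from this notebook.

```python
def build_grant_command(other_project, service_account_email, role):
    """Return the gcloud argv that grants `role` on `other_project`."""
    return [
        'gcloud', 'projects', 'add-iam-policy-binding', other_project,
        '--member=serviceAccount:%s' % service_account_email,
        '--role=%s' % role,
    ]


# Placeholder values for illustration only; substitute your own.
command = build_grant_command(
    'other-project-id',
    'datalab-sa@datalab-project-id.iam.gserviceaccount.com',
    'roles/bigquery.dataViewer',
)
print(' '.join(command))
```

The member is prefixed with `serviceAccount:` rather than `user:`, which reflects the point above: the grant goes to the Datalab project's service account, not to a person.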


In [ ]:
!gcloud auth list

The command above lists the accounts that gcloud has credentials for and marks the active one.

This service account can also be seen by clicking the account icon in the top-right corner of the Cloud Datalab UI.
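
To read the active account from Python instead of from the printed list, a minimal sketch is to call gcloud with a filter and a value format. The helper names `active_account_command` and `get_active_account` are hypothetical, and the sketch assumes `gcloud` is on the PATH, as it is when the `!gcloud` cell above works.

```python
import subprocess


def active_account_command():
    """Return the argv that prints only the active account."""
    return ['gcloud', 'auth', 'list',
            '--filter=status:ACTIVE', '--format=value(account)']


def get_active_account():
    """Run gcloud and return the active account, or None if there is none."""
    result = subprocess.run(active_account_command(), check=True,
                            stdout=subprocess.PIPE, universal_newlines=True)
    accounts = result.stdout.split()
    return accounts[0] if accounts else None
```

Within Datalab, `get_active_account()` would be expected to return the service account described above rather than a user's email address.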