Cloud Datalab provides an environment for working with your data, including data managed by the Stackdriver Monitoring API. This notebook introduces some of the APIs that Cloud Datalab provides for working with monitoring data, and lets you try them out on your own project.

The main focus of this API is to let you query time series data for your monitored resources. The time series, and its metadata, are returned as pandas DataFrame objects. pandas is a widely used library for data manipulation, and is well suited to working with time series data.
Note: This notebook shows you how to use this API with your own project. The charts included here are from a sample project that you will not have access to.

The Monitoring functionality is contained within the google.datalab.stackdriver.monitoring module.

For all cells to run without errors, the default project must be set: if it is not already set via the environment variable $PROJECT_ID, you must set it using set_datalab_project_id, or using the %datalab config magic.
In [26]:
# set_datalab_project_id('my-project-id')
First, list the supported options on the Stackdriver magic %sd:
In [27]:
%sd -h
Let's see what we can do with the monitoring command:
In [28]:
%sd monitoring -h
Here we use the IPython %sd line magic to list the CPU metrics. The Labels column shows that instance_name is a metric label.
In [29]:
%sd monitoring metrics list --type compute*/cpu/*
Out[29]:
In [30]:
%sd monitoring resource_types list --type gce*
Out[30]:
The Query class allows users to query and access the monitoring time series data.

Many useful methods of the Query class are actually defined by its base class, which is provided by the google-cloud-python library. These methods include:

- select_metrics: filters the query based on metric labels.
- select_resources: filters the query based on resource type and labels.
- align: aligns the query along the specified time intervals.
- reduce: applies aggregation to the query.
- as_dataframe: returns the time series data as a pandas DataFrame object.

Reference documentation for the Query base class is available here. You can also get help from inside the notebook by calling the help function on any class, object or method.
In [31]:
from google.datalab.stackdriver import monitoring as gcm
help(gcm.Query.select_interval)
During initialization, the metric type and the time interval need to be specified. For interactive use, the metric type has a default value. The simplest way to specify a time interval that ends now is to use the arguments days, hours, and minutes.

In the cell below, we initialize the query to load the time series for CPU Utilization for the last two hours.
In [32]:
query_cpu = gcm.Query('compute.googleapis.com/instance/cpu/utilization', hours=2)
The metadata() method returns a QueryMetadata object, which contains information about the time series matching the query. This helps you understand the structure of the time series data, and makes it easier to modify the query.
In [33]:
metadata_cpu = query_cpu.metadata().as_dataframe()
metadata_cpu.head(5)
Out[33]:
In [34]:
import sys

if metadata_cpu.empty:
    sys.stderr.write('This project has no GCE instances. The remaining notebook '
                     'will raise errors!')
else:
    instance_names = sorted(list(metadata_cpu['metric.labels']['instance_name']))
    print('First 5 instance names: %s' % ([str(name) for name in instance_names[:5]],))
In [35]:
query_cpu_single_instance = query_cpu.select_metrics(instance_name=instance_names[0])
# Get the query results as a pandas DataFrame and look at the last 5 rows.
data_single_instance = query_cpu_single_instance.as_dataframe(label='instance_name')
data_single_instance.tail(5)
Out[35]:
We can plot the time series data by calling the plot method of the dataframe. The pandas library uses matplotlib for plotting, so you can learn more about it here.
In [36]:
# N.B. A useful trick is to assign the return value of plot to _
# so that you don't get text printed before the plot itself.
_ = data_single_instance.plot()
You can aggregate or summarize time series data along various dimensions.

Not all alignment and reduction options are applicable to all time series; it depends on their metric type and value type. Alignment and reduction may change the metric type or value type of a time series.

For multiple time series, aligning the data is recommended. Aligned data is more compact to read from the Monitoring API, and lends itself better to visualizations.

The alignment period can be specified using the arguments hours, minutes, and seconds. In the cell below, we filter the query to instances sharing a common name prefix, and align it to five-minute intervals using the 'ALIGN_MEAN' method.
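The effect of alignment can be previewed with plain pandas: resampling raw samples into fixed five-minute buckets and averaging each bucket is analogous to what 'ALIGN_MEAN' with minutes=5 does on the server side. This is an illustrative sketch with synthetic data, not a call to the Monitoring API:

```python
import numpy as np
import pandas as pd

# Synthetic "raw" CPU utilization samples, one per minute for 20 minutes.
index = pd.date_range('2020-01-01 00:00', periods=20, freq='1min')
raw = pd.Series(np.linspace(0.0, 0.19, 20), index=index)

# Bucket the samples into 5-minute windows and average each window,
# analogous to aligning with ALIGN_MEAN and minutes=5.
aligned = raw.resample('5min').mean()
print(aligned)
```

The aligned series has one value per five-minute bucket, so twenty one-minute samples collapse to four points.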
In [37]:
# Filter the query by a common instance name prefix.
common_prefix = instance_names[0].split('-')[0]
query_cpu_aligned = query_cpu.select_metrics(instance_name_prefix=common_prefix)
# Align the query to have data every 5 minutes.
query_cpu_aligned = query_cpu_aligned.align(gcm.Aligner.ALIGN_MEAN, minutes=5)
data_multiple_instances = query_cpu_aligned.as_dataframe(label='instance_name')
# Display the data as a linechart, and move the legend to the right of it.
_ = data_multiple_instances.plot().legend(loc="upper left", bbox_to_anchor=(1,1))
In [38]:
query_cpu_reduced = query_cpu_aligned.reduce(gcm.Reducer.REDUCE_MEAN, 'resource.zone')
data_per_zone = query_cpu_reduced.as_dataframe('zone')
data_per_zone.tail(5)
Out[38]:
In [39]:
import matplotlib.pyplot
import seaborn

# Set the size of the heatmap to have a better aspect ratio.
div_ratio = 1 if len(data_multiple_instances.columns) == 1 else 2.0
width, height = (size/div_ratio for size in data_multiple_instances.shape)
matplotlib.pyplot.figure(figsize=(width, height))

# Display the data as a heatmap. The timestamps are converted to strings
# for better readability.
_ = seaborn.heatmap(data_multiple_instances.T,
                    xticklabels=data_multiple_instances.index.map(str),
                    cmap='YlGnBu')
In [40]:
data_multi_level = query_cpu_aligned.as_dataframe()
data_multi_level.tail(5)
Out[40]:
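Calling as_dataframe() with no label argument keeps all resource and metric labels as multi-level column headers. The same structure can be mimicked in plain pandas; the zone and instance names below are hypothetical stand-ins:

```python
import pandas as pd

# Columns keyed by (zone, instance_name), mimicking the multi-level
# headers returned when no single label is selected.
columns = pd.MultiIndex.from_tuples(
    [('us-central1-a', 'web-1'), ('us-central1-a', 'web-2'),
     ('europe-west1-b', 'web-3')],
    names=['zone', 'instance_name'])
data = pd.DataFrame([[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]], columns=columns)
print(data)

# Individual levels can be selected like a nested dictionary.
print(data['us-central1-a'].shape)
```

Selecting the top level ('us-central1-a' here) returns a sub-frame with only the instances in that zone.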
In [41]:
print('Finding pattern "%s" in the dataframe headers' % (common_prefix,))
In [42]:
data_multi_level.filter(regex=common_prefix).tail(5)
Out[42]:
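filter(regex=...) keeps only the columns whose labels match the pattern, which is how the cell above narrows the headers to the common prefix. A minimal standalone sketch with hypothetical instance names:

```python
import pandas as pd

# CPU utilization per instance; 'web-' plays the role of the common prefix.
data = pd.DataFrame({'web-1': [0.1, 0.2],
                     'web-2': [0.3, 0.4],
                     'db-1':  [0.5, 0.6]})

# Keep only the columns whose name matches the pattern.
web_only = data.filter(regex='^web')
print(web_only.columns.tolist())
```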
In [43]:
data_multi_level.groupby(level='zone', axis=1).mean().tail(5)
Out[43]:
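Grouping by one level of the column header and averaging collapses the per-instance columns within each zone into a single column, as the cell above does for the CPU data. A standalone sketch with hypothetical zones (using a transpose-then-groupby form, since grouping columns with axis=1 is deprecated in recent pandas):

```python
import pandas as pd

# Two rows of per-instance CPU data with (zone, instance_name) headers.
columns = pd.MultiIndex.from_tuples(
    [('us-central1-a', 'web-1'), ('us-central1-a', 'web-2'),
     ('europe-west1-b', 'web-3')],
    names=['zone', 'instance_name'])
data = pd.DataFrame([[0.2, 0.4, 0.6], [0.1, 0.3, 0.5]], columns=columns)

# Average the per-instance columns within each zone.
per_zone = data.T.groupby(level='zone').mean().T
print(per_zone)
```

The result has one column per zone; each value is the mean across that zone's instances at that timestamp.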