In [ ]:
import pandas as pd
import holoviews as hv
from holoviews import opts
from bokeh.sampledata import stocks
from holoviews.operation.timeseries import rolling, rolling_outlier_std
from holoviews.streams import Stream
hv.extension('bokeh')
opts.defaults(opts.Curve(width=600, framewise=True))
In the previous guides we discovered how to load and declare dynamic, live data and how to transform elements using operations. In this guide we will discover how to combine dynamic data with operations to declare lazy and declarative data processing pipelines, which can be used for interactive exploration but can also drive complex dashboards or even Bokeh apps.
We will begin by declaring a function which loads some data. In this case we will just load some stock data from the Bokeh sample data, but you could imagine querying this data using a REST interface or some other API, loading a large collection of data from disk, or generating the data from a simulation or data processing job.
In [ ]:
def load_symbol(symbol, **kwargs):
    # Look up the sample data for this ticker symbol in bokeh.sampledata.stocks
    df = pd.DataFrame(getattr(stocks, symbol))
    # Convert the date column so the x-axis is treated as datetimes
    df['date'] = df.date.astype('datetime64[ns]')
    return hv.Curve(df, ('date', 'Date'), ('adj_close', 'Adjusted Close'))

stock_symbols = ['AAPL', 'FB', 'GOOG', 'IBM', 'MSFT']
dmap = hv.DynamicMap(load_symbol, kdims='Symbol').redim.values(Symbol=stock_symbols)
We begin by displaying our DynamicMap to see what we are dealing with. Recall that a DynamicMap is only evaluated when you request the key, so the load_symbol function is only executed when first displaying the DynamicMap and whenever we change the widget dropdown:
In [ ]:
dmap
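The lazy behaviour described above can be illustrated with a small stand-alone sketch. LazyMap and fake_load are hypothetical names invented for this illustration and are not part of HoloViews; the cache in particular is an assumption of the sketch rather than a statement about how DynamicMap works internally.

```python
# Toy stand-in for a lazily evaluated, keyed container. It only mirrors the
# behaviour described in the text; it is not the HoloViews implementation.
class LazyMap:
    def __init__(self, callback):
        self.callback = callback
        self.cache = {}
        self.calls = []  # every key the callback was actually run for

    def __getitem__(self, key):
        if key not in self.cache:
            self.calls.append(key)
            self.cache[key] = self.callback(key)
        return self.cache[key]


def fake_load(symbol):
    # Hypothetical loader standing in for load_symbol
    return f"curve for {symbol}"


lazy = LazyMap(fake_load)
calls_before_access = list(lazy.calls)  # declaring the map loads nothing
first = lazy["AAPL"]                    # first request runs the callback
again = lazy["AAPL"]                    # repeated request reuses the result
other = lazy["IBM"]                     # a new key runs the callback again
```

Declaring the container costs nothing; work happens only when a key is requested, which is what selecting a symbol in the dropdown amounts to.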
It is very common to want to process some data. For this purpose HoloViews provides so-called Operations, which are described in detail in the Transforming Elements guide. Operations are simply parameterized functions, which take HoloViews objects as input, transform them in some way and then return the output.
In combination with Dimensioned Containers such as HoloMap and GridSpace they are a powerful way to explore how the parameters of your transform affect the data. We will start with a simple example. HoloViews provides a rolling function which smooths timeseries data with a rolling window. We will apply this operation with a rolling_window of 30, i.e. roughly a month of our daily timeseries data:
In [ ]:
smoothed = rolling(dmap, rolling_window=30)
smoothed
As you can see, the rolling operation applies directly to our DynamicMap, smoothing each Curve before it is displayed. Applying an operation to a DynamicMap keeps the data as a DynamicMap; this means the operation is also applied lazily whenever we display or select a different symbol in the dropdown widget.
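To make concrete what smoothing with a rolling window computes, here is a minimal pure-Python sketch. It assumes a plain trailing-window mean; the helper name trailing_mean and the sample prices are made up for illustration, and the HoloViews rolling operation may differ in details such as window centering and how it treats the first few points.

```python
def trailing_mean(values, window):
    """Mean of the last `window` values at each position (shorter at the start)."""
    out = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        out.append(sum(chunk) / len(chunk))
    return out


# Made-up prices, not taken from the stock sample data
prices = [10.0, 12.0, 11.0, 15.0, 14.0, 13.0]
smoothed_prices = trailing_mean(prices, window=3)
```

A larger window averages over more points and therefore yields a smoother curve that reacts more slowly to changes.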
In the previous section we briefly mentioned that, in addition to regular widgets, DynamicMap also supports streams, which allow us to define custom events our DynamicMap should subscribe to. To learn more about streams see the Responding to Events guide. Here we will declare a stream that controls the rolling window:
In [ ]:
rolling_stream = Stream.define('rolling', rolling_window=5)
stream = rolling_stream()
Now we can define a function that both loads the symbol and applies the rolling operation, passing our rolling_window parameter to the operation:
In [ ]:
def rolled_data(symbol, rolling_window, **kwargs):
    # Load the raw curve, then smooth it with the window supplied by the stream
    curve = load_symbol(symbol)
    return rolling(curve, rolling_window=rolling_window)

rolled_dmap = hv.DynamicMap(rolled_data, kdims='Symbol',
                            streams=[stream]).redim.values(Symbol=stock_symbols)
rolled_dmap
Since we have a handle on the Stream we can now send events to it and watch the plot above update. Let's start by setting rolling_window=50.
In [ ]:
stream.event(rolling_window=50)
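The event mechanism can be pictured with a small stand-alone sketch. ToyStream is a hypothetical class written only for this illustration: it assumes a stream is a bag of named parameters plus a list of callbacks to notify, which captures the idea used here but is not the HoloViews Stream implementation.

```python
class ToyStream:
    """Holds named parameters and notifies subscribers when they change."""

    def __init__(self, **params):
        self.contents = dict(params)
        self.subscribers = []

    def add_subscriber(self, callback):
        self.subscribers.append(callback)

    def event(self, **updates):
        # Store the new values, then let every subscriber react to them
        self.contents.update(updates)
        for callback in self.subscribers:
            callback(**self.contents)


redraws = []  # stands in for a plot being redrawn with a new window size
toy_stream = ToyStream(rolling_window=5)
toy_stream.add_subscriber(lambda rolling_window: redraws.append(rolling_window))
toy_stream.event(rolling_window=50)
```

In this picture the DynamicMap plays the role of the subscriber: each event hands it the current parameter values, and it re-evaluates its callback with them.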
Instead of manually defining a function we can also do something much simpler: we can just apply the rolling operation to the original DynamicMap we defined and pass our rolling_stream to the operation. To make things a bit more interesting we will also apply the rolling_outlier_std function, which computes outliers within the rolling_window. We supply our stream to both:
In [ ]:
stream = rolling_stream()
smoothed = rolling(dmap, streams=[stream])
outliers = rolling_outlier_std(dmap, streams=[stream])
smoothed * outliers.opts(color='red', marker='triangle')
Since the rolling_stream instance we created is bound to both operations, triggering an event on the stream will trigger both the Curve and the Scatter of outliers to be updated:
In [ ]:
stream.event(rolling_window=50)
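As a rough illustration of what detecting outliers within a rolling window means, the sketch below flags points that lie more than a chosen number of standard deviations from the mean of their trailing window. The function name rolling_outliers, the sigma threshold of 2.0, the trailing window, and the sample data are all assumptions of this sketch; consult the rolling_outlier_std documentation for the operation's actual parameters and defaults.

```python
from statistics import mean, stdev


def rolling_outliers(values, window, sigma=2.0):
    """Indices whose value deviates from its trailing-window mean by more
    than `sigma` sample standard deviations."""
    flagged = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        if len(chunk) < 2:
            continue  # a sample standard deviation needs at least two points
        if abs(values[i] - mean(chunk)) > sigma * stdev(chunk):
            flagged.append(i)
    return flagged


# Made-up series: flat at 10 with a single spike at index 9
series = [10.0] * 9 + [30.0] + [10.0] * 3
outlier_indices = rolling_outliers(series, window=10)
```

Because the threshold depends on the window size, changing rolling_window changes both the smoothed curve and the set of flagged points, which is why it is convenient to drive both operations from one stream.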
We can chain operations like this indefinitely and attach streams to each stage. By chaining we can watch our visualization update whenever we change a stream value anywhere in the pipeline and HoloViews will be smart about which parts of the pipeline are recomputed, which allows us to build complex visualizations very quickly.
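The idea of recomputing only the affected parts of a chain can be sketched with a toy memoizing pipeline. The Stage class below is hypothetical and makes no claim about how HoloViews decides what to recompute; it only shows one simple way a stage can skip work when neither its input nor its parameters have changed.

```python
class Stage:
    """One pipeline step that reruns only if its input or parameters changed."""

    def __init__(self, func, source=None, **params):
        self.func = func
        self.source = source
        self.params = params
        self.runs = 0      # how many times func was actually executed
        self.version = 0   # bumped whenever a new value is produced
        self._seen = None  # (upstream version, parameters) of the last run
        self._value = None

    def set(self, **params):
        self.params.update(params)

    def get(self):
        upstream = self.source.get() if self.source else None
        state = (self.source.version if self.source else None,
                 tuple(sorted(self.params.items())))
        if state != self._seen:
            args = () if self.source is None else (upstream,)
            self._value = self.func(*args, **self.params)
            self._seen = state
            self.runs += 1
            self.version += 1
        return self._value


# Hypothetical two-stage chain: an "expensive" load followed by a cheap transform
load = Stage(lambda symbol: [1.0, 2.0, 3.0, 4.0], symbol="AAPL")
tail = Stage(lambda data, window: data[-window:], source=load, window=2)

first_result = tail.get()   # both stages run
tail.set(window=3)          # change a parameter of the downstream stage only
second_result = tail.get()  # load is reused, only tail reruns
```

After the parameter change the load stage has still run only once, while the downstream stage has run twice.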
In later guides we will discover how to tie custom streams to custom widgets, letting us easily control the stream values and making it trivial to define complex dashboards. paramNB is only one widget framework we could use: we could also choose paramBokeh to make use of Bokeh widgets and deploy the dashboard on Bokeh server, or we could manually link ipywidgets to our streams. For more information on how to deploy Bokeh apps from HoloViews and build dashboards see the Deploying Bokeh Apps and Dashboards guides.