Using ZigZag

ZigZag is a (very) small library I wrote for calculating the peaks and valleys of a sequence (e.g. time series data). It can also calculate the maximum drawdown, a useful metric for trading analysis. The repository is on GitHub at https://github.com/jbn/ZigZag. Prior to version 0.1.4 it optionally used numba; starting with version 0.1.4, I switched to Cython.

This notebook demonstrates how to use ZigZag, and draws attention to a few caveats.

Installation

pip install zigzag

Basic Usage



In [1]:

    
%matplotlib inline

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from zigzag import (peak_valley_pivots, pivots_to_modes,
                    compute_segment_returns, max_drawdown)



In [2]:

    
# This is not necessary to use zigzag. It's only here so that
# this example is reproducible.
np.random.seed(1997)



In [3]:

    
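# Simulate a price-like series: 100 steps of normally distributed returns (1% std).
X = np.cumprod(1 + np.random.randn(100) * 0.01)
# Mark pivots using a +3% up threshold and a -3% down threshold.
pivots = peak_valley_pivots(X, 0.03, -0.03)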



In [4]:

    
def plot_pivots(X, pivots):
    """Plot the raw series, the zigzag line, and the peak/valley markers."""
    idx = np.arange(len(X))
    plt.xlim(0, len(X))
    plt.ylim(X.min()*0.99, X.max()*1.01)
    plt.plot(idx, X, 'k:', alpha=0.5)                           # raw series (dotted)
    plt.plot(idx[pivots != 0], X[pivots != 0], 'k-')            # zigzag through the pivots
    plt.scatter(idx[pivots == 1], X[pivots == 1], color='g')    # peaks
    plt.scatter(idx[pivots == -1], X[pivots == -1], color='r')  # valleys

The following plot illustrates how the sequence was annotated.



In [5]:

    
plot_pivots(X, pivots)
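
To make the annotation concrete: in the pivots array, 1 marks a peak, -1 marks a valley, and 0 marks everything else (this is what plot_pivots relies on). The following is a minimal pure-Python sketch of threshold-based pivot labeling, not the library's Cython implementation. The name threshold_pivots_sketch is hypothetical, and the sketch leaves a trailing unconfirmed candidate unlabeled, whereas the library may treat the first and last points differently.

```python
def threshold_pivots_sketch(values, up_thresh, down_thresh):
    """Label each point 1 (peak), -1 (valley), or 0 (neither).

    A valley is confirmed once the series rises at least up_thresh above
    it; a peak is confirmed once the series falls at least down_thresh
    (a negative number) below it. Illustrative only.
    """
    pivots = [0] * len(values)
    trend = 0      # 0 = undecided, 1 = rising, -1 = falling
    hi = lo = 0    # indices of the current candidate peak and valley
    for i in range(1, len(values)):
        x = values[i]
        if trend == 0:
            # No direction yet: track both extremes until one move qualifies.
            if x > values[hi]:
                hi = i
            if x < values[lo]:
                lo = i
            if x / values[lo] - 1.0 >= up_thresh:
                pivots[lo] = -1
                trend, hi = 1, i
            elif x / values[hi] - 1.0 <= down_thresh:
                pivots[hi] = 1
                trend, lo = -1, i
        elif trend == 1:
            if x > values[hi]:
                hi = i                      # higher high: move the candidate peak
            elif x / values[hi] - 1.0 <= down_thresh:
                pivots[hi] = 1              # reversal confirmed: candidate was a peak
                trend, lo = -1, i
        else:
            if x < values[lo]:
                lo = i                      # lower low: move the candidate valley
            elif x / values[lo] - 1.0 >= up_thresh:
                pivots[lo] = -1             # reversal confirmed: candidate was a valley
                trend, hi = 1, i
    return pivots


labels = threshold_pivots_sketch([100, 104, 102, 99, 101, 106, 103], 0.03, -0.03)
print(labels)
```

Note that a pivot is only known in hindsight: the peak at 104 is labeled only after the series has dropped 3% below it.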

The following shows how you can use pivots_to_modes to inspect the segments.



In [6]:

    
modes = pivots_to_modes(pivots)
pd.Series(X).pct_change().groupby(modes).describe().unstack()
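
As a rough illustration of what a mode labeling looks like, the sketch below assigns every point the direction of the segment it belongs to: 1 while the series is rising toward a peak, -1 while it is falling toward a valley. The name modes_sketch is hypothetical, and the convention that a pivot belongs to the segment ending at it is an assumption of this sketch, not something confirmed from the library source.

```python
def modes_sketch(pivots):
    """Expand sparse pivot labels into a per-point direction label.

    Assumed convention: a pivot is counted as the last point of the
    segment that ends there, so a peak (1) closes a rising segment.
    Points before the first pivot get 0 (direction unknown).
    """
    modes = []
    mode = 0
    for p in pivots:
        if p != 0:
            modes.append(p)   # the pivot closes the current segment
            mode = -p         # afterwards the series heads the other way
        else:
            modes.append(mode)
    return modes


print(modes_sketch([-1, 0, 0, 1, 0, -1, 0, 1]))
```

Grouping percentage changes by such labels, as in the cell above, separates the behavior of rising segments from that of falling segments.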

Calculate the returns between consecutive pivots (valley to peak and peak to valley) for all of the segments.



In [7]:

    
compute_segment_returns(X, pivots)

Out[7]:

array([ 0.09370263, -0.05981991,  0.07204542, -0.03419711,  0.04289563,
       -0.04197655,  0.03001853, -0.05506552,  0.07707074, -0.016124  ])
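
The alternating signs in this output follow from the definition: each value is the relative change from one pivot to the next. A minimal pure-Python sketch of that calculation, with the hypothetical name segment_returns_sketch and made-up input values, might look like this:

```python
def segment_returns_sketch(values, pivots):
    """Return the relative change between each pair of consecutive pivots."""
    # Keep only the values sitting at a peak (1) or a valley (-1).
    points = [x for x, p in zip(values, pivots) if p != 0]
    # Return of each segment: end / start - 1.
    return [end / start - 1.0 for start, end in zip(points, points[1:])]


# Hypothetical data: valley at 100, peak at 110, valley at 99, peak at 120.
returns = segment_returns_sketch([100, 110, 105, 99, 120], [-1, 1, 0, -1, 1])
print(returns)
```

With n pivots there are n - 1 segments, so the ten values above correspond to eleven labeled pivots.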

Finally, compute the oft-quoted (in financial literature) max_drawdown.



In [8]:

    
max_drawdown(X)

Out[8]:

0.06755575755355037
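
The value above reads as a fraction, i.e. a drawdown of roughly 6.8%. For reference, the usual definition of maximum drawdown is the largest relative drop from a running peak to any later point. The sketch below, under the hypothetical name max_drawdown_sketch, implements that definition in plain Python; it illustrates the metric and is not the library's code.

```python
def max_drawdown_sketch(values):
    """Largest fractional decline from a running peak to a later value."""
    peak = values[0]
    worst = 0.0
    for x in values:
        if x > peak:
            peak = x                              # new running peak
        worst = max(worst, (peak - x) / peak)     # drop relative to that peak
    return worst


# Hypothetical series: peaks at 120, later falls to 80, then recovers.
print(max_drawdown_sketch([100, 120, 90, 110, 80, 130]))
```

A series that only ever rises has a maximum drawdown of 0.0 under this definition.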

Pandas Compatibility

The peak_valley_pivots function works on a pandas Series, assuming the index is either a DatetimeIndex or is [0, n). Pandas is great.



In [9]:

    
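from pandas_datareader import get_data_yahoo

# Download GOOG prices and keep the adjusted close.
X = get_data_yahoo('GOOG')['Adj Close']
# Find pivots on the underlying NumPy array, using 20% thresholds.
pivots = peak_valley_pivots(X.values, 0.2, -0.2)
# Keep only the pivot points, with their original dates.
ts_pivots = X[pivots != 0]
X.plot()
ts_pivots.plot(style='g-o');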

Out[6] (summary statistics of the percentage changes, grouped by mode):

     count      mean       std       min       25%       50%       75%       max
-1    43.0 -0.004875  0.009995 -0.025602 -0.011249 -0.005225  0.000075  0.017768
 1    56.0  0.005506  0.009663 -0.018131  0.000144  0.004643  0.010315  0.028133