Dynamical systems are an invaluable concept in many areas of science, such as control theory and physics. Over the past decade, they have also found numerous applications in computer vision. In particular, with the advent of big video data (e.g., YouTube, Vimeo), people have looked into ways of describing both the dynamics and the appearance of such visual processes with generative models. This is appealing, since it allows us to synthesize observations based on the estimated parameters of the dynamical system.
Here are some important references to get started with (kernel) dynamic textures:
Dynamic texture models have recently been used in a variety of applications; here are just a few:
pydstk supports several ways to load video material. All routines are implemented in the dsutil package. Let's import this first ...
In [39]:
import dsutil.dsutil as dsutil
First, we will try to load a video (traffic sequence) from an ASCII file that contains all pixel values for each frame:
In [41]:
data, sz = dsutil.loadDataFromASCIIFile("tests/data/data1.txt")
In [42]:
data.shape
Out[42]:
Each of the 48 frames of the (48 $\times$ 48) video was vectorized into a 2304-dimensional column vector.
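To make this storage layout concrete, here is a small self-contained sketch (using a toy 4 $\times$ 4 "video" rather than the actual traffic data) showing how frames are vectorized into columns and recovered by reshaping:

```python
import numpy as np

# Toy example: a "video" of 3 frames, each 4x4 pixels.
frames = [np.arange(16).reshape(4, 4) + t for t in range(3)]

# Stack each vectorized frame as one column, mirroring the layout of `data`.
data_toy = np.column_stack([f.reshape(-1) for f in frames])
print(data_toy.shape)  # (16, 3): 16 pixels per frame, 3 frames

# Recover frame 0 by reshaping its column back to 4x4.
F0 = data_toy[:, 0].reshape((4, 4))
print(np.array_equal(F0, frames[0]))  # True
```

The same reshape trick is what we use below to visualize a single 48 $\times$ 48 frame from `data`.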
In [4]:
sz
Out[4]:
To see what the video looks like, we take a look at the first frame of the video:
In [5]:
F = data[:,0].reshape((48,48))
In [6]:
imshow(F.T, cmap=cm.gray)
Out[6]:
The (discrete-time) linear dynamical system (aka dynamic texture in a vision context) can be written as:
$x_{t+1} = \boldsymbol{A}\boldsymbol{x}_{t} + \boldsymbol{w}_t$
$y_{t} = \boldsymbol{C}\boldsymbol{x}_{t} + \boldsymbol{v}_t$
Both the state and observation evolution are linear, and the noise processes (i.e., $w_t\sim \mathcal{N}(0,\boldsymbol{R})$ and $v_t \sim \mathcal{N}(0,\boldsymbol{Q})$) are zero-mean Gaussian (with different covariances). The task of system identification is to estimate the model parameters $(\boldsymbol{A},\boldsymbol{C},\boldsymbol{R},\boldsymbol{Q},x_0)$ from a set of observations $[\boldsymbol{y}_0-\bar{y},\ldots,\boldsymbol{y}_N-\bar{y}],~\boldsymbol{y}_i \in \mathbb{R}^n$. In the case of standard dynamic textures, the observations $\boldsymbol{y}_i$ are our video frames (and $\bar{y}$ is the mean, taken over all frames of the video).
In our example, $\boldsymbol{y}_i \in \mathbb{R}^{2304}$ (each vectorized $48 \times 48$ frame) and $N=48$. Next, we will create a Linear Dynamical System (LDS) with 5 states:
In [43]:
from dscore.system import LinearDS
In [46]:
lds = LinearDS(5, False, False)
We can now run system identification to obtain the dynamic texture parameters. System identification is done via the suboptimal (SVD-based) approach that was originally proposed in [Soatto01a] (see References).
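This is not the pydstk implementation, but the core idea of the SVD-based identification can be sketched in a few lines of NumPy: take the SVD of the mean-subtracted observations, use the leading left singular vectors as $\boldsymbol{C}$, the scaled right singular vectors as the state trajectory, and solve a least-squares problem for $\boldsymbol{A}$. The data below is synthetic, standing in for the traffic sequence:

```python
import numpy as np

def suboptimal_sysid(Y, k):
    """SVD-based (suboptimal) identification of an LDS with k states.

    Y: (n, N) matrix of vectorized observations (one column per frame).
    Returns estimates of A (k, k), C (n, k), the state trajectory X (k, N),
    and the observation mean.
    """
    Ymean = Y.mean(axis=1, keepdims=True)
    U, s, Vt = np.linalg.svd(Y - Ymean, full_matrices=False)
    C = U[:, :k]                      # observation matrix (orthonormal columns)
    X = np.diag(s[:k]) @ Vt[:k, :]    # state trajectory
    # A maps x_t to x_{t+1} in the least-squares sense.
    A = X[:, 1:] @ np.linalg.pinv(X[:, :-1])
    return A, C, X, Ymean

# Synthetic demo data (a stand-in for the real video frames):
rng = np.random.default_rng(0)
Y = rng.standard_normal((100, 30))
A, C, X, mu = suboptimal_sysid(Y, 5)
print(A.shape, C.shape)  # (5, 5) (100, 5)
```

Note that this truncated-SVD factorization is only an approximation to the maximum-likelihood solution, which is why the approach is called suboptimal.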
In [9]:
lds.check()
Out[9]:
In [47]:
lds.suboptimalSysID(data)
In [48]:
lds.check()
Out[48]:
As we have seen, check() returns False before we actually estimate the system parameters. This can be used, for instance, to test whether a dynamic texture model is ready for use.
In [12]:
lds._Ahat.shape
Out[12]:
Let's take a look at the spectral radius of the state-matrix (which should be $<1$ if the system is stable):
In [57]:
import numpy as np
specRad = np.max(np.abs(np.linalg.eig(lds._Ahat)[0]))
print specRad
Note: Although THIS system is stable, system identification using SVD does NOT enforce that!
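If an estimated state-transition matrix does turn out to be unstable, one crude (hypothetical) fix is to simply rescale it so its spectral radius drops below 1. This is not a pydstk feature, just a minimal sketch; more principled constrained least-squares approaches exist in the literature:

```python
import numpy as np

def stabilize(A, margin=0.99):
    """Scale A so its spectral radius is at most `margin`.

    A crude fix: divides A by rho/margin when the spectral radius rho
    exceeds margin. This distorts the dynamics, but keeps simulated
    state trajectories from blowing up.
    """
    rho = np.max(np.abs(np.linalg.eigvals(A)))
    if rho > margin:
        A = A * (margin / rho)
    return A

A_unstable = np.array([[1.2, 0.0], [0.0, 0.5]])
A_stable = stabilize(A_unstable)
print(np.max(np.abs(np.linalg.eigvals(A_stable))))  # 0.99
```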
In [56]:
lds._Chat.shape
Out[56]:
So, the state-transition matrix A is a (5 $\times$ 5) matrix and the observation matrix C is a (2304 $\times$ 5) matrix.
We will use the Martin Distance here as an example to measure similarity between two dynamical systems. There are many other recent proposals on suitable similarity measures, though (e.g., Binet-Cauchy kernels, etc.). First, we load a second model to compare to (another traffic sequence):
In [18]:
refData, refSz = dsutil.loadDataFromASCIIFile("tests/data/data2.txt")
In [19]:
refLDS = LinearDS(5, False, False)
In [20]:
refLDS.suboptimalSysID(refData)
Distance measurements in pydstk are implemented in the dscore.dsdist module, from which we'll import ldsMartinDistance:
In [21]:
from dscore.dsdist import ldsMartinDistance
In [64]:
D = ldsMartinDistance(lds, refLDS, 20)
Computation of the Martin distance would, in principle, require building the infinite observability matrix of the system, i.e., $[\boldsymbol{C}~\boldsymbol{C}\boldsymbol{A}~\boldsymbol{C}\boldsymbol{A}^2\cdots]^T$. The last argument to ldsMartinDistance (here, 20) specifies the maximum power to which $\boldsymbol{A}$ is raised.
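To illustrate the idea (this is a sketch of the general technique, not the pydstk code), one can truncate the observability matrices at a finite power, orthonormalize them, and turn the principal angles between their column spaces into a distance via $d^2 = -\sum_i \log \cos^2 \theta_i$:

```python
import numpy as np

def martin_distance(A1, C1, A2, C2, m=20):
    """Approximate Martin distance via truncated observability matrices.

    Stacks C A^0, ..., C A^m for each system, orthonormalizes the
    results, and converts the principal angles between their column
    spaces into a distance: d^2 = -sum(log(cos^2 theta_i)).
    """
    def obs_matrix(A, C, m):
        blocks, P = [], np.eye(A.shape[0])
        for _ in range(m + 1):
            blocks.append(C @ P)
            P = P @ A
        return np.vstack(blocks)

    Q1, _ = np.linalg.qr(obs_matrix(A1, C1, m))
    Q2, _ = np.linalg.qr(obs_matrix(A2, C2, m))
    # Singular values of Q1^T Q2 are the cosines of the principal angles.
    cosines = np.linalg.svd(Q1.T @ Q2, compute_uv=False)
    cosines = np.clip(cosines, 1e-12, 1.0)  # guard against round-off
    return np.sqrt(-2.0 * np.sum(np.log(cosines)))

# Identical systems should be at distance ~0:
A = np.array([[0.9, 0.1], [0.0, 0.8]])
C = np.vstack([np.eye(2), np.eye(2)])
print(martin_distance(A, C, A, C))  # ~0.0
```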
In [65]:
print D
When we compare lds to itself, we should get 0 (up to numerical accuracy):
In [69]:
np.testing.assert_almost_equal(ldsMartinDistance(lds, lds, 20), 0, 5)
Clustering dynamic texture models is an important task when building Bag-of-DT models, or when computational resources are limited and model comparisons need to be made against a small set of representative models from a database.
First, we are going to import a list of data file names. These files contain 2-D signals (sine, cosine) with different frequencies, embedded in 24-dimensional space.
Let's take a look at one 2-D signal:
In [75]:
sig0 = np.genfromtxt("tests/data/clustering/sig0001.txt")
In [77]:
plot(sig0[0,:])
plot(sig0[1,:])
Out[77]:
In [82]:
sig1 = np.genfromtxt("tests/data/clustering/sig0077.txt")
In [83]:
plot(sig1[0,:])
plot(sig1[1,:])
Out[83]:
Signals 0-49 were created using random frequencies in [0.9, 1]; signals 50-99 were created using random frequencies in [0.3, 0.4]. The signals are sampled at 80 linearly spaced points in [0, $6\pi$]. Finally, some random noise ($\sigma=0.01$) was added to the embedded signal (in 24-dimensional space).
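The exact generation procedure is not shown in the test data, but a plausible sketch of how one such signal could be produced (random frequency, 2-D sine/cosine, random linear embedding into 24-D, additive noise) looks like this:

```python
import numpy as np

rng = np.random.default_rng(42)

# 80 samples in [0, 6*pi], one random frequency per signal.
t = np.linspace(0, 6 * np.pi, 80)
freq = rng.uniform(0.9, 1.0)  # first group; use (0.3, 0.4) for the second
sig = np.vstack([np.sin(freq * t), np.cos(freq * t)])  # 2 x 80

# Embed the 2-D signal in 24-D via a random linear map, then add noise.
W = rng.standard_normal((24, 2))
embedded = W @ sig + 0.01 * rng.standard_normal((24, 80))
print(embedded.shape)  # (24, 80)
```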
In [84]:
with open("tests/data/clustering.files") as fid:
    content = fid.read().splitlines()
In [28]:
content[0:5]
Out[28]:
Next, we estimate an 8-state LDS for the data in each file listed in content:
In [86]:
import os
models = []
for data_file in content:
    data = np.genfromtxt(os.path.join("tests/data/", data_file))
    lds = LinearDS(8, False, False)
    lds.suboptimalSysID(data)
    models.append(lds)
Let's create the pairwise distance matrix (between all the models) next:
In [87]:
N = len(models)
D = np.zeros((N, N))
for i in range(N):
    for j in range(N):
        D[i,j] = ldsMartinDistance(models[i], models[j], 50)
Let's take a look at the distance matrix:
In [88]:
imshow(D)
Out[88]:
This looks quite reasonable. Let's run LDS clustering (with 2 cluster centers):
In [89]:
eData, ids = LinearDS.cluster(D, 2)
In [90]:
# first, we plot the data
plot(eData[0:50,0],eData[0:50,1], color='b', marker='+', linestyle='')
plot(eData[50:,0],eData[50:,1], color='r', marker='o', linestyle='')
# then, we plot the representatives
plot(eData[ids[0],0],eData[ids[0],1], marker='s', color='black', markersize=10)
plot(eData[ids[1],0],eData[ids[1],1], marker='s', color='green', markersize=10)
Out[90]:
As we can see in this toy example, the LDSs are well separated and the cluster centers seem reasonable.
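The internals of LinearDS.cluster are not shown here, but clustering objects given only a pairwise distance matrix is a general technique. One plausible way to do it (hypothetical; not necessarily what pydstk implements) is to embed the distance matrix with classical MDS for visualization and run k-medoids on the raw distances, illustrated below on a toy 1-D dataset:

```python
import numpy as np

def cluster_from_distances(D, k, iters=50):
    """Toy distance-matrix clustering: classical MDS embedding + k-medoids.

    D: (n, n) symmetric pairwise distance matrix.
    Returns a 2-D embedding of the objects and the indices of k medoids.
    """
    n = D.shape[0]
    # Classical MDS: double-center the squared distances, eigendecompose.
    J = np.eye(n) - np.ones((n, n)) / n
    B = -0.5 * J @ (D ** 2) @ J
    vals, vecs = np.linalg.eigh(B)
    order = np.argsort(vals)[::-1][:2]
    emb = vecs[:, order] * np.sqrt(np.maximum(vals[order], 0))

    # Farthest-point initialization of the medoids.
    medoids = [0]
    for _ in range(k - 1):
        medoids.append(int(np.argmax(D[medoids].min(axis=0))))
    medoids = np.array(medoids)

    # Standard k-medoids iterations on the original distances.
    for _ in range(iters):
        labels = np.argmin(D[:, medoids], axis=1)
        new_medoids = medoids.copy()
        for j in range(k):
            members = np.where(labels == j)[0]
            if members.size:
                costs = D[np.ix_(members, members)].sum(axis=1)
                new_medoids[j] = members[np.argmin(costs)]
        if np.array_equal(new_medoids, medoids):
            break
        medoids = new_medoids
    return emb, medoids

# Two well-separated 1-D clusters (around 0 and around 10):
pts = np.concatenate([np.zeros(5), np.ones(5) * 10])
D = np.abs(pts[:, None] - pts[None, :])
emb, ids = cluster_from_distances(D, 2)
print(sorted(pts[ids]))  # one medoid from each cluster: [0.0, 10.0]
```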