So far we restricted ourselves to 1D time series, but the idea of recovering periodic dynamics with geometry can just as easily apply to multivariate signals. In this module, we will examine sliding windows of videos as an exmaple. Many natural videos also have periodicity, such as this video of a woman doing jumping jacks
In [ ]:
import io
import base64
from IPython.display import HTML
video = io.open('jumpingjacks.ogg', 'r+b').read()
encoded = base64.b64encode(video)
HTML(data='''<video alt="test" controls>
<source src="data:video/mp4;base64,{0}" type="video/mp4" />
</video>'''.format(encoded.decode('ascii')))
Video can be decomposed into a 3D array, which has dimensions width x height x time. To tease out periodicity in geometric form, we will do the exact same thing as with sliding window 1D signal embeddings, but instead of just one sample per time shift, we need to take every pixel in every frame in the time window. The figure below depicts this
To see this visually in the video next to PCA of the embedding, look at the following video
In [ ]:
video = io.open('jumpingjackssliding.ogg', 'r+b').read()
encoded = base64.b64encode(video)
HTML(data='''<video alt="test" controls>
<source src="data:video/mp4;base64,{0}" type="video/mp4" />
</video>'''.format(encoded.decode('ascii')))
WH
pixels in each frame, there are only N
frames in the video. This means that each frame in the video can be represented in an (N-1)
dimensional subspace of the pixel space, and the coordinates of this subspace can be used in lieu of the pixels in the sliding window embedding. This can be done efficiently with a PCA step before the sliding window embedding. Run the cell below to load code that does PCA efficiently
In [ ]:
#Do all of the imports and setup inline plotting
%matplotlib notebook
import numpy as np
import matplotlib.pyplot as plt
from sklearn.decomposition import PCA
from mpl_toolkits.mplot3d import Axes3D
import scipy.interpolate
from ripser import ripser
from persim import plot_diagrams
from VideoTools import *
##Here is the actual PCA code
def getPCAVideo(I):
ICov = I.dot(I.T)
[lam, V] = linalg.eigh(ICov)
V = V*np.sqrt(lam[None, :])
return V
dim
. We're also using linear interpolation instead of spline interpolation to keep things fast
In [ ]:
def getSlidingWindowVideo(I, dim, Tau, dT):
N = I.shape[0] #Number of frames
P = I.shape[1] #Number of pixels (possibly after PCA)
pix = np.arange(P)
NWindows = int(np.floor((N-dim*Tau)/dT))
X = np.zeros((NWindows, dim*P))
idx = np.arange(N)
for i in range(NWindows):
idxx = dT*i + Tau*np.arange(dim)
start = int(np.floor(idxx[0]))
end = int(np.ceil(idxx[-1]))+2
if end >= I.shape[0]:
X = X[0:i, :]
break
f = scipy.interpolate.interp2d(pix, idx[start:end+1], I[idx[start:end+1], :], kind='linear')
X[i, :] = f(pix, idxx).flatten()
return X
Finally, let's load in the jumping jacks video and perform PCA to reduce the number of effective pixels.
Note that loading the video may take a few seconds on the virtual image
In [ ]:
#Load in video and do PCA to compress dimension
(X, FrameDims) = loadImageIOVideo("jumpingjacks.ogg")
X = getPCAVideo(X)
Now let's do a sliding window embedding and examine the sliding window embedding using TDA. As before, you should tweak the parameters of the sliding window embedding and study the effect on the geometry.
In [ ]:
#Given that the period is 30 frames per cycle, choose a dimension and a Tau that capture
#this motion in the roundest possible way
#Plot persistence diagram and PCA
dim = 30
Tau = 1
dT = 1
#Get sliding window video
XS = getSlidingWindowVideo(X, dim, Tau, dT)
#Mean-center and normalize sliding window
XS = XS - np.mean(XS, 1)[:, None]
XS = XS/np.sqrt(np.sum(XS**2, 1))[:, None]
#Get persistence diagrams
dgms = ripser(XS)['dgms']
#Do PCA for visualization
pca = PCA(n_components = 3)
Y = pca.fit_transform(XS)
fig = plt.figure(figsize=(12, 6))
plt.subplot(121)
plot_diagrams(dgms)
plt.title("1D Persistence Diagram")
c = plt.get_cmap('nipy_spectral')
C = c(np.array(np.round(np.linspace(0, 255, Y.shape[0])), dtype=np.int32))
C = C[:, 0:3]
ax2 = fig.add_subplot(122, projection = '3d')
ax2.set_title("PCA of Sliding Window Embedding")
ax2.scatter(Y[:, 0], Y[:, 1], Y[:, 2], c=C)
ax2.set_aspect('equal', 'datalim')
plt.show()
We will now examine videos from the KTH dataset, which is a repository of black and white videos of human activities. It consists of 25 subjects performing 6 different actions in each of 4 scenarios. We will use the algorithms developed in this section to measure and rank the periodicity of the different video clips.
In [ ]:
video = io.open('KTH/handwaving/person01_handwaving_d1_uncomp.ogg', 'r+b').read()
encoded = base64.b64encode(video)
HTML(data='''<video alt="test" controls>
<source src="data:video/mp4;base64,{0}" type="video/mp4" />
</video>'''.format(encoded.decode('ascii')))
We have done some additional preprocessing, including applying a bandpass filter to each PCA pixel to cut down on drift in the video. Below we show a video varying the window size of the embedding and plotting the persistence diagram, "self-similarity matrix" (distance matrix), and PCA of the embedding, as well as an evolving plot of the maximum persistence versus window size:
In [ ]:
video = io.open('Handwaving_Deriv10_Block160_PCA10.ogg', 'r+b').read()
encoded = base64.b64encode(video)
HTML(data='''<video alt="test" controls>
<source src="data:video/mp4;base64,{0}" type="video/mp4" />
</video>'''.format(encoded.decode('ascii')))
As you can see, the maximum persistence peaks at around 40 frames, which is the period of each hand wave. This is what the theory we developed for 1D time series would have predicted as the roundest window.
We now examine how this pipeline can be used to detect quasiperiodicity in videos. As an example, we examine videos from high-speed glottography, or high speed videos (4000 fps) of the left and right vocal folds in the human vocal tract. When a person has a normal voice, the vocal folds oscillate in a periodic fashion. On the other hand, if they have certain types of paralysis or near chaotic dynamics, they can exhibit biphonation just as the horse whinnies did. More info can be found in this paper.
Let's begin by analyzing a video of a healthy person. In this example and in the following example, we will be computing both persistent H1 and persistent H2, so the code may take a bit longer to run.
In [ ]:
video = io.open('NormalPeriodicCrop.ogg', 'r+b').read()
encoded = base64.b64encode(video)
HTML(data='''<video alt="test" controls>
<source src="data:video/mp4;base64,{0}" type="video/mp4" />
</video>'''.format(encoded.decode('ascii')))
In [ ]:
(X, FrameDims) = loadVideo("NormalPeriodicCrop.ogg")
X = getPCAVideo(X)
dim = 70
Tau = 0.5
dT = 1
derivWin = 10
#Take a bandpass filter in time at each pixel to smooth out noise
[X, validIdx] = getTimeDerivative(X, derivWin)
#Do the sliding window
XS = getSlidingWindowVideo(X, dim, Tau, dT)
#Mean-center and normalize sliding window
XS = XS - np.mean(XS, 1)[:, None]
XS = XS/np.sqrt(np.sum(XS**2, 1))[:, None]
#Compute and plot persistence diagrams
print("Computing persistence diagrams...")
dgms = ripser(XS, maxdim=2)['dgms']
print("Finished computing persistence diagrams")
plt.figure()
plot_diagrams(dgms)
plt.title("Persistence Diagrams$")
plt.show()
Let's now examine a video of someone with a vocal pathology. This video may still appear periodic, but if you look closely there's a subtle shift going on over time
In [ ]:
video = io.open('ClinicalAsymmetry.mp4', 'r+b').read()
encoded = base64.b64encode(video)
HTML(data='''<video alt="test" controls>
<source src="data:video/mp4;base64,{0}" type="video/mp4" />
</video>'''.format(encoded.decode('ascii')))
In [ ]:
(X, FrameDims) = loadVideo("ClinicalAsymmetry.mp4")
X = getPCAVideo(X)
X = X[0:200, :]
#'dim':32, 'Tau':0.25, 'dT':0.25, 'derivWin':2
dim = 100
Tau = 0.25
dT = 0.5
derivWin = 5
#Take a bandpass filter in time at each pixel to smooth out noise
[X, validIdx] = getTimeDerivative(X, derivWin)
#Do the sliding window
XS = getSlidingWindowVideo(X, dim, Tau, dT)
print("XS.shape = ", XS.shape)
#Mean-center and normalize sliding window
XS = XS - np.mean(XS, 1)[:, None]
XS = XS/np.sqrt(np.sum(XS**2, 1))[:, None]
#Compute and plot persistence diagrams
print("Computing persistence diagrams...")
dgms = ripser(XS, maxdim=2)['dgms']
print("Finished computing persistence diagrams")
plt.figure()
plt.title("Persistence Diagrams$")
plot_diagrams(dgms)
plt.show()
Let's now examine another person with a vocal pathology, this time due to mucus that is pushed out of the vocal folds every other oscillation. This time, we will look at both $\mathbb{Z} / 2\mathbb{Z}$ coefficients and $\mathbb{Z} / 3 \mathbb{Z}$ coefficients.
In [ ]:
video = io.open('LTR_ED_MucusBiphonCrop.ogg', 'r+b').read()
encoded = base64.b64encode(video)
HTML(data='''<video alt="test" controls>
<source src="data:video/mp4;base64,{0}" type="video/mp4" />
</video>'''.format(encoded.decode('ascii')))
In [ ]:
(X, FrameDims) = loadVideo("LTR_ED_MucusBiphonCrop.ogg")
X = getPCAVideo(X)
X = X[0:200, :]
#'dim':32, 'Tau':0.25, 'dT':0.25, 'derivWin':2
dim = 100
Tau = 1
dT = 0.25
derivWin = 5
#Take a bandpass filter in time at each pixel to smooth out noise
[X, validIdx] = getTimeDerivative(X, derivWin)
#Do the sliding window
XS = getSlidingWindowVideo(X, dim, Tau, dT)
print("XS.shape = ", XS.shape)
#Mean-center and normalize sliding window
XS = XS - np.mean(XS, 1)[:, None]
XS = XS/np.sqrt(np.sum(XS**2, 1))[:, None]
#Compute and plot persistence diagrams
print("Computing persistence diagrams...")
dgms2 = ripser(XS, maxdim=2, coeff=2)['dgms']
dgms3 = ripser(XS, maxdim=2, coeff=3)['dgms']
print("Finished computing persistence diagrams")
plt.figure(figsize=(8, 4))
plt.subplot(121)
plot_diagrams(dgms2)
plt.title("Persistence Diagrams $\mathbb{Z}2$")
plt.subplot(122)
plot_diagrams(dgms3)
plt.title("Persistence Diagrams $\mathbb{Z}3$")
plt.show()
In [ ]: