In PVQ, codewords aren't taken from a codebook, they must follow a specific rule, their sum of absolute difference or L1-Norm must equal K
$$\mathbf{y} \in \mathbb{Z}^N : \sum_{i=0}^{N-1} \lvert \mathbf{y}_i \rvert = K\:.$$As such, finding the codeword corresponding to a given vector is not trivial.
In this notebook, we will look into approximating pulses to see how accurately we can estimate how the energy is distributed from the left and right sides of the codeword. More precisely, we are interested in the absolute sum of the left half $K_{\text{Left}}$ and the absolute sum of the right half $K_{\text{Right}}$.
In [1]:
%matplotlib inline
import csv
import numpy as np
import matplotlib.pyplot as plt
We extracted the vector coefficient right before PVQ and their corresponding Codewords in seqF10_coeffs4x4.csv and seqF10_pulses4x4.csv respectively. For now we only consider the first 15 AC coefficients.
We gathered data from the first 10 frames of the sequences of the video-1-short dataset
In [2]:
def LoadData(coeffFile, pulseFile, kFile):
_coeffs = np.genfromtxt(coeffFile, delimiter=',')
_coeffs = _coeffs[:,0:15]
_pulses = np.genfromtxt(pulseFile, delimiter=',')
_pulses = _pulses[:,0:15]
_ks = np.genfromtxt(kFile, delimiter=',')
_numLines = len(_ks)
# Sanity check that the pulses match with k
for i in range(0, _numLines):
assert(_ks[i] == sum(abs(_pulses[i,:])))
return _coeffs, _pulses, _ks
[akiyoCoeffs, akiyoPulses, akiyoKs] = LoadData("data/akiyoF10_coeffs4x4.csv", "data/akiyoF10_pulses4x4.csv",
"data/akiyoF10_k4x4.csv");
[ducksCoeffs, ducksPulses, ducksKs] = LoadData("data/ducksF10_coeffs4x4.csv", "data/ducksF10_pulses4x4.csv",
"data/ducksF10_k4x4.csv");
[grandmaCoeffs, grandmaPulses, grandmaKs] = LoadData("data/grandmaF10_coeffs4x4.csv", "data/grandmaF10_pulses4x4.csv",
"data/grandmaF10_k4x4.csv");
[parkCoeffs, parkPulses, parkKs] = LoadData("data/parkF10_coeffs4x4.csv", "data/parkF10_pulses4x4.csv",
"data/parkF10_k4x4.csv");
[signCoeffs, signPulses, signKs] = LoadData("data/signF10_coeffs4x4.csv", "data/signF10_pulses4x4.csv",
"data/signF10_k4x4.csv");
[sintelCoeffs, sintelPulses, sintelKs] = LoadData("data/sintelF10_coeffs4x4.csv", "data/sintelF10_pulses4x4.csv",
"data/sintelF10_k4x4.csv");
[soccerCoeffs, soccerPulses, soccerKs] = LoadData("data/soccerF10_coeffs4x4.csv", "data/soccerF10_pulses4x4.csv",
"data/soccerF10_k4x4.csv");
In [4]:
def gain(v):
return np.sqrt(np.dot(v,v))
def shape(v):
return np.true_divide(v, gain(v))
def approximatePulses(_coeffs, _ks):
_approxPulses = np.zeros(shape=_coeffs.shape)
_roundedPulses = np.zeros(shape=_coeffs.shape)
for i in range(0,len(_ks)):
s = shape(_coeffs[i,:])
_approxPulses[i] = np.true_divide(s,sum(np.abs(s)))*_ks[i]
# Sanity check that the approximated pulses match with k
_numLines = len(_ks)
for i in range(0, _numLines):
assert(_ks[i] - sum(abs(_approxPulses[i,:])) < 0.0001)
return _approxPulses
akiyoApproxPulses = np.round(approximatePulses(akiyoCoeffs, akiyoKs))
ducksApproxPulses = np.round(approximatePulses(ducksCoeffs, ducksKs))
grandmaApproxPulses = np.round(approximatePulses(grandmaCoeffs, grandmaKs))
parkApproxPulses = np.round(approximatePulses(parkCoeffs, parkKs))
signApproxPulses = np.round(approximatePulses(signCoeffs, signKs))
sintelApproxPulses = np.round(approximatePulses(sintelCoeffs, sintelKs))
soccerApproxPulses = np.round(approximatePulses(soccerCoeffs, soccerKs))
Note that with this approach we have fractional pulses. The absolute sum of these fractional pulses is K. However, if we round, floor or ceil the pulses, the absolute sum may no longer be K.
Based on this original estimate, the encoder will try to play with the pulses to obtain integer pulses and to improve the rate distortion ratio of the codeword.
Let's measure the mean absolute error between our approximated pulses and the pulses of the Daala encoder.
In [5]:
akiyoError = np.abs(akiyoPulses - akiyoApproxPulses).sum(axis=1)
ducksError = np.abs(ducksPulses - ducksApproxPulses).sum(axis=1)
grandmaError = np.abs(grandmaPulses - grandmaApproxPulses).sum(axis=1)
parkError = np.abs(parkPulses - parkApproxPulses).sum(axis=1)
signError = np.abs(signPulses - signApproxPulses).sum(axis=1)
sintelError = np.abs(sintelPulses - sintelApproxPulses).sum(axis=1)
soccerError = np.abs(soccerPulses - soccerApproxPulses).sum(axis=1)
print("Average Absolute Approximation Error (in pulses)")
print("Akiyo: %0.2f" % np.mean(akiyoError))
print("Ducks: %0.2f" % np.mean(ducksError))
print("Grandma: %0.2f" % np.mean(grandmaError))
print("Park: %0.2f" % np.mean(parkError))
print("Sign: %0.2f" % np.mean(signError))
print("Sintel: %0.2f" % np.mean(sintelError))
print("Soccer: %0.2f" % np.mean(soccerError))
In [6]:
plt.figure(figsize=(16,20));
plt.subplot(4,2,1);
p = plt.stem(np.abs(akiyoPulses - akiyoApproxPulses).mean(axis=0));
plt.ylabel("Mean Absolute Error(in pulses)");
plt.xlabel("Codeword index");
plt.title("Akiyo");
plt.gca().set_xlim([-0.5, 14.5])
plt.subplot(4,2,2);
plt.stem(np.abs(ducksPulses - ducksApproxPulses).mean(axis=0));
plt.ylabel("Mean Absolute Error(in pulses)");
plt.xlabel("Codeword index");
plt.title("Ducks");
plt.gca().set_xlim([-0.5, 14.5]);
plt.subplot(4,2,3);
plt.stem(np.abs(grandmaPulses - grandmaApproxPulses).mean(axis=0));
plt.ylabel("Mean Absolute Error(in pulses)");
plt.xlabel("Codeword index");
plt.title("Grandma");
plt.gca().set_xlim([-0.5, 14.5]);
plt.subplot(4,2,4);
plt.stem(np.abs(parkPulses - parkApproxPulses).mean(axis=0));
plt.ylabel("Mean Absolute Error(in pulses)")
plt.xlabel("Codeword index");
plt.title("Park");
plt.gca().set_xlim([-0.5, 14.5]);
plt.subplot(4,2,5)
plt.stem(np.abs(signPulses - signApproxPulses).mean(axis=0));
plt.ylabel("Mean Absolute Error(in pulses)");
plt.xlabel("Codeword index");
plt.title("Sign");
plt.gca().set_xlim([-0.5, 14.5]);
plt.subplot(4,2,6);
plt.stem(np.abs(sintelPulses - sintelApproxPulses).mean(axis=0));
plt.ylabel("Mean Absolute Error(in pulses)");
plt.xlabel("Codeword index");
plt.title("Sintel");
plt.gca().set_xlim([-0.5, 14.5]);
plt.subplot(4,2,7);
plt.stem(np.abs(soccerPulses - soccerApproxPulses).mean(axis=0));
plt.ylabel("Mean Absolute Error(in pulses)");
plt.xlabel("Codeword index");
plt.title("Soccer");
plt.gca().set_xlim([-0.5, 14.5]);
In [7]:
plt.figure(figsize=(16,20));
plt.subplot(4,2,1);
p = plt.stem((akiyoPulses - akiyoApproxPulses).mean(axis=0));
plt.ylabel("Mean Error(in pulses)");
plt.xlabel("Codeword index");
plt.title("Akiyo");
plt.gca().set_xlim([-0.5, 14.5])
plt.subplot(4,2,2);
plt.stem((ducksPulses - ducksApproxPulses).mean(axis=0));
plt.ylabel("Mean Error(in pulses)");
plt.xlabel("Codeword index");
plt.title("Ducks");
plt.gca().set_xlim([-0.5, 14.5]);
plt.subplot(4,2,3);
plt.stem((grandmaPulses - grandmaApproxPulses).mean(axis=0));
plt.ylabel("Mean Error(in pulses)");
plt.xlabel("Codeword index");
plt.title("Grandma");
plt.gca().set_xlim([-0.5, 14.5]);
plt.subplot(4,2,4);
plt.stem((parkPulses - parkApproxPulses).mean(axis=0));
plt.ylabel("Mean Error(in pulses)")
plt.xlabel("Codeword index");
plt.title("Park");
plt.gca().set_xlim([-0.5, 14.5]);
plt.subplot(4,2,5)
plt.stem((signPulses - signApproxPulses).mean(axis=0));
plt.ylabel("Mean Error(in pulses)");
plt.xlabel("Codeword index");
plt.title("Sign");
plt.gca().set_xlim([-0.5, 14.5]);
plt.subplot(4,2,6);
plt.stem((sintelPulses - sintelApproxPulses).mean(axis=0));
plt.ylabel("Mean Error(in pulses)");
plt.xlabel("Codeword index");
plt.title("Sintel");
plt.gca().set_xlim([-0.5, 14.5]);
plt.subplot(4,2,7);
plt.stem((soccerPulses - soccerApproxPulses).mean(axis=0));
plt.ylabel("Mean Error(in pulses)");
plt.xlabel("Codeword index");
plt.title("Soccer");
plt.gca().set_xlim([-0.5, 14.5]);
In [8]:
akiyoLeftError = np.abs(akiyoPulses[:,0:7]).sum(axis=1) - np.abs(akiyoApproxPulses[:,0:7]).sum(axis=1)
ducksLeftError = np.abs(ducksPulses[:,0:7]).sum(axis=1) - np.abs(ducksApproxPulses[:,0:7]).sum(axis=1)
grandmaLeftError = np.abs(grandmaPulses[:,0:7]).sum(axis=1) - np.abs(grandmaApproxPulses[:,0:7]).sum(axis=1)
parkLeftError = np.abs(parkPulses[:,0:7]).sum(axis=1) - np.abs(parkApproxPulses[:,0:7]).sum(axis=1)
signLeftError = np.abs(signPulses[:,0:7]).sum(axis=1) - np.abs(signApproxPulses[:,0:7]).sum(axis=1)
sintelLeftError = np.abs(sintelPulses[:,0:7]).sum(axis=1) - np.abs(sintelApproxPulses[:,0:7]).sum(axis=1)
soccerLeftError = np.abs(soccerPulses[:,0:7]).sum(axis=1) - np.abs(soccerApproxPulses[:,0:7]).sum(axis=1)
print("Average Absolute Approximation Error Left (in pulses)")
print("Akiyo: %0.2f" % np.mean(np.abs(akiyoLeftError)))
print("Ducks: %0.2f" % np.mean(np.abs(ducksLeftError)))
print("Grandma: %0.2f" % np.mean(np.abs(grandmaLeftError)))
print("Park: %0.2f" % np.mean(np.abs(parkLeftError)))
print("Sign: %0.2f" % np.mean(np.abs(signLeftError)))
print("Sintel: %0.2f" % np.mean(np.abs(sintelLeftError)))
print("Soccer: %0.2f" % np.mean(np.abs(soccerLeftError)))
In [9]:
akiyoRightError = np.abs(akiyoPulses[:,7:15]).sum(axis=1) - np.abs(akiyoApproxPulses[:,7:15]).sum(axis=1)
ducksRightError = np.abs(ducksPulses[:,7:15]).sum(axis=1) - np.abs(ducksApproxPulses[:,7:15]).sum(axis=1)
grandmaRightError = np.abs(grandmaPulses[:,7:15]).sum(axis=1) - np.abs(grandmaApproxPulses[:,7:15]).sum(axis=1)
parkRightError = np.abs(parkPulses[:,7:15]).sum(axis=1) - np.abs(parkApproxPulses[:,7:15]).sum(axis=1)
signRightError = np.abs(signPulses[:,7:15]).sum(axis=1) - np.abs(signApproxPulses[:,7:15]).sum(axis=1)
sintelRightError = np.abs(sintelPulses[:,7:15]).sum(axis=1) - np.abs(sintelApproxPulses[:,7:15]).sum(axis=1)
soccerRightError = np.abs(soccerPulses[:,7:15]).sum(axis=1) - np.abs(soccerApproxPulses[:,7:15]).sum(axis=1)
print("Average Absolute Approximation Error Right (in pulses)")
print("Akiyo: %0.2f" % np.mean(np.abs(akiyoRightError)))
print("Ducks: %0.2f" % np.mean(np.abs(ducksRightError)))
print("Grandma: %0.2f" % np.mean(np.abs(grandmaRightError)))
print("Park: %0.2f" % np.mean(np.abs(parkRightError)))
print("Sign: %0.2f" % np.mean(np.abs(signRightError)))
print("Sintel: %0.2f" % np.mean(np.abs(sintelRightError)))
print("Soccer: %0.2f" % np.mean(np.abs(soccerRightError)))
It appears we are much better at predicting $K_{\text{Right}}$ than we are for $K_{\text{Left}}$.
Turns out that predicting the left half is tricky, because the codewords vary from our estimates based on elements specific to the content. However, the right half is much more stable, and since we know K, we can use the complement to estimate $K_{\text{left}}$
In [10]:
akiyoLeftErrorC = np.abs(akiyoPulses[:,0:7]).sum(axis=1) - (akiyoKs - np.abs(akiyoApproxPulses[:,7:15]).sum(axis=1))
ducksLeftErrorC = np.abs(ducksPulses[:,0:7]).sum(axis=1) - (ducksKs - np.abs(ducksApproxPulses[:,7:15]).sum(axis=1))
grandmaLeftErrorC = np.abs(grandmaPulses[:,0:7]).sum(axis=1) - (grandmaKs - np.abs(grandmaApproxPulses[:,7:15]).sum(axis=1))
parkLeftErrorC = np.abs(parkPulses[:,0:7]).sum(axis=1) - (parkKs - np.abs(parkApproxPulses[:,7:15]).sum(axis=1))
signLeftErrorC = np.abs(signPulses[:,0:7]).sum(axis=1) - (signKs - np.abs(signApproxPulses[:,7:15]).sum(axis=1))
sintelLeftErrorC = np.abs(sintelPulses[:,0:7]).sum(axis=1) - (sintelKs - np.abs(sintelApproxPulses[:,7:15]).sum(axis=1))
soccerLeftErrorC = np.abs(soccerPulses[:,0:7]).sum(axis=1) - (soccerKs - np.abs(soccerApproxPulses[:,7:15]).sum(axis=1))
print("Average Absolute Approximation Error Left (in pulses)")
print("Akiyo: %0.2f" % np.mean(np.abs(akiyoLeftErrorC)))
print("Ducks: %0.2f" % np.mean(np.abs(ducksLeftErrorC)))
print("Grandma: %0.2f" % np.mean(np.abs(grandmaLeftErrorC)))
print("Park: %0.2f" % np.mean(np.abs(parkLeftErrorC)))
print("Sign: %0.2f" % np.mean(np.abs(signLeftErrorC)))
print("Sintel: %0.2f" % np.mean(np.abs(sintelLeftErrorC)))
print("Soccer: %0.2f" % np.mean(np.abs(soccerLeftErrorC)))
Knowing that our predictions for the right half are better, its statistically correct to trust them more. As such, we discard our statistically imprecise estimate of KLeft and use the complement as shown in the example. It is not perfect but it is more precise. In this extreme example, we go from an error of -3 to +2. In most cases however, the error goes from 1 to 0.
In [13]:
idx = 5833;
print("Akiyo pulses: %s" % akiyoPulses[idx,:])
KLeft = np.abs(akiyoPulses[idx,0:7]).sum()
KRight = np.abs(akiyoPulses[idx,7:15]).sum()
print("KLeft = %d KRight = %d " % (KLeft, KRight))
print("Akiyo pulses approximation: %s" % akiyoApproxPulses[idx,:])
K = np.abs(akiyoPulses[idx,:]).sum()
KLeftApprox = np.abs(akiyoApproxPulses[idx, 0:7]).sum()
KRightApprox = np.abs(akiyoApproxPulses[idx, 7:15]).sum()
print("KLeftApprox = %d (%d) KRightApprox = %d " % (KLeftApprox, KLeftApprox - KLeft, KRightApprox))
print("KLeftComplement (%d - %d) = %d (%+d)" % (K, KRightApprox, K-KRightApprox, K-KRightApprox-KLeft))
In [14]:
plt.figure(figsize=(16,20));
plt.subplot(4,2,1);
plt.hist([akiyoLeftError, akiyoLeftErrorC]);
plt.xlabel("Approx Error (in pulses)");
plt.ylabel("Occurences");
plt.title("Akiyo")
plt.legend(["Left Error","Left Complement Error"])
plt.subplot(4,2,2);
plt.hist([ducksLeftError, ducksLeftErrorC]);
plt.xlabel("Approx Error (in pulses)");
plt.ylabel("Occurences");
plt.title("Ducks")
plt.legend(["Left Error","Left Complement Error"])
plt.subplot(4,2,3);
plt.hist([grandmaLeftError, grandmaLeftErrorC]);
plt.xlabel("Approx Error (in pulses)");
plt.ylabel("Occurences");
plt.title("Grandma")
plt.legend(["Left Error","Left Complement Error"])
plt.subplot(4,2,4);
plt.hist([parkLeftError, parkLeftErrorC]);
plt.xlabel("Approx Error (in pulses)");
plt.ylabel("Occurences");
plt.title("Park")
plt.legend(["Left Error","Left Complement Error"])
plt.subplot(4,2,5);
plt.hist([signLeftError, signLeftErrorC]);
plt.xlabel("Approx Error (in pulses)");
plt.ylabel("Occurences");
plt.title("Sign")
plt.legend(["Left Error","Left Complement Error"])
plt.subplot(4,2,6);
plt.hist([sintelLeftError, sintelLeftErrorC]);
plt.xlabel("Approx Error (in pulses)");
plt.ylabel("Occurences");
plt.title("Sintel")
plt.legend(["Left Error","Left Complement Error"])
plt.subplot(4,2,7);
plt.hist([soccerLeftError, soccerLeftErrorC]);
plt.xlabel("Approx Error (in pulses)");
plt.ylabel("Occurences");
plt.title("Soccer")
plt.legend(["Left Error","Left Complement Error"]);
In [15]:
print("Percentage of time KLeft estimate +-x is not optimal for x=0, x=1, x=2")
def checkCorrect(absErr):
numElements = absErr.shape
return (np.array([(absErr != 0).sum(), (absErr > 1).sum(), (absErr > 2).sum()]) / numElements * 100)
print("AKiyo: %s" % (checkCorrect(np.abs(akiyoLeftErrorC))))
print("Ducks: %s" % (checkCorrect(np.abs(ducksLeftErrorC))))
print("Grandma: %s" % (checkCorrect(np.abs(grandmaLeftErrorC))))
print("Park: %s" % (checkCorrect(np.abs(parkLeftErrorC))))
print("Sign: %s" % (checkCorrect(np.abs(signLeftErrorC))))
print("Sintel: %s" % (checkCorrect(np.abs(sintelLeftErrorC))))
print("Soccer: %s" % (checkCorrect(np.abs(soccerLeftErrorC))))
In [336]:
print("Percentage of time KRight estimate +-x is not optimal for x=0, x=1, x=2")
def checkCorrect(absErr):
numElements = absErr.shape
return (np.array([(absErr != 0).sum(), (absErr > 1).sum(), (absErr > 2).sum()]) / numElements * 100)
print("AKiyo: %s" % (checkCorrect(np.abs(akiyoRightError))))
print("Ducks: %s" % (checkCorrect(np.abs(ducksRightError))))
print("Grandma: %s" % (checkCorrect(np.abs(grandmaRightError))))
print("Park: %s" % (checkCorrect(np.abs(parkRightError))))
print("Sign: %s" % (checkCorrect(np.abs(signRightError))))
print("Sintel: %s" % (checkCorrect(np.abs(sintelRightError))))
print("Soccer: %s" % (checkCorrect(np.abs(soccerRightError))))
In this notebook, we looked at the accuracy of estimating the energy of the pulses based on the energy of the coefficients. From our dataset, we concluded, that the left half of the coefficient seem to be the hardest to estimate and that by estimating the right half and using the complement we obtained better results.
In [ ]: