Kevin J. Walchko, created 11 July 2017
We are not going to do augmented reality, but we are going to learn how the markers work and use them for robotics. This will also give you insight into how QR codes work. For our applications, we want to be able to mark real world objects and have our robot know what they are (think road signs or mile markers on the highway).
In [1]:
%matplotlib inline
In [9]:
from __future__ import print_function
from __future__ import division
import numpy as np
from matplotlib import pyplot as plt
import cv2
import time
# make sure you have installed the library with:
# pip install -U ar_markers
from ar_markers import detect_markers
There are lots of different types of markers out there in the world. Some are free to use and some are protected by intellectual property rights. Markers that machines can read range from simple bar codes on food products that can be scanned to much more complex 2D and 3D markers. We are going to look at a simple but useful type of 2D marker shown below.
The approach implemented here uses a type of Hamming code that can correct errors. This error correction is particularly useful when the marker is small or blurred in the image. The marker also carries a known pattern, so its code can be deciphered however the marker is rotated in the image. Once the orientation is known, it becomes easy to use the black and white squares to read the signature and, if necessary, correct the code if an error is found.
First let's take a little side step and understand how a 7 bit Hamming code works. In coding theory, Hamming(7,4) is a linear error-correcting code that encodes four bits of data into seven bits by adding three parity bits. It is a member of a larger family of Hamming codes, but the term Hamming code often refers to this specific code that Richard W. Hamming introduced in 1950. At the time, Hamming worked at Bell Telephone Laboratories and was frustrated with the error-prone punched card reader, which is why he started working on error-correcting codes.
The marker shown above is a 5x5 grid marker with a Hamming(7,4) code to detect and help correct errors. This form of Hamming code uses 7 bits with 4 bits of data and 3 bits of parity. This code is capable of correcting 1 error or bit flip.
A nice graphical interpretation is shown above. The idea is the data bits $d_1, d_2, d_3, d_4$ are covered by multiple parity bits giving redundancy. For example $d_1$ is covered by $p_1$ and $p_2$ while $d_2$ is covered by $p_1$ and $p_3$. This redundancy allows the code to correct for 1 error. Error can come from many sources:
Given a 4 bit message (m) we can encode it using a code generator matrix ($G_{7 \times 4}$). We can also check the parity of an encoded message (e) using a parity check matrix ($H_{3 \times 7}$). To decode an encoded message we will use a regeneration matrix ($R_{4 \times 7}$). Where:
$$ G = \begin{bmatrix} 1 & 1 & 0 & 1 \\ 1 & 0 & 1 & 1 \\ 1 & 0 & 0 & 0 \\ 0 & 1 & 1 & 1 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \\ H = \begin{bmatrix} 1 & 0 & 1 & 0 & 1 & 0 & 1 \\ 0 & 1 & 1 & 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 1 & 1 & 1 & 1 \end{bmatrix} \\ R = \begin{bmatrix} 0 & 0 & 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 1 \end{bmatrix} \\ e = G \times m = \begin{bmatrix} p_1 & p_2 & d_1 & p_3 & d_2 & d_3 & d_4 \end{bmatrix}^T \\ \text{parity check} = H \times e \\ m = R \times e $$
All of the arithmetic is done modulo 2. A good message has the parity check result in $\begin{bmatrix} 0 & 0 & 0 \end{bmatrix}$. If an error is present, the parity bits form a binary number (first entry least significant) which tells you which of the 7 bits is flipped. Again, this code can only detect and correct 1 error.
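The single-error correction described here can be sketched in a few lines. This is only an illustration, not the ar_markers implementation: it assumes the parity check matrix whose k-th column is the binary form of k (least significant bit in the first row), and the helper name `correct_single_error` is hypothetical.

```python
import numpy as np

# Parity check matrix (assumed layout): column k, counting from 1,
# is the binary form of k with the least significant bit in row 0.
H = np.array([
    [1, 0, 1, 0, 1, 0, 1],
    [0, 1, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 1],
])

def correct_single_error(e):
    """Return (corrected copy of e, 1-based index of the flipped bit, 0 if none)."""
    e = np.array(e) % 2                      # work on a copy, forced to 0/1
    syndrome = H.dot(e) % 2                  # parity check, modulo 2
    # read the syndrome as a binary number, first entry least significant
    position = int(syndrome[0] + 2 * syndrome[1] + 4 * syndrome[2])
    if position:
        e[position - 1] ^= 1                 # flip the offending bit back
    return e, position

# encoding of the 4 bit message [1, 0, 1, 1]
codeword = np.array([0, 1, 1, 0, 0, 1, 1])

received = codeword.copy()
received[4] ^= 1                             # corrupt bit 5
fixed, position = correct_single_error(received)
print('flipped bit:', position)
print('corrected  :', fixed)
```

Because every column of this matrix is distinct and nonzero, each single bit flip produces its own syndrome, which is what makes the lookup possible.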
Once the marker’s borders are found in the image, we look at four specific squares placed at the corners of our 5×5 pattern (see the picture). These registration marks tell us where the data and parity bits are in the 5x5 array.
Once the orientation is decided we can construct the signature. In the 5×5 case it’s straightforward to read 3 signatures that each contain 7 bits. Then for each signature:
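One way corner marks can fix the orientation is sketched below. The layout here is an assumption made purely for illustration (exactly one corner square is white, and the upright marker has it in the top-left corner); the layout ar_markers actually uses may differ, and `orient` is a hypothetical helper.

```python
import numpy as np

def orient(grid):
    """Rotate a square grid of 0/1 cells until its single white corner is top-left."""
    grid = np.asarray(grid)
    corners = [grid[0, 0], grid[0, -1], grid[-1, -1], grid[-1, 0]]
    if sum(corners) != 1:
        raise ValueError('expected exactly one white corner')
    # each np.rot90 call turns the grid 90 degrees counterclockwise;
    # with exactly one white corner this loop ends within 3 turns
    while grid[0, 0] != 1:
        grid = np.rot90(grid)
    return grid

# a made-up 5x5 pattern with the white registration mark at top-left
marker = np.zeros((5, 5), dtype=int)
marker[0, 0] = 1
marker[1, 2] = 1
marker[3, 4] = 1

seen = np.rot90(marker, 3)        # the camera sees the marker rotated
print(np.array_equal(orient(seen), marker))
```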
Finally, using the bits of data contained in the 3 signatures, compute the code that corresponds to this binary vector.
Once errors are checked and corrected, the 3 signatures (green, red and blue areas) are used to generate the binary code to decipher (12 bits aligned at the bottom). So our marker has 5 x 5 bits (black or white squares) which give us:
Thus we have a marker that can encode a 12 bit number, which gives $2^{12} = 4096$ possible values between 0 and 4095.
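As a small worked example, the 12 data bits can be turned into a marker number like this. The bit order is an assumption (signatures concatenated in order, most significant bit first), the example bits are made up, and `bits_to_id` is a hypothetical helper rather than part of ar_markers.

```python
def bits_to_id(data_bits):
    """Interpret 12 bits (most significant first) as an integer from 0 to 4095."""
    if len(data_bits) != 12:
        raise ValueError('expected 12 data bits')
    value = 0
    for bit in data_bits:
        value = (value << 1) | int(bit)   # shift left, then append the next bit
    return value

# hypothetical data bits recovered from the 3 signatures (4 bits each)
signatures = [[1, 0, 1, 1], [0, 0, 0, 0], [1, 1, 1, 1]]
bits = [b for sig in signatures for b in sig]
print(bits_to_id(bits))   # 2831
```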
Here is what we are going to do:
ar_markers
detection function
In [3]:
def fix_binary(msg):
    # now, use the modulo operator to ensure it is a binary number
    ans = []
    for val in msg:
        ans.append(val%2)
    return np.array(ans)
# encode a message
G = np.array([
[1,1,0,1],
[1,0,1,1],
[1,0,0,0],
[0,1,1,1],
[0,1,0,0],
[0,0,1,0],
[0,0,0,1]
])
# decode an encoded message
R = np.array([
[0, 0, 1, 0, 0, 0, 0],
[0, 0, 0, 0, 1, 0, 0],
[0, 0, 0, 0, 0, 1, 0],
[0, 0, 0, 0, 0, 0, 1],
])
# check parity
H = np.array([
[1, 0, 1, 0, 1, 0, 1],
[0, 1, 1, 0, 0, 1, 1],
[0, 0, 0, 1, 1, 1, 1],
])
# a 4 bit message we want to send
msg = np.array([1,0,1,1])
In [4]:
e = fix_binary(G.dot(msg))
print('encoded msg:', e)
parity_check = fix_binary(H.dot(e))
print('parity check:', parity_check)
decoded_msg = R.dot(e)
print('decoded message:', decoded_msg)
print('Does msg == decoded msg?', msg == decoded_msg)
In [5]:
for i in range(7):
    e = fix_binary(G.dot(msg))
    print('Corrupt bit', i+1, '--------------------------------------')
    e[i] = 0 if e[i] == 1 else 1
    parity_check = fix_binary(H.dot(e))
    print(' parity check:', parity_check)
    decoded_msg = R.dot(e)
Above, we corrupted each bit one at a time and detected it; we did not fix it. Now we corrupt two bits at a time. Notice that we can no longer identify the incorrect bits, but we still know something bad is happening. For example, corrupting bits 1 and 2 and corrupting bits 5 and 6 give the same parity check: [1, 1, 0]. If you need protection from more than 1 error, then you need to select a different algorithm.
In [6]:
for i in range(7):
    e = fix_binary(G.dot(msg))
    e[i] = 0 if e[i] == 1 else 1
    j = (i+1)%7
    e[j] = 0 if e[j] == 1 else 1
    print('Corrupt bit', i+1, 'and', j+1, '--------------------------------------')
    parity_check = fix_binary(H.dot(e))
    print(' parity check:', parity_check)
    decoded_msg = R.dot(e)
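To see why two errors are a problem, the sketch below applies single-error correction to a doubly corrupted codeword. It uses the same H and R matrices as the cells above and is only an illustration; `syndrome_position` is a hypothetical helper, not part of ar_markers.

```python
import numpy as np

H = np.array([
    [1, 0, 1, 0, 1, 0, 1],
    [0, 1, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 1],
])
# R picks the data bits (positions 3, 5, 6, 7) out of the codeword
R = np.array([
    [0, 0, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0, 1],
])

def syndrome_position(e):
    """Read the parity check as a binary number, first entry least significant."""
    s = H.dot(e) % 2
    return int(s[0] + 2 * s[1] + 4 * s[2])

msg = np.array([1, 0, 1, 1])
codeword = np.array([0, 1, 1, 0, 0, 1, 1])   # encoding of msg

received = codeword.copy()
received[0] ^= 1                             # corrupt bit 1
received[1] ^= 1                             # corrupt bit 2

# the syndrome now points at a bit that was never corrupted
position = syndrome_position(received)
received[position - 1] ^= 1                  # "fix" the wrong bit
decoded = R.dot(received)

print('syndrome points at bit', position)
print('decoded message:', decoded, 'original:', msg)
```

The "corrected" word passes the parity check, yet it decodes to a different message, so a double error can silently produce a wrong marker number.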
In [7]:
img = cv2.imread('ar_marker_pics/flyer-hamming.png')
# OpenCV loads images as BGR, but matplotlib expects RGB
plt.imshow(cv2.cvtColor(img, cv2.COLOR_BGR2RGB));
print('image dimensions [height, width, color depth]:', img.shape)
In [17]:
markers = detect_markers(img)
# let's print the id of the marker
print(markers[0].id)
print()
q quits the program
np.rot90), does the library still get the correct answer?
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.