The witnesses start and end with exact matches (abcd and efgh, respectively). Witness A has one token in the middle (0123), and witness B has either two (012x, 01xx) or three (012x, 01xx, 0xxx). Every candidate in witness B is a partial match to the middle token in A, each with a different degree of similarity. All permutations of the candidates in B are tested to check that 0123 in A is aligned with the closest candidate (012x), wherever that candidate appears.
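The "degrees of similarity" mentioned above can be made concrete with a small sketch. It uses `difflib.SequenceMatcher` from the standard library purely as a stand-in similarity measure. That is an assumption for illustration, and CollateX may compute near-match scores differently. The helper name `similarity` is hypothetical.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Return a 0..1 similarity score (stand-in metric, not CollateX's own)."""
    return SequenceMatcher(None, a, b).ratio()


target = "0123"                        # middle token of witness A
candidates = ["012x", "01xx", "0xxx"]  # middle tokens of witness B

# Rank the candidates from closest (rank 0) to most distant
ranked = sorted(candidates, key=lambda c: similarity(target, c), reverse=True)
for rank, token in enumerate(ranked):
    print(rank, token, similarity(target, token))
```

Under this stand-in metric the ranking matches the "match rank" numbers used in the cell comments: 012x is rank 0, 01xx is rank 1, and 0xxx is rank 2.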
In [63]:
%reload_ext autoreload
%autoreload 2
from collatex import *
# Baseline: two candidates in B, near matching not enabled
collation = Collation()
collation.add_plain_witness("A", "abcd 0123 efgh")
collation.add_plain_witness("B", "abcd 01xx 012x efgh")
alignment_table = collate(collation, segmentation=False)
print(alignment_table)
In the example below, 0123 in A is closer to 012x (left) in B, and it correctly stays left.
In [64]:
# Two candidates
# With near matching, it goes to the closer match, whether that's left or right
# Closer match is left, no movement
collation = Collation()
collation.add_plain_witness("A", "abcd 0123 efgh")
collation.add_plain_witness("B", "abcd 012x 01xx efgh")
alignment_table = collate(collation, near_match=True, segmentation=False)
print(alignment_table)
In the example below, 0123 in A is closer to 012x (right) in B, and it correctly moves right.
In [65]:
# Two candidates
# With near matching, it goes to the closer match, whether that's left or right
# Same tokens as the previous cell with the candidates swapped: closer match is right, so it moves
collation = Collation()
collation.add_plain_witness("A", "abcd 0123 efgh")
collation.add_plain_witness("B", "abcd 01xx 012x efgh")
alignment_table = collate(collation, near_match=True, segmentation=False)
print(alignment_table)
In [66]:
# Three candidates, closest is left, match rank 0 1 2 (0 is closest)
# Should stay left; succeeds
collation = Collation()
collation.add_plain_witness("A", "abcd 0123 efgh")
collation.add_plain_witness("B", "abcd 012x 01xx 0xxx efgh")
alignment_table = collate(collation, near_match=True, segmentation=False)
print(alignment_table)
In [67]:
# Three candidates, closest is left, match rank 0 2 1 (0 is closest)
# Should stay left; succeeds
collation = Collation()
collation.add_plain_witness("A", "abcd 0123 efgh")
collation.add_plain_witness("B", "abcd 012x 0xxx 01xx efgh")
alignment_table = collate(collation, near_match=True, segmentation=False)
print(alignment_table)
In [68]:
# Three candidates, closest is right, match rank 1 2 0 (0 is closest)
# Should go right; succeeds
collation = Collation()
collation.add_plain_witness("A", "abcd 0123 efgh")
collation.add_plain_witness("B", "abcd 01xx 0xxx 012x efgh")
alignment_table = collate(collation, near_match=True, segmentation=False)
print(alignment_table)
In [69]:
# Three candidates, closest is right, match rank 2 1 0 (0 is closest)
# Should go right
collation = Collation()
collation.add_plain_witness("A", "abcd 0123 efgh")
collation.add_plain_witness("B", "abcd 0xxx 01xx 012x efgh")
alignment_table = collate(collation, near_match=True, segmentation=False)
print(alignment_table)
In [70]:
# Three candidates, closest is middle, match rank 1 0 2 (0 is closest)
# Should go to the middle
collation = Collation()
collation.add_plain_witness("A", "abcd 0123 efgh")
collation.add_plain_witness("B", "abcd 01xx 012x 0xxx efgh")
alignment_table = collate(collation, near_match=True, segmentation=False)
print(alignment_table)
In [71]:
# Three candidates, closest is middle, match rank 2 0 1 (0 is closest)
# Should go to the middle
collation = Collation()
collation.add_plain_witness("A", "abcd 0123 efgh")
collation.add_plain_witness("B", "abcd 0xxx 012x 01xx efgh")
alignment_table = collate(collation, near_match=True, segmentation=False)
print(alignment_table)
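The six three-candidate cells above cover every ordering of 012x, 01xx, and 0xxx by hand. As a minimal sketch, the same witness B strings can be generated with `itertools.permutations`. The variable names here are hypothetical and the block only builds the strings; it does not call CollateX.

```python
from itertools import permutations

candidates = ["012x", "01xx", "0xxx"]  # the three partial matches in witness B

# Build one witness B string per ordering of the candidates,
# framed by the exact matches abcd and efgh
witness_b_variants = [
    "abcd " + " ".join(order) + " efgh"
    for order in permutations(candidates)
]

for text in witness_b_variants:
    print(text)
```

Each generated string can then be passed to `add_plain_witness("B", ...)` on a fresh `Collation()`, exactly as in the cells above.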
We expect:
+---+------+--------+--------+--------+--------+--------+------+
| A | abcd | -      | -      | 012345 | -      | -      | efgh |
| B | abcd | 0xxxxx | 01xxxx | 01234x | 012xxx | 0123xx | efgh |
| C | abcd | -      | 01xxxx | -      | -      | zz23xx | efgh |
+---+------+--------+--------+--------+--------+--------+------+
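The expected table can be motivated by picking, for each middle token in A and C, the most similar token in B. The sketch below again uses `difflib.SequenceMatcher` as a stand-in metric, which is an assumption and not necessarily what CollateX uses internally. The names `similarity` and `best_match` are hypothetical.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Return a 0..1 similarity score (stand-in metric, not CollateX's own)."""
    return SequenceMatcher(None, a, b).ratio()


# Middle tokens of witness B, in witness order
b_tokens = ["0xxxxx", "01xxxx", "01234x", "012xxx", "0123xx"]

# For each middle token in A and C, find the closest token in B
best_match = {
    token: max(b_tokens, key=lambda b: similarity(token, b))
    for token in ["012345", "01xxxx", "zz23xx"]
}

for token, match in best_match.items():
    print(token, "->", match)
```

Under this metric, 012345 pairs with 01234x, 01xxxx pairs with its exact counterpart, and zz23xx pairs with 0123xx, which agrees with the columns in the expected table.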
In [72]:
%reload_ext autoreload
%autoreload 2
from collatex import *
collation = Collation()
collation.add_plain_witness("A", "abcd 012345 efgh")
collation.add_plain_witness("B", "abcd 0xxxxx 01xxxx 01234x 012xxx 0123xx efgh")
collation.add_plain_witness("C", "abcd 01xxxx zz23xx efgh")
alignment_table = collate(collation, segmentation=False, near_match=True)
print(alignment_table)