Collating for real with CollateX (2)

Here we repeat the steps from the previous exercise with a new and slightly more complicated test case. You can create a new notebook for this exercise and follow the instructions below.

We will be using three editions of Virginia Woolf's "To the Lighthouse":

  • USA = New York: Harcourt, Brace & Company, 1927 (1st USA edition)
  • UK = London: R & R Clark Limited, 1927 (1st UK edition)
  • EM (EVERYMAN) = London: J. M. Dent & Sons LTD, 1938 (reprint 1952)

The facsimiles and transcriptions of the editions are available at http://woolfonline.com/

First exercise

Try to reproduce what you have done with the Darwin text.

Import the collatex Python library


In [ ]:
from collatex import *

Create a collation object


In [ ]:
collation = Collation()

Now open the texts in "../fixtures/Woolf/Lighthouse-1", read them, and add them to the collation:


In [ ]:
with open( "../fixtures/Woolf/Lighthouse-1/Lighthouse-1-USA.txt", encoding='utf-8' ) as witness_USA, \
    open( "../fixtures/Woolf/Lighthouse-1/Lighthouse-1-UK.txt", encoding='utf-8' ) as witness_UK, \
    open( "../fixtures/Woolf/Lighthouse-1/Lighthouse-1-EM.txt", encoding='utf-8' ) as witness_EM:
    collation.add_plain_witness( "USA", witness_USA.read() )
    collation.add_plain_witness( "UK", witness_UK.read() )
    collation.add_plain_witness( "EM", witness_EM.read() )
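
Before collating, it can help to confirm that every witness file was found and actually contains text. The following is a minimal sketch using only the Python standard library. The helper name `read_witnesses` and its arguments are hypothetical and not part of CollateX; it assumes the files follow the `<prefix>-<siglum>.txt` naming used above.

```python
from pathlib import Path

def read_witnesses(folder, prefix, sigla):
    """Return {siglum: text} for files named <prefix>-<siglum>.txt in folder.

    Hypothetical helper (not part of CollateX). Raises an error if a
    witness file is missing or contains only whitespace.
    """
    texts = {}
    for siglum in sigla:
        path = Path(folder) / f"{prefix}-{siglum}.txt"
        if not path.is_file():
            raise FileNotFoundError(f"Witness file not found: {path}")
        text = path.read_text(encoding="utf-8")
        if not text.strip():
            raise ValueError(f"Witness file is empty: {path}")
        texts[siglum] = text
    return texts

# Possible usage with the collation object created earlier:
# texts = read_witnesses("../fixtures/Woolf/Lighthouse-1", "Lighthouse-1", ["USA", "UK", "EM"])
# for siglum, text in texts.items():
#     collation.add_plain_witness(siglum, text)
```

Failing early with a clear message is usually easier to debug than an empty or lopsided alignment table.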

Align, using the HTML output option


In [ ]:
alignment_table = collate(collation, layout='vertical', output='html')
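
To get a feel for what a vertical alignment table represents, here is a rough sketch that aligns two short token lists with the standard library's `difflib`. This is not the algorithm CollateX uses, the function name `align_tokens` is hypothetical, and the two sample sentences are invented for illustration (they are not taken from the Woolf editions).

```python
from difflib import SequenceMatcher

def align_tokens(tokens_a, tokens_b):
    """Pair up two token lists row by row; None marks a gap in one witness."""
    rows = []
    matcher = SequenceMatcher(a=tokens_a, b=tokens_b, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        span_a = tokens_a[i1:i2]
        span_b = tokens_b[j1:j2]
        # Pad the shorter span with None so both columns have equal length
        length = max(len(span_a), len(span_b))
        span_a = span_a + [None] * (length - len(span_a))
        span_b = span_b + [None] * (length - len(span_b))
        rows.extend(zip(span_a, span_b))
    return rows

# Invented sample witnesses, split on whitespace
witness_1 = "the cat sat on the mat".split()
witness_2 = "the cat sat upon the mat today".split()

# One row per aligned position, with variants flagged
for a, b in align_tokens(witness_1, witness_2):
    marker = "" if a == b else "  <-- variant"
    print(f"{a or '-':<8}{b or '-':<8}{marker}")
```

Each printed row corresponds to one row of a vertical table: identical tokens sit side by side, and a dash marks a position where one witness has no matching token.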

Second exercise

In the second exercise, repeat the previous steps, now using the texts at "../fixtures/Woolf/Lighthouse-2" and visualizing the output with the more sophisticated HTML option (html2).


In [ ]:
collation = Collation()
# Open the witnesses in a with block so the files are closed after reading
with open( "../fixtures/Woolf/Lighthouse-2/Lighthouse-2-USA.txt", encoding='utf-8' ) as witness_USA, \
    open( "../fixtures/Woolf/Lighthouse-2/Lighthouse-2-UK.txt", encoding='utf-8' ) as witness_UK, \
    open( "../fixtures/Woolf/Lighthouse-2/Lighthouse-2-EM.txt", encoding='utf-8' ) as witness_EM:
    collation.add_plain_witness( "USA", witness_USA.read() )
    collation.add_plain_witness( "UK", witness_UK.read() )
    collation.add_plain_witness( "EM", witness_EM.read() )
# html2 renders a color-coded alignment table in the notebook
alignment_table = collate(collation, output='html2')

If you don’t like the colors, you can use the html output option:


In [ ]:
alignment_table = collate(collation, output='html', layout='vertical')