Python Feature Extraction for Timbl

(C) 2017-2019 by Damir Cavar

Download: This and various other Jupyter notebooks are available from my GitHub repo.

This is a tutorial related to the discussion of feature extraction for classification in the textbook Machine Learning: The Art and Science of Algorithms that Make Sense of Data by Peter Flach.

This tutorial was developed as part of my course material for the course Machine Learning for Computational Linguistics in the Computational Linguistics Program of the Department of Linguistics at Indiana University.

Extracting Features

To extract features from text, we will need a tokenizer. In the following code we load the NLTK word_tokenize function:


In [18]:
from nltk import word_tokenize

In the following code we declare a text variable text1 and set its content to a randomly selected text from a Wikipedia article:


In [19]:
text1 = """The desert tortoises (Gopherus agassizii and Gopherus morafkai) are two species of tortoise
native to the Mojave and Sonoran Deserts of the southwestern United States and northwestern Mexico
and the Sinaloan thornscrub of northwestern Mexico. G. agassizii is distributed in western
Arizona, southeastern California, southern Nevada, and southwestern Utah. The specific name
agassizii is in honor of Swiss-American zoologist Jean Louis Rodolphe Agassiz. Recently, on the
basis of DNA, geographic, and behavioral differences between desert tortoises east and west of
the Colorado River, it was decided that two species of desert tortoises exist: the Agassiz's
desert tortoise (Gopherus agassizii) and Morafka's desert tortoise (Gopherus morafkai). G.
morafkai occurs east of the Colorado River in Arizona, as well as in the states of Sonora and
Sinaloa, Mexico. This species may be a composite of two species.

The new species name is in honor of the late Professor David Joseph Morafka of California State
University, Dominguez Hills, in recognition of his many contributions to the study and
conservation of Gopherus.

The desert tortoises live about 50 to 80 years; they grow slowly and generally have low
reproductive rates. They spend most of their time in burrows, rock shelters, and pallets to
regulate body temperature and reduce water loss. They are most active after seasonal rains and
are inactive during most of the year. This inactivity helps reduce water loss during hot periods,
whereas winter hibernation facilitates survival during freezing temperatures and low food
availability. Desert tortoises can tolerate water, salt, and energy imbalances on a daily basis,
which increases their lifespans."""

The following code converts the text to lowercase and tokenizes it. The tokens are stored in the tokens1 variable.


In [20]:
tokens1 = word_tokenize(text1.lower())

print(tokens1)


['the', 'desert', 'tortoises', '(', 'gopherus', 'agassizii', 'and', 'gopherus', 'morafkai', ')', 'are', 'two', 'species', 'of', 'tortoise', 'native', 'to', 'the', 'mojave', 'and', 'sonoran', 'deserts', 'of', 'the', 'southwestern', 'united', 'states', 'and', 'northwestern', 'mexico', 'and', 'the', 'sinaloan', 'thornscrub', 'of', 'northwestern', 'mexico', '.', 'g.', 'agassizii', 'is', 'distributed', 'in', 'western', 'arizona', ',', 'southeastern', 'california', ',', 'southern', 'nevada', ',', 'and', 'southwestern', 'utah', '.', 'the', 'specific', 'name', 'agassizii', 'is', 'in', 'honor', 'of', 'swiss-american', 'zoologist', 'jean', 'louis', 'rodolphe', 'agassiz', '.', 'recently', ',', 'on', 'the', 'basis', 'of', 'dna', ',', 'geographic', ',', 'and', 'behavioral', 'differences', 'between', 'desert', 'tortoises', 'east', 'and', 'west', 'of', 'the', 'colorado', 'river', ',', 'it', 'was', 'decided', 'that', 'two', 'species', 'of', 'desert', 'tortoises', 'exist', ':', 'the', "agassiz's", 'desert', 'tortoise', '(', 'gopherus', 'agassizii', ')', 'and', 'morafka', "'s", 'desert', 'tortoise', '(', 'gopherus', 'morafkai', ')', '.', 'g.', 'morafkai', 'occurs', 'east', 'of', 'the', 'colorado', 'river', 'in', 'arizona', ',', 'as', 'well', 'as', 'in', 'the', 'states', 'of', 'sonora', 'and', 'sinaloa', ',', 'mexico', '.', 'this', 'species', 'may', 'be', 'a', 'composite', 'of', 'two', 'species', '.', 'the', 'new', 'species', 'name', 'is', 'in', 'honor', 'of', 'the', 'late', 'professor', 'david', 'joseph', 'morafka', 'of', 'california', 'state', 'university', ',', 'dominguez', 'hills', ',', 'in', 'recognition', 'of', 'his', 'many', 'contributions', 'to', 'the', 'study', 'and', 'conservation', 'of', 'gopherus', '.', 'the', 'desert', 'tortoises', 'live', 'about', '50', 'to', '80', 'years', ';', 'they', 'grow', 'slowly', 'and', 'generally', 'have', 'low', 'reproductive', 'rates', '.', 'they', 'spend', 'most', 'of', 'their', 'time', 'in', 'burrows', ',', 'rock', 'shelters', ',', 'and', 
'pallets', 'to', 'regulate', 'body', 'temperature', 'and', 'reduce', 'water', 'loss', '.', 'they', 'are', 'most', 'active', 'after', 'seasonal', 'rains', 'and', 'are', 'inactive', 'during', 'most', 'of', 'the', 'year', '.', 'this', 'inactivity', 'helps', 'reduce', 'water', 'loss', 'during', 'hot', 'periods', ',', 'whereas', 'winter', 'hibernation', 'facilitates', 'survival', 'during', 'freezing', 'temperatures', 'and', 'low', 'food', 'availability', '.', 'desert', 'tortoises', 'can', 'tolerate', 'water', ',', 'salt', ',', 'and', 'energy', 'imbalances', 'on', 'a', 'daily', 'basis', ',', 'which', 'increases', 'their', 'lifespans', '.']

We might want to create a frequency profile from the tokens using Counter from the collections module.


In [21]:
from collections import Counter

We can create a frequency profile from the token-list in the following way:


In [22]:
fp = Counter(tokens1)

print(fp)


Counter({',': 17, 'and': 16, 'of': 16, 'the': 15, '.': 12, 'desert': 7, 'in': 7, 'gopherus': 5, 'species': 5, 'tortoises': 5, 'agassizii': 4, 'to': 4, 'is': 3, ')': 3, 'tortoise': 3, 'during': 3, 'two': 3, 'water': 3, 'are': 3, 'mexico': 3, 'they': 3, 'morafkai': 3, 'most': 3, '(': 3, 'low': 2, 'colorado': 2, 'california': 2, 'a': 2, 'their': 2, 'as': 2, 'loss': 2, 'name': 2, 'honor': 2, 'on': 2, 'g.': 2, 'east': 2, 'river': 2, 'morafka': 2, 'this': 2, 'arizona': 2, 'states': 2, 'southwestern': 2, 'reduce': 2, 'northwestern': 2, 'basis': 2, 'occurs': 1, 'years': 1, 'rates': 1, 'body': 1, 'energy': 1, 'recently': 1, 'nevada': 1, 'differences': 1, 'periods': 1, 'university': 1, 'pallets': 1, 'was': 1, 'hills': 1, 'exist': 1, 'can': 1, 'lifespans': 1, 'seasonal': 1, 'imbalances': 1, 'generally': 1, 'sonora': 1, 'state': 1, 'shelters': 1, 'regulate': 1, 'freezing': 1, 'spend': 1, 'united': 1, 'temperatures': 1, 'year': 1, 'be': 1, 'hot': 1, 'winter': 1, ';': 1, 'southern': 1, 'western': 1, 'his': 1, 'decided': 1, '80': 1, 'native': 1, 'after': 1, "agassiz's": 1, 'dominguez': 1, 'joseph': 1, '50': 1, 'hibernation': 1, 'utah': 1, 'agassiz': 1, 'inactive': 1, 'may': 1, 'time': 1, 'geographic': 1, 'food': 1, 'professor': 1, 'temperature': 1, 'distributed': 1, 'behavioral': 1, 'have': 1, 'deserts': 1, 'zoologist': 1, 'facilitates': 1, 'about': 1, 'sonoran': 1, 'active': 1, 'inactivity': 1, 'daily': 1, 'jean': 1, 'thornscrub': 1, 'whereas': 1, 'new': 1, 'rock': 1, 'dna': 1, 'many': 1, 'sinaloa': 1, 'west': 1, 'reproductive': 1, 'increases': 1, 'rodolphe': 1, 'david': 1, 'burrows': 1, 'tolerate': 1, 'late': 1, 'swiss-american': 1, 'which': 1, 'slowly': 1, 'specific': 1, 'conservation': 1, 'between': 1, 'availability': 1, 'live': 1, 'helps': 1, 'recognition': 1, 'mojave': 1, 'contributions': 1, 'study': 1, 'survival': 1, 'that': 1, 'rains': 1, "'s": 1, 'well': 1, 'louis': 1, ':': 1, 'salt': 1, 'southeastern': 1, 'it': 1, 'composite': 1, 'grow': 1, 'sinaloan': 1})
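The frequency profile above holds absolute counts. As a minimal sketch, assuming a hypothetical toy token list (toy_tokens) rather than the tortoise text, the following shows how Counter.most_common returns the most frequent tokens and how absolute counts could be turned into relative frequencies:

```python
from collections import Counter

# Hypothetical toy token list, used here only for illustration.
toy_tokens = ["the", "desert", "tortoise", "and", "the", "desert", "the"]

toy_fp = Counter(toy_tokens)

# The two most frequent tokens with their absolute counts.
top_two = toy_fp.most_common(2)
print(top_two)

# Relative frequency: each count divided by the total number of tokens.
total = sum(toy_fp.values())
rel_freq = {token: count / total for token, count in toy_fp.items()}
print(rel_freq["the"])
```

Relative frequencies may be preferable as features when texts of different lengths are compared, since absolute counts grow with text size.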

We could create a model that uses the token, its frequency from the frequency profile above, and the length of the token:


In [23]:
model = [ (i, fp[i], len(i)) for i in fp ]

print(model)


[('low', 2, 3), ('occurs', 1, 6), ('colorado', 2, 8), ('.', 12, 1), ('years', 1, 5), ('rates', 1, 5), ('body', 1, 4), ('energy', 1, 6), ('recently', 1, 8), ('is', 3, 2), ('nevada', 1, 6), ('differences', 1, 11), ('california', 2, 10), ('periods', 1, 7), ('university', 1, 10), ('pallets', 1, 7), ('was', 1, 3), ('hills', 1, 5), ('a', 2, 1), ('exist', 1, 5), ('can', 1, 3), ('lifespans', 1, 9), (')', 3, 1), ('seasonal', 1, 8), ('imbalances', 1, 10), ('generally', 1, 9), ('their', 2, 5), ('sonora', 1, 6), ('gopherus', 5, 8), ('state', 1, 5), ('shelters', 1, 8), ('the', 15, 3), ('regulate', 1, 8), ('desert', 7, 6), ('freezing', 1, 8), ('spend', 1, 5), ('united', 1, 6), ('temperatures', 1, 12), ('year', 1, 4), ('be', 1, 2), ('hot', 1, 3), ('winter', 1, 6), (';', 1, 1), ('southern', 1, 8), ('as', 2, 2), ('western', 1, 7), ('loss', 2, 4), ('his', 1, 3), ('decided', 1, 7), ('80', 1, 2), ('native', 1, 6), ('after', 1, 5), ("agassiz's", 1, 9), ('dominguez', 1, 9), ('joseph', 1, 6), ('tortoise', 3, 8), ('50', 1, 2), ('hibernation', 1, 11), ('utah', 1, 4), ('during', 3, 6), ('name', 2, 4), ('agassiz', 1, 7), ('two', 3, 3), ('inactive', 1, 8), ('may', 1, 3), ('time', 1, 4), ('geographic', 1, 10), ('food', 1, 4), ('professor', 1, 9), ('temperature', 1, 11), ('and', 16, 3), ('distributed', 1, 11), ('behavioral', 1, 10), ('have', 1, 4), ('honor', 2, 5), ('on', 2, 2), ('deserts', 1, 7), ('g.', 2, 2), ('zoologist', 1, 9), ('facilitates', 1, 11), ('about', 1, 5), ('sonoran', 1, 7), ('species', 5, 7), ('active', 1, 6), ('east', 2, 4), ('water', 3, 5), ('inactivity', 1, 10), ('daily', 1, 5), ('jean', 1, 4), ('thornscrub', 1, 10), ('whereas', 1, 7), ('new', 1, 3), ('rock', 1, 4), ('dna', 1, 3), ('many', 1, 4), ('sinaloa', 1, 7), ('west', 1, 4), ('reproductive', 1, 12), ('agassizii', 4, 9), ('increases', 1, 9), ('rodolphe', 1, 8), ('david', 1, 5), ('river', 2, 5), ('morafka', 2, 7), ('burrows', 1, 7), ('tolerate', 1, 8), ('this', 2, 4), ('late', 1, 4), ('swiss-american', 1, 14), ('of', 16, 
2), ('which', 1, 5), ('in', 7, 2), ('are', 3, 3), ('arizona', 2, 7), ('tortoises', 5, 9), ('mexico', 3, 6), ('slowly', 1, 6), ('specific', 1, 8), ('states', 2, 6), ('conservation', 1, 12), ('southwestern', 2, 12), ('between', 1, 7), ('reduce', 2, 6), ('availability', 1, 12), ('live', 1, 4), ('helps', 1, 5), ('recognition', 1, 11), ('mojave', 1, 6), ('they', 3, 4), ('contributions', 1, 13), ('morafkai', 3, 8), ('study', 1, 5), ('survival', 1, 8), ('that', 1, 4), ('rains', 1, 5), ("'s", 1, 2), ('well', 1, 4), ('louis', 1, 5), ('most', 3, 4), (':', 1, 1), ('northwestern', 2, 12), (',', 17, 1), ('salt', 1, 4), ('basis', 2, 5), ('southeastern', 1, 12), ('it', 1, 2), ('composite', 1, 9), ('to', 4, 2), ('grow', 1, 4), ('sinaloan', 1, 8), ('(', 3, 1)]

We can now print the model in a tab-delimited format:


In [24]:
for x in model:
    print( "\t".join( (str(x[1]), str(x[2]), x[0]) ) )


2	3	low
1	6	occurs
2	8	colorado
12	1	.
1	5	years
1	5	rates
1	4	body
1	6	energy
1	8	recently
3	2	is
1	6	nevada
1	11	differences
2	10	california
1	7	periods
1	10	university
1	7	pallets
1	3	was
1	5	hills
2	1	a
1	5	exist
1	3	can
1	9	lifespans
3	1	)
1	8	seasonal
1	10	imbalances
1	9	generally
2	5	their
1	6	sonora
5	8	gopherus
1	5	state
1	8	shelters
15	3	the
1	8	regulate
7	6	desert
1	8	freezing
1	5	spend
1	6	united
1	12	temperatures
1	4	year
1	2	be
1	3	hot
1	6	winter
1	1	;
1	8	southern
2	2	as
1	7	western
2	4	loss
1	3	his
1	7	decided
1	2	80
1	6	native
1	5	after
1	9	agassiz's
1	9	dominguez
1	6	joseph
3	8	tortoise
1	2	50
1	11	hibernation
1	4	utah
3	6	during
2	4	name
1	7	agassiz
3	3	two
1	8	inactive
1	3	may
1	4	time
1	10	geographic
1	4	food
1	9	professor
1	11	temperature
16	3	and
1	11	distributed
1	10	behavioral
1	4	have
2	5	honor
2	2	on
1	7	deserts
2	2	g.
1	9	zoologist
1	11	facilitates
1	5	about
1	7	sonoran
5	7	species
1	6	active
2	4	east
3	5	water
1	10	inactivity
1	5	daily
1	4	jean
1	10	thornscrub
1	7	whereas
1	3	new
1	4	rock
1	3	dna
1	4	many
1	7	sinaloa
1	4	west
1	12	reproductive
4	9	agassizii
1	9	increases
1	8	rodolphe
1	5	david
2	5	river
2	7	morafka
1	7	burrows
1	8	tolerate
2	4	this
1	4	late
1	14	swiss-american
16	2	of
1	5	which
7	2	in
3	3	are
2	7	arizona
5	9	tortoises
3	6	mexico
1	6	slowly
1	8	specific
2	6	states
1	12	conservation
2	12	southwestern
1	7	between
2	6	reduce
1	12	availability
1	4	live
1	5	helps
1	11	recognition
1	6	mojave
3	4	they
1	13	contributions
3	8	morafkai
1	5	study
1	8	survival
1	4	that
1	5	rains
1	2	's
1	4	well
1	5	louis
3	4	most
1	1	:
2	12	northwestern
17	1	,
1	4	salt
2	5	basis
1	12	southeastern
1	2	it
1	9	composite
4	2	to
1	4	grow
1	8	sinaloan
3	1	(

One way to manipulate the feature set extracted from text is to remove all stopwords, or to model distributional properties of tokens based on function words (stopwords). For English stopwords, we use the NLTK stopword list. We can append tokens to this list, for example the missing pronoun us:


In [26]:
from nltk.corpus import stopwords

stopw = stopwords.words("english")
stopw.append("us")

print(stopw)


['i', 'me', 'my', 'myself', 'we', 'our', 'ours', 'ourselves', 'you', 'your', 'yours', 'yourself', 'yourselves', 'he', 'him', 'his', 'himself', 'she', 'her', 'hers', 'herself', 'it', 'its', 'itself', 'they', 'them', 'their', 'theirs', 'themselves', 'what', 'which', 'who', 'whom', 'this', 'that', 'these', 'those', 'am', 'is', 'are', 'was', 'were', 'be', 'been', 'being', 'have', 'has', 'had', 'having', 'do', 'does', 'did', 'doing', 'a', 'an', 'the', 'and', 'but', 'if', 'or', 'because', 'as', 'until', 'while', 'of', 'at', 'by', 'for', 'with', 'about', 'against', 'between', 'into', 'through', 'during', 'before', 'after', 'above', 'below', 'to', 'from', 'up', 'down', 'in', 'out', 'on', 'off', 'over', 'under', 'again', 'further', 'then', 'once', 'here', 'there', 'when', 'where', 'why', 'how', 'all', 'any', 'both', 'each', 'few', 'more', 'most', 'other', 'some', 'such', 'no', 'nor', 'not', 'only', 'own', 'same', 'so', 'than', 'too', 'very', 's', 't', 'can', 'will', 'just', 'don', 'should', 'now', 'd', 'll', 'm', 'o', 're', 've', 'y', 'ain', 'aren', 'couldn', 'didn', 'doesn', 'hadn', 'hasn', 'haven', 'isn', 'ma', 'mightn', 'mustn', 'needn', 'shan', 'shouldn', 'wasn', 'weren', 'won', 'wouldn', 'us']
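Checking whether a token occurs in the stopword list is a membership test. On a Python list such a test scans the elements one by one, while a set answers it in constant time on average. The following sketch uses a hypothetical mini list, toy_stopw, in place of the NLTK list:

```python
# Hypothetical mini stopword list standing in for stopwords.words("english").
toy_stopw = ["i", "me", "the", "and", "of"]
toy_stopw.append("us")

# Convert once to a set for fast membership tests.
toy_stopw_set = set(toy_stopw)

print("us" in toy_stopw_set)
print("tortoise" in toy_stopw_set)
```

For a text as short as the one used here the difference is negligible; it may matter when classifying the tokens of a large corpus.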

In the following code we declare a function isStopword that returns 1 if the parameter is a stopword, and 0 otherwise. This way we can create a model of words that uses the frequency and the length of a token as features. The third column provides the class label, that is, 0 for non-stopword and 1 for stopword.


In [27]:
def isStopword(word):
    # Return the class label: 1 for a stopword, 0 for any other token.
    if word in stopw:
        return 1
    return 0

for x in model:
    print( "\t".join( (str(x[1]), str(x[2]), str(isStopword(x[0]))) ) )


2	3	0
1	6	0
2	8	0
12	1	0
1	5	0
1	5	0
1	4	0
1	6	0
1	8	0
3	2	1
1	6	0
1	11	0
2	10	0
1	7	0
1	10	0
1	7	0
1	3	1
1	5	0
2	1	1
1	5	0
1	3	1
1	9	0
3	1	0
1	8	0
1	10	0
1	9	0
2	5	1
1	6	0
5	8	0
1	5	0
1	8	0
15	3	1
1	8	0
7	6	0
1	8	0
1	5	0
1	6	0
1	12	0
1	4	0
1	2	1
1	3	0
1	6	0
1	1	0
1	8	0
2	2	1
1	7	0
2	4	0
1	3	1
1	7	0
1	2	0
1	6	0
1	5	1
1	9	0
1	9	0
1	6	0
3	8	0
1	2	0
1	11	0
1	4	0
3	6	1
2	4	0
1	7	0
3	3	0
1	8	0
1	3	0
1	4	0
1	10	0
1	4	0
1	9	0
1	11	0
16	3	1
1	11	0
1	10	0
1	4	1
2	5	0
2	2	1
1	7	0
2	2	0
1	9	0
1	11	0
1	5	1
1	7	0
5	7	0
1	6	0
2	4	0
3	5	0
1	10	0
1	5	0
1	4	0
1	10	0
1	7	0
1	3	0
1	4	0
1	3	0
1	4	0
1	7	0
1	4	0
1	12	0
4	9	0
1	9	0
1	8	0
1	5	0
2	5	0
2	7	0
1	7	0
1	8	0
2	4	1
1	4	0
1	14	0
16	2	1
1	5	1
7	2	1
3	3	1
2	7	0
5	9	0
3	6	0
1	6	0
1	8	0
2	6	0
1	12	0
2	12	0
1	7	1
2	6	0
1	12	0
1	4	0
1	5	0
1	11	0
1	6	0
3	4	1
1	13	0
3	8	0
1	5	0
1	8	0
1	4	1
1	5	0
1	2	0
1	4	0
1	5	0
3	4	1
1	1	0
2	12	0
17	1	0
1	4	0
2	5	0
1	12	0
1	2	1
1	9	0
4	2	1
1	4	0
1	8	0
3	1	0

You can copy and paste the model above and use it directly in Timbl.
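Instead of copying and pasting, the instances could also be written to a file. The following is a minimal sketch under stated assumptions: toy_tokens and toy_stopwords are hypothetical stand-ins for tokens1 and stopw, format_instance is a helper introduced here for illustration, and the file name tokens.train is made up:

```python
import tempfile
from collections import Counter
from pathlib import Path

# Hypothetical toy data; in the notebook these would be tokens1 and stopw.
toy_tokens = ["the", "desert", "tortoise", "and", "the", "desert", "the"]
toy_stopwords = {"the", "and"}

toy_fp = Counter(toy_tokens)

def format_instance(token, freq, stopword_set):
    """Return one tab-delimited line: frequency, length, class label."""
    label = 1 if token in stopword_set else 0
    return "\t".join((str(freq), str(len(token)), str(label)))

lines = [format_instance(tok, freq, toy_stopwords)
         for tok, freq in toy_fp.items()]

# "tokens.train" is a hypothetical file name, placed in the temp directory.
out_path = Path(tempfile.gettempdir()) / "tokens.train"
out_path.write_text("\n".join(lines) + "\n", encoding="utf-8")
print(out_path.read_text(encoding="utf-8"))
```

The class label is kept in the last column, as in the output above.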

In the next part we will make use of part-of-speech tags for classification. In order to use the PoS-tagger in NLTK, we need to import the pos_tag function:


In [28]:
from nltk import pos_tag

We can use the token-list from the text above and PoS-tag each token. In the code below we abbreviate the tag to the initial character only, which marks up the main part-of-speech class, ignoring feature details.


In [29]:
tokens1 = word_tokenize(text1.lower())
posTokens = [ (x[0], x[1][0]) for x in pos_tag(tokens1) ]
print(posTokens)


[('the', 'D'), ('desert', 'N'), ('tortoises', 'N'), ('(', '('), ('gopherus', 'N'), ('agassizii', 'N'), ('and', 'C'), ('gopherus', 'N'), ('morafkai', 'N'), (')', ')'), ('are', 'V'), ('two', 'C'), ('species', 'N'), ('of', 'I'), ('tortoise', 'N'), ('native', 'N'), ('to', 'T'), ('the', 'D'), ('mojave', 'N'), ('and', 'C'), ('sonoran', 'N'), ('deserts', 'N'), ('of', 'I'), ('the', 'D'), ('southwestern', 'J'), ('united', 'J'), ('states', 'N'), ('and', 'C'), ('northwestern', 'J'), ('mexico', 'N'), ('and', 'C'), ('the', 'D'), ('sinaloan', 'N'), ('thornscrub', 'N'), ('of', 'I'), ('northwestern', 'J'), ('mexico', 'N'), ('.', '.'), ('g.', 'N'), ('agassizii', 'N'), ('is', 'V'), ('distributed', 'V'), ('in', 'I'), ('western', 'J'), ('arizona', 'N'), (',', ','), ('southeastern', 'J'), ('california', 'N'), (',', ','), ('southern', 'J'), ('nevada', 'N'), (',', ','), ('and', 'C'), ('southwestern', 'J'), ('utah', 'N'), ('.', '.'), ('the', 'D'), ('specific', 'J'), ('name', 'N'), ('agassizii', 'N'), ('is', 'V'), ('in', 'I'), ('honor', 'N'), ('of', 'I'), ('swiss-american', 'J'), ('zoologist', 'N'), ('jean', 'N'), ('louis', 'N'), ('rodolphe', 'N'), ('agassiz', 'N'), ('.', '.'), ('recently', 'R'), (',', ','), ('on', 'I'), ('the', 'D'), ('basis', 'N'), ('of', 'I'), ('dna', 'N'), (',', ','), ('geographic', 'J'), (',', ','), ('and', 'C'), ('behavioral', 'J'), ('differences', 'N'), ('between', 'I'), ('desert', 'N'), ('tortoises', 'N'), ('east', 'R'), ('and', 'C'), ('west', 'J'), ('of', 'I'), ('the', 'D'), ('colorado', 'N'), ('river', 'N'), (',', ','), ('it', 'P'), ('was', 'V'), ('decided', 'V'), ('that', 'I'), ('two', 'C'), ('species', 'N'), ('of', 'I'), ('desert', 'N'), ('tortoises', 'N'), ('exist', 'V'), (':', ':'), ('the', 'D'), ("agassiz's", 'N'), ('desert', 'N'), ('tortoise', 'N'), ('(', '('), ('gopherus', 'J'), ('agassizii', 'N'), (')', ')'), ('and', 'C'), ('morafka', '$'), ("'s", 'P'), ('desert', 'N'), ('tortoise', 'N'), ('(', '('), ('gopherus', 'J'), ('morafkai', 'N'), (')', ')'), ('.', 
'.'), ('g.', 'N'), ('morafkai', 'N'), ('occurs', 'V'), ('east', 'N'), ('of', 'I'), ('the', 'D'), ('colorado', 'N'), ('river', 'N'), ('in', 'I'), ('arizona', 'N'), (',', ','), ('as', 'R'), ('well', 'R'), ('as', 'I'), ('in', 'I'), ('the', 'D'), ('states', 'N'), ('of', 'I'), ('sonora', 'N'), ('and', 'C'), ('sinaloa', 'N'), (',', ','), ('mexico', 'N'), ('.', '.'), ('this', 'D'), ('species', 'N'), ('may', 'M'), ('be', 'V'), ('a', 'D'), ('composite', 'J'), ('of', 'I'), ('two', 'C'), ('species', 'N'), ('.', '.'), ('the', 'D'), ('new', 'J'), ('species', 'N'), ('name', 'N'), ('is', 'V'), ('in', 'I'), ('honor', 'N'), ('of', 'I'), ('the', 'D'), ('late', 'J'), ('professor', 'N'), ('david', 'N'), ('joseph', 'N'), ('morafka', 'N'), ('of', 'I'), ('california', 'N'), ('state', 'N'), ('university', 'N'), (',', ','), ('dominguez', 'N'), ('hills', 'N'), (',', ','), ('in', 'I'), ('recognition', 'N'), ('of', 'I'), ('his', 'P'), ('many', 'J'), ('contributions', 'N'), ('to', 'T'), ('the', 'D'), ('study', 'N'), ('and', 'C'), ('conservation', 'N'), ('of', 'I'), ('gopherus', 'N'), ('.', '.'), ('the', 'D'), ('desert', 'N'), ('tortoises', 'V'), ('live', 'V'), ('about', 'I'), ('50', 'C'), ('to', 'T'), ('80', 'C'), ('years', 'N'), (';', ':'), ('they', 'P'), ('grow', 'V'), ('slowly', 'R'), ('and', 'C'), ('generally', 'R'), ('have', 'V'), ('low', 'J'), ('reproductive', 'J'), ('rates', 'N'), ('.', '.'), ('they', 'P'), ('spend', 'V'), ('most', 'J'), ('of', 'I'), ('their', 'P'), ('time', 'N'), ('in', 'I'), ('burrows', 'N'), (',', ','), ('rock', 'N'), ('shelters', 'N'), (',', ','), ('and', 'C'), ('pallets', 'N'), ('to', 'T'), ('regulate', 'V'), ('body', 'N'), ('temperature', 'N'), ('and', 'C'), ('reduce', 'V'), ('water', 'N'), ('loss', 'N'), ('.', '.'), ('they', 'P'), ('are', 'V'), ('most', 'R'), ('active', 'J'), ('after', 'I'), ('seasonal', 'J'), ('rains', 'N'), ('and', 'C'), ('are', 'V'), ('inactive', 'J'), ('during', 'I'), ('most', 'J'), ('of', 'I'), ('the', 'D'), ('year', 'N'), ('.', '.'), 
('this', 'D'), ('inactivity', 'N'), ('helps', 'V'), ('reduce', 'V'), ('water', 'N'), ('loss', 'N'), ('during', 'I'), ('hot', 'J'), ('periods', 'N'), (',', ','), ('whereas', 'N'), ('winter', 'V'), ('hibernation', 'N'), ('facilitates', 'N'), ('survival', 'V'), ('during', 'I'), ('freezing', 'V'), ('temperatures', 'N'), ('and', 'C'), ('low', 'J'), ('food', 'N'), ('availability', 'N'), ('.', '.'), ('desert', 'J'), ('tortoises', 'N'), ('can', 'M'), ('tolerate', 'V'), ('water', 'N'), (',', ','), ('salt', 'N'), (',', ','), ('and', 'C'), ('energy', 'N'), ('imbalances', 'N'), ('on', 'I'), ('a', 'D'), ('daily', 'J'), ('basis', 'N'), (',', ','), ('which', 'W'), ('increases', 'V'), ('their', 'P'), ('lifespans', 'N'), ('.', '.')]
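Reducing a tag to its initial character also merges some classes that are otherwise distinct. In the output above, for example, and and two both receive C, and it and 's both receive P. The sketch below illustrates this with hand-written (token, tag) pairs. The full tags CC, CD, PRP, POS, NN, and NNS are Penn Treebank tags and are an assumption about what the tagger assigned before abbreviation; the list is not tagger output:

```python
from collections import defaultdict

# Hand-written (token, tag) pairs with Penn Treebank tags; hypothetical
# example data, not produced by running the tagger.
tagged = [("and", "CC"), ("two", "CD"), ("it", "PRP"), ("'s", "POS"),
          ("tortoise", "NN"), ("tortoises", "NNS")]

# Group the full tags by their first character.
collapsed = defaultdict(set)
for token, tag in tagged:
    collapsed[tag[0]].add(tag)

for initial in sorted(collapsed):
    print(initial, sorted(collapsed[initial]))
```

If such merged classes are a problem for the classification task, a hand-written mapping from full tags to coarse classes would be one alternative to truncation.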

To map tags onto numerical feature vectors, we need the list of all tags used in the language data above. We apply a list comprehension to posTokens, keeping only the tag element of each tuple. We convert the resulting list of tags into a set to remove duplicates, and then convert the set back into a list so that every tag has a fixed index position. Note that the order of the tags in this list is arbitrary.


In [30]:
tags = list( set( [ x[1] for x in posTokens ] ) )
print(tags)


['P', ')', 'C', '.', 'D', 'R', ',', 'J', ':', 'W', 'N', 'I', 'V', 'M', '$', 'T', '(']
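Because a set has no defined order, the order of the tags list can differ between runs, and with it the column order of any feature vector built from it. As a minimal sketch, assuming a hypothetical list toy_tags in place of the tags from posTokens, sorting the set gives a reproducible order, and a dictionary (tag_index, a name introduced here) maps each tag to its column:

```python
# Hypothetical abbreviated tags, standing in for the tag element of posTokens.
toy_tags = ["D", "N", "N", "V", "D", "J", "N", "."]

# sorted() yields the same order on every run, unlike list(set(...)).
tag_list = sorted(set(toy_tags))

# Map each tag to its position in the feature vector.
tag_index = {tag: i for i, tag in enumerate(tag_list)}

print(tag_list)
print(tag_index["N"])
```

A fixed column order matters when training and test data are generated in separate runs.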

To keep track of the frequencies of all tags to the left and right of any given token, we create two dictionaries, leftOfToken and rightOfToken, that have Counter objects as values. The keys of these top-level dictionaries are tuples consisting of a token and its tag. The keys of each Counter are tags, and the associated values are frequencies. leftOfToken counts the tags to the immediate left of a token, and rightOfToken counts the tags to the immediate right of a token.


In [31]:
from collections import defaultdict

leftOfToken = defaultdict(Counter)
rightOfToken = defaultdict(Counter)

lenPosTokens = len(posTokens)

for i in range(lenPosTokens):
    token = posTokens[i]
    if i > 0:
        ltag = posTokens[i - 1][1]
        leftOfToken[token][ltag] += 1
    if i < lenPosTokens - 1:
        rtag = posTokens[i + 1][1]
        rightOfToken[token][rtag] += 1

We can verify that the frequency tables have been created correctly:


In [32]:
print(leftOfToken)


defaultdict(<class 'collections.Counter'>, {('g.', 'N'): Counter({'.': 2}), ('rodolphe', 'N'): Counter({'N': 1}), ('survival', 'V'): Counter({'N': 1}), ('periods', 'N'): Counter({'J': 1}), ('their', 'P'): Counter({'I': 1, 'V': 1}), ('east', 'R'): Counter({'N': 1}), ('gopherus', 'J'): Counter({'(': 2}), ('contributions', 'N'): Counter({'J': 1}), ('species', 'N'): Counter({'C': 3, 'D': 1, 'J': 1}), ('daily', 'J'): Counter({'D': 1}), ('desert', 'J'): Counter({'.': 1}), ('loss', 'N'): Counter({'N': 2}), ('the', 'D'): Counter({'I': 7, '.': 3, 'T': 2, ':': 1, 'C': 1}), ('about', 'I'): Counter({'V': 1}), ('conservation', 'N'): Counter({'C': 1}), ('and', 'C'): Counter({'N': 9, ',': 4, 'R': 2, ')': 1}), ('utah', 'N'): Counter({'J': 1}), ('well', 'R'): Counter({'R': 1}), ('salt', 'N'): Counter({',': 1}), ('morafka', 'N'): Counter({'N': 1}), ('tortoises', 'V'): Counter({'N': 1}), ('regulate', 'V'): Counter({'T': 1}), ('honor', 'N'): Counter({'I': 2}), ('is', 'V'): Counter({'N': 3}), ('state', 'N'): Counter({'N': 1}), ('recently', 'R'): Counter({'.': 1}), ('agassiz', 'N'): Counter({'N': 1}), ('study', 'N'): Counter({'D': 1}), ('two', 'C'): Counter({'I': 2, 'V': 1}), ('on', 'I'): Counter({',': 1, 'N': 1}), ('they', 'P'): Counter({'.': 2, ':': 1}), ('as', 'I'): Counter({'R': 1}), ('jean', 'N'): Counter({'N': 1}), ('thornscrub', 'N'): Counter({'N': 1}), ('many', 'J'): Counter({'P': 1}), ('this', 'D'): Counter({'.': 2}), ('water', 'N'): Counter({'V': 3}), ('have', 'V'): Counter({'R': 1}), ('may', 'M'): Counter({'N': 1}), ('basis', 'N'): Counter({'D': 1, 'J': 1}), ("'s", 'P'): Counter({'$': 1}), ('sinaloa', 'N'): Counter({'C': 1}), (':', ':'): Counter({'V': 1}), ('be', 'V'): Counter({'M': 1}), ('tortoise', 'N'): Counter({'N': 2, 'I': 1}), ('80', 'C'): Counter({'T': 1}), ('united', 'J'): Counter({'J': 1}), ('colorado', 'N'): Counter({'D': 2}), ('east', 'N'): Counter({'V': 1}), ('gopherus', 'N'): Counter({'I': 1, '(': 1, 'C': 1}), ('slowly', 'R'): Counter({'V': 1}), ('of', 'I'): 
Counter({'N': 12, 'J': 4}), ('geographic', 'J'): Counter({',': 1}), ('west', 'J'): Counter({'C': 1}), ('southern', 'J'): Counter({',': 1}), ('louis', 'N'): Counter({'N': 1}), ('to', 'T'): Counter({'N': 3, 'C': 1}), ('distributed', 'V'): Counter({'V': 1}), ('pallets', 'N'): Counter({'C': 1}), ('50', 'C'): Counter({'I': 1}), ('was', 'V'): Counter({'P': 1}), ('zoologist', 'N'): Counter({'J': 1}), ('differences', 'N'): Counter({'J': 1}), ('dominguez', 'N'): Counter({',': 1}), ('mexico', 'N'): Counter({'J': 2, ',': 1}), ('swiss-american', 'J'): Counter({'I': 1}), ('hot', 'J'): Counter({'I': 1}), ('behavioral', 'J'): Counter({'C': 1}), ('generally', 'R'): Counter({'C': 1}), ('river', 'N'): Counter({'N': 2}), ('new', 'J'): Counter({'D': 1}), ('morafkai', 'N'): Counter({'N': 2, 'J': 1}), ('rains', 'N'): Counter({'J': 1}), ('availability', 'N'): Counter({'N': 1}), ('states', 'N'): Counter({'D': 1, 'J': 1}), ('it', 'P'): Counter({',': 1}), ('which', 'W'): Counter({',': 1}), ('winter', 'V'): Counter({'N': 1}), ('professor', 'N'): Counter({'J': 1}), ('inactivity', 'N'): Counter({'D': 1}), ('david', 'N'): Counter({'N': 1}), ('between', 'I'): Counter({'N': 1}), ('.', '.'): Counter({'N': 11, ')': 1}), ('after', 'I'): Counter({'J': 1}), ('inactive', 'J'): Counter({'V': 1}), ('native', 'N'): Counter({'N': 1}), ('sonoran', 'N'): Counter({'C': 1}), ('morafka', '$'): Counter({'C': 1}), ('body', 'N'): Counter({'V': 1}), ('specific', 'J'): Counter({'D': 1}), ('low', 'J'): Counter({'V': 1, 'C': 1}), ('southwestern', 'J'): Counter({'D': 1, 'C': 1}), ('reduce', 'V'): Counter({'V': 1, 'C': 1}), ('during', 'I'): Counter({'V': 1, 'J': 1, 'N': 1}), ('composite', 'J'): Counter({'D': 1}), ('joseph', 'N'): Counter({'N': 1}), ('spend', 'V'): Counter({'P': 1}), ('live', 'V'): Counter({'V': 1}), ('california', 'N'): Counter({'I': 1, 'J': 1}), ('whereas', 'N'): Counter({',': 1}), ('lifespans', 'N'): Counter({'P': 1}), ('mojave', 'N'): Counter({'D': 1}), ('(', '('): Counter({'N': 3}), ('temperatures', 
'N'): Counter({'V': 1}), ('sonora', 'N'): Counter({'I': 1}), ('occurs', 'V'): Counter({'N': 1}), ('most', 'R'): Counter({'V': 1}), ('reproductive', 'J'): Counter({'J': 1}), ('his', 'P'): Counter({'I': 1}), ('late', 'J'): Counter({'D': 1}), ('recognition', 'N'): Counter({'I': 1}), ('a', 'D'): Counter({'I': 1, 'V': 1}), ('university', 'N'): Counter({'N': 1}), ('hibernation', 'N'): Counter({'V': 1}), ('energy', 'N'): Counter({'C': 1}), ('arizona', 'N'): Counter({'I': 1, 'J': 1}), ('name', 'N'): Counter({'J': 1, 'N': 1}), ('increases', 'V'): Counter({'W': 1}), ('freezing', 'V'): Counter({'I': 1}), ('most', 'J'): Counter({'I': 1, 'V': 1}), ('are', 'V'): Counter({'P': 1, ')': 1, 'C': 1}), ('active', 'J'): Counter({'R': 1}), ('hills', 'N'): Counter({'N': 1}), ('exist', 'V'): Counter({'N': 1}), ('southeastern', 'J'): Counter({',': 1}), ('helps', 'V'): Counter({'N': 1}), ('sinaloan', 'N'): Counter({'D': 1}), ('nevada', 'N'): Counter({'J': 1}), ('burrows', 'N'): Counter({'I': 1}), ('desert', 'N'): Counter({'D': 2, 'I': 2, 'P': 1, 'N': 1}), ('decided', 'V'): Counter({'V': 1}), ('time', 'N'): Counter({'P': 1}), ('dna', 'N'): Counter({'I': 1}), ('imbalances', 'N'): Counter({'N': 1}), ('facilitates', 'N'): Counter({'N': 1}), ('tolerate', 'V'): Counter({'M': 1}), ('years', 'N'): Counter({'C': 1}), ('year', 'N'): Counter({'D': 1}), ('rock', 'N'): Counter({',': 1}), (')', ')'): Counter({'N': 3}), ('northwestern', 'J'): Counter({'I': 1, 'C': 1}), (',', ','): Counter({'N': 15, 'J': 1, 'R': 1}), ('in', 'I'): Counter({'V': 3, 'N': 2, 'I': 1, ',': 1}), ('as', 'R'): Counter({',': 1}), ('rates', 'N'): Counter({'J': 1}), ('western', 'J'): Counter({'I': 1}), ('temperature', 'N'): Counter({'N': 1}), ('seasonal', 'J'): Counter({'I': 1}), ('shelters', 'N'): Counter({'N': 1}), ('that', 'I'): Counter({'V': 1}), ('agassizii', 'N'): Counter({'N': 3, 'J': 1}), ("agassiz's", 'N'): Counter({'D': 1}), ('deserts', 'N'): Counter({'N': 1}), (';', ':'): Counter({'N': 1}), ('food', 'N'): Counter({'J': 1}), 
('tortoises', 'N'): Counter({'N': 3, 'J': 1}), ('grow', 'V'): Counter({'P': 1}), ('can', 'M'): Counter({'N': 1})})
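The same counting scheme can be checked by hand on a very short sequence. The following sketch is an alternative formulation under stated assumptions: toy_pos is a hypothetical tagged sequence standing in for posTokens, and the loop pairs neighbours with zip instead of using index arithmetic:

```python
from collections import Counter, defaultdict

# Hypothetical tagged sequence, standing in for posTokens.
toy_pos = [("the", "D"), ("tortoise", "N"), ("eats", "V"),
           ("the", "D"), ("cactus", "N"), (".", ".")]

left_of = defaultdict(Counter)
right_of = defaultdict(Counter)

# zip pairs each item with its right neighbour; the first item has no left
# neighbour and the last has no right neighbour, so no index checks are needed.
for prev, cur in zip(toy_pos, toy_pos[1:]):
    left_of[cur][prev[1]] += 1
    right_of[prev][cur[1]] += 1

print(left_of[("the", "D")])
print(right_of[("the", "D")])
```

Here the determiner the is preceded once by a verb (its first occurrence has no left neighbour) and followed twice by a noun.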

To generate a model, we create a numerical vector with the frequencies of the tags left and right of each token. Within each half of the vector, the frequencies are ordered by the position of the tag in the tags list created above. The left half of the vector holds the frequencies of the tags to the left of a token, the right half the frequencies of the tags to the right. The final element of each row is the tag of the token that is represented by the tag-context frequency vector:


In [33]:
for token in leftOfToken.keys():
    leftVector = []
    rightVector = []
    for tag in tags:
        leftVector.append(leftOfToken[token][tag])
        rightVector.append(rightOfToken[token][tag])
    print(" ".join([ str(x) for x in leftVector ]), " ".join([ str(x) for x in rightVector ]), token[1])


0 0 0 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 0 0 0 0 0 0 N
0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 N
0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 V
0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 N
0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 0 0 0 0 0 0 P
0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 R
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 0 0 0 0 0 0 0 0 0 0 2 0 0 0 0 0 0 J
0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 N
0 0 3 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 2 0 1 0 0 0 N
0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 J
0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 J
0 0 0 0 0 0 0 0 0 0 2 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 1 0 0 0 0 0 N
0 0 1 3 0 0 0 0 1 0 0 7 0 0 0 2 0 0 0 0 0 0 0 0 4 0 0 11 0 0 0 0 0 0 D
0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 I
0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 N
0 1 0 0 0 2 4 0 0 0 9 0 0 0 0 0 0 0 0 0 0 1 1 0 5 0 0 6 0 2 0 1 0 0 C
0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 N
0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 R
0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 N
0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 N
0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 V
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 V
0 0 0 0 0 0 0 0 0 0 0 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 0 0 0 0 0 N
0 0 0 0 0 0 0 0 0 0 3 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 1 0 0 0 0 V
0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 N
0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 R
0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 N
0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 N
0 0 0 0 0 0 0 0 0 0 0 2 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 3 0 0 0 0 0 0 C
0 0 0 0 0 0 1 0 0 0 1 0 0 0 0 0 0 0 0 0 0 2 0 0 0 0 0 0 0 0 0 0 0 0 I
0 0 0 2 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 3 0 0 0 0 P
0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 I
0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 N
0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 N
1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 J
0 0 0 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 0 0 0 0 0 0 D
0 0 0 0 0 0 0 0 0 0 0 0 3 0 0 0 0 0 0 0 0 0 0 1 0 0 0 2 0 0 0 0 0 0 N
0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 V
0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 M
0 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 1 0 0 0 0 0 N
0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 P
0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 N
0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 :
0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 V
0 0 0 0 0 0 0 0 0 0 2 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 2 N
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 C
0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 J
0 0 0 0 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 0 0 0 0 0 0 N
0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 N
0 0 1 0 0 0 0 0 0 0 0 1 0 0 0 0 1 0 0 0 1 0 0 0 0 0 0 2 0 0 0 0 0 0 N
0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 R
0 0 0 0 0 0 0 4 0 0 12 0 0 0 0 0 0 2 0 1 0 5 0 0 2 0 0 6 0 0 0 0 0 0 I
0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 J
0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 J
0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 J
0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 N
0 0 1 0 0 0 0 0 0 0 3 0 0 0 0 0 0 0 0 1 0 2 0 0 0 0 0 0 0 1 0 0 0 0 T
0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 V
0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 N
0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 C
1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 V
0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 N
0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 N
0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 N
0 0 0 0 0 0 1 2 0 0 0 0 0 0 0 0 0 0 0 1 2 0 0 0 0 0 0 0 0 0 0 0 0 0 N
0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 J
0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 J
0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 J
0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 R
0 0 0 0 0 0 0 0 0 0 2 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 1 0 0 0 0 0 N
0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 J
0 0 0 0 0 0 0 1 0 0 2 0 0 0 0 0 0 0 2 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 N
0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 N
0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 N
0 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1 0 0 0 0 0 N
0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 P
0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 W
0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 V
0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 N
0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 N
0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 N
0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 I
0 1 0 0 0 0 0 0 0 0 11 0 0 0 0 0 0 2 0 0 0 5 1 0 1 0 0 2 0 0 0 0 0 0 .
0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 I
0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 J
0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 N
0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 N
0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 $
0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 N
0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 J
0 0 1 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 J
0 0 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 J
0 0 1 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 0 0 0 0 0 0 V
0 0 0 0 0 0 0 1 0 0 1 0 1 0 0 0 0 0 0 0 0 0 0 0 2 0 0 0 0 1 0 0 0 0 I
0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 J
0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 N
1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 V
0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 V
0 0 0 0 0 0 0 1 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 1 0 0 0 0 0 0 N
0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 N
1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 N
0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 N
0 0 0 0 0 0 0 0 0 0 3 0 0 0 0 0 0 0 0 0 0 0 0 0 2 0 0 1 0 0 0 0 0 0 (
0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 N
0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 N
0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 V
0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 R
0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 J
0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 P
0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 J
0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 N
0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 2 0 0 0 0 0 0 0 0 0 D
0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 N
0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 N
0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 N
0 0 0 0 0 0 0 1 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 2 0 0 0 0 0 0 0 0 0 0 N
0 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0 N
0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 V
0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 V
0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 0 0 0 0 0 J
1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 1 0 1 0 0 0 0 0 0 0 0 0 V
0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 J
0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 N
0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 V
0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 J
0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 V
0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 N
0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 N
0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 N
1 0 0 0 2 0 0 0 0 0 1 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 5 0 1 0 0 0 0 N
0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 V
1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 N
0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 N
0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 N
0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 N
0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 V
0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 N
0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 N
0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 N
0 0 0 0 0 0 0 0 0 0 3 0 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 1 0 0 0 0 )
0 0 1 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 0 0 0 0 0 0 J
0 0 0 0 0 1 0 1 0 0 15 0 0 0 0 0 0 1 0 4 0 0 1 0 3 0 1 5 2 0 0 0 0 0 ,
0 0 0 0 0 0 1 0 0 0 2 1 3 0 0 0 0 0 0 0 0 1 0 0 1 0 0 5 0 0 0 0 0 0 I
0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 R
0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 N
0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 J
0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 N
0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 J
0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 N
0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 I
0 0 0 0 0 0 0 1 0 0 3 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 2 0 0 0 0 N
0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 N
0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 N
0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 :
0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 N
0 0 0 0 0 0 0 1 0 0 3 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 1 0 0 1 N
1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 V
0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 M

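The model above can be used as a training model with Timbl. Each line is one instance: the counts of the tags found to the left of the token, the counts of the tags found to the right of it, and, in the last column, the class label (the first character of the token's part-of-speech tag). Copy and paste it into a training file. In the following we create a second model from another randomly selected Wikipedia text, using the same procedure as above.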


In [35]:
text2 = """Desert tortoises spend most of their lives in burrows, rock
shelters, and pallets to regulate body temperature and reduce water loss.
Burrows are tunnels dug into soil by desert tortoises or other animals,
rock shelters are spaces protected by rocks and/or boulders, and pallets
are depressions in the soil. The use of the various shelter types is
related to their availability and climate. The number of burrows used,
the extent of repetitive use, and the occurrence of burrow sharing are
variable. Males tend to occupy deeper burrows than females. Seasonal
trends in burrow use are influenced by desert tortoise gender and
regional variation. Desert tortoise shelter sites are often associated
with plant or rock cover. Desert tortoises often lay their eggs in
nests dug in sufficiently deep soil at the entrance of burrows or
under shrubs. Nests are typically 3 to 10 inches (8–25 cm) deep.

Shelters are important for controlling body temperature and water
regulation, as they allow desert tortoises to slow their rate of heating
in summer and provide protection from cold during the winter. The
humidity within burrows prevents dehydration. Burrows also provide
protection from predators. The availability of adequate burrow sites
influences desert tortoise densities.

The number of burrows used by desert tortoises varies spatially and
temporally, from about 5 to 25 per year. Some burrows are used repeatedly,
sometimes for several consecutive years. Desert tortoises share burrows
with various mammals, reptiles, birds, and invertebrates, such as
white-tailed antelope squirrels (Ammospermophilus leucurus), woodrats
(Neotoma), collared peccaries (Pecari tajacu), burrowing owls (Athene
cunicularia), Gambel's quail (Callipepla gambelii), rattlesnakes (Crotalus
spp.), Gila monsters (Heloderma suspectum), beetles, spiders, and
scorpions. One burrow can host up to 23 desert tortoises – such sharing
is more common for desert tortoises of opposite sexes than for desert
tortoises of the same sex.
"""

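# pos_tag, defaultdict, Counter, and the tag list `tags` are reused from the cells above.
tokens2 = word_tokenize(text2.lower())
# Keep only the first character of each part-of-speech tag (e.g. NN and NNS both become N).
posTokensB = [ (token[0], token[1][0] ) for token in pos_tag(tokens2) ]

# Counts of the tags seen immediately to the left and right of each (token, tag) pair.
leftOfToken2 = defaultdict(Counter)
rightOfToken2 = defaultdict(Counter)

lenPosTokensB = len(posTokensB)

for i in range(lenPosTokensB):
    token = posTokensB[i]
    if i > 0:
        ltag = posTokensB[i - 1][1]
        leftOfToken2[token][ltag] += 1
    if i < lenPosTokensB - 1:
        rtag = posTokensB[i + 1][1]
        rightOfToken2[token][rtag] += 1

# One line per (token, tag) pair: left counts, right counts, class label.
# Using the same `tags` list as above keeps the feature columns in the same
# order as in the training model; tags that are not in `tags` are not counted.
for token in leftOfToken2.keys():
    leftVector = []
    rightVector = []
    for tag in tags:
        leftVector.append(leftOfToken2[token][tag])
        rightVector.append(rightOfToken2[token][tag])
    print(" ".join([ str(x) for x in leftVector ]), " ".join([ str(x) for x in rightVector ]), token[1])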


0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 C
0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 N
0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 0 0 0 0 0 0 V
0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 N
0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 N
0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 N
1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 N
0 0 0 0 0 1 5 0 0 0 5 0 0 0 0 0 0 0 0 0 0 1 1 0 1 0 0 6 0 2 0 0 0 0 C
0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 N
0 0 0 0 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 0 0 0 0 0 N
0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 V
0 0 0 0 0 0 0 0 0 0 2 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0 V
0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 N
0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 R
0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 R
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 V
0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 J
0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 P
0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 V
0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 J
0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 N
0 0 0 0 0 0 0 0 0 0 2 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 3 0 0 0 0 0 0 N
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 C
0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 N
0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 M
0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 N
1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 V
0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 J
0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 R
0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 N
0 0 2 0 0 1 0 0 0 0 2 0 2 0 0 0 0 1 0 3 0 0 0 0 0 0 0 0 0 3 0 0 0 0 T
0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 N
0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 N
0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 N
0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 J
0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 J
0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 J
0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 J
0 0 0 1 0 0 0 0 0 0 0 2 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 4 0 0 0 0 0 0 J
0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 R
0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 N
0 0 0 0 0 0 0 0 0 0 0 0 2 0 0 0 0 0 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 I
0 0 0 0 0 0 1 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 0 J
0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 I
0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 C
0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 N
0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 V
0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 N
0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 N
0 0 0 0 0 0 1 0 0 0 2 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 2 0 0 0 0 0 0 I
0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 N
0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 N
0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 N
0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 V
0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 N
0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 V
0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 R
0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 N
0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 J
0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 N
0 0 0 0 0 0 0 0 0 0 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0 N
0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 V
0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 I
0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 V
0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 N
0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 N
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 V
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 J
0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 J
0 0 0 0 0 0 0 0 0 0 8 0 1 0 0 0 0 0 0 0 0 0 2 0 3 0 0 2 0 2 0 0 0 0 V
0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 R
0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 V
0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 J
0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 D
0 0 0 0 0 0 0 0 0 0 0 1 2 0 0 1 0 0 0 0 0 0 0 0 0 0 0 4 0 0 0 0 0 0 P
0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 N
0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 N
0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 N
0 0 0 1 0 0 0 0 0 0 7 0 0 0 0 0 0 0 0 0 0 0 0 7 0 0 0 1 0 0 0 0 0 0 )
0 0 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 1 0 N
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 V
0 0 0 0 0 0 0 0 0 0 2 0 0 0 0 0 0 0 0 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 N
0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 J
0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 N
0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 J
0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 N
0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 N
0 0 0 0 0 0 0 0 0 0 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 0 I
0 0 1 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 0 0 0 0 0 0 N
0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 N
0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 C
0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 I
0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 N
0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 R
0 0 1 5 0 0 1 0 0 0 0 5 0 0 0 0 0 0 0 0 0 0 0 0 2 0 0 10 0 0 0 0 0 0 D
0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 V
0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 N
0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 V
0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 R
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 V
0 0 0 0 0 0 0 0 0 0 1 0 3 0 0 0 0 0 0 0 0 0 0 0 0 0 0 4 0 0 0 0 0 0 I
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 J
0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 I
0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 N
0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 N
0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 P
0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 N
0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 R
0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 I
0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 N
0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 V
0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 N
0 0 0 0 1 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 1 1 0 0 0 0 N
0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 N
0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 N
0 0 0 0 0 0 0 0 0 0 3 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 1 1 0 0 0 0 0 C
1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 N
0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 I
0 0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 0 0 0 0 R
0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 N
0 0 0 0 0 0 0 1 0 0 10 0 0 0 0 0 0 1 0 0 0 2 0 0 3 0 0 5 0 0 0 0 0 0 I
0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 N
0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 J
0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 N
0 0 0 0 0 0 0 0 0 0 0 0 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 0 0 0 0 0 0 N
0 0 0 1 0 0 0 0 0 0 2 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 2 0 0 0 0 N
0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 N
0 7 0 0 0 2 0 0 0 0 12 0 1 0 0 0 0 0 0 5 0 1 1 0 3 0 0 9 2 1 0 0 0 0 ,
0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 V
0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 J
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 J
0 0 0 0 0 0 0 0 0 0 2 0 1 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 1 0 0 0 0 0 V
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 J
0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 J
0 0 0 0 0 0 0 1 0 0 17 0 0 0 0 0 0 0 1 1 0 6 0 0 2 0 0 7 0 0 0 0 0 0 .
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 C
0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 J
0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 R
0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 N
0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 N
0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 J
0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 N
1 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1 0 0 0 0 0 N
0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 0 0 0 0 0 0 J
0 0 0 0 0 0 0 0 0 0 0 0 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 0 0 0 0 0 N
0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 V
0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 N
0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 J
0 0 0 0 0 0 0 0 0 0 8 0 0 0 0 0 0 0 0 1 0 0 0 0 5 0 0 1 0 1 0 0 0 0 (
0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 V
0 0 0 1 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 0 0 0 0 N
0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 I
0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 V
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 J
0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 N
0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 V
0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 N
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 N
0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 N
0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 N
0 0 0 0 1 0 0 1 0 0 0 1 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 2 0 0 0 0 0 N
0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 J
0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 N
0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 N
0 0 0 0 0 1 0 2 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 3 0 0 0 0 1 0 0 0 0 I
0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 N
0 0 0 2 1 0 0 1 0 0 0 5 0 0 0 0 0 0 0 1 0 0 1 1 0 0 0 1 1 4 0 0 0 0 N
0 0 1 2 0 0 0 0 0 0 0 3 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 7 0 0 0 0 0 0 N
0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 J
0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 N
0 0 1 0 0 0 0 1 0 0 0 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 0 1 1 0 0 0 N
0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 N
0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 N
0 0 0 0 0 0 0 4 0 0 5 0 0 0 0 0 0 0 0 1 0 0 1 0 0 0 0 2 2 2 0 0 1 0 N
0 0 1 0 0 0 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 3 0 0 0 0 0 0 N
0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 N
1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 N
0 0 0 0 0 0 0 0 0 0 5 0 1 0 0 0 0 0 0 0 0 1 1 0 0 0 0 4 0 0 0 0 0 0 I
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 C
0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 N
0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 N
0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 C
0 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 0 0 0 0 0 0 N
1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 N

You can copy and paste this model into a new test file for Timbl.
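Instead of copying and pasting, the vectors can also be written to a file directly. The following is a minimal sketch of that idea, not part of the code above: the helper names `context_vectors` and `write_instances`, the toy input, and the file name `timbl_test.txt` are assumptions made for illustration. The function expects a list of (token, tag) pairs such as posTokensB and a tag list such as tags. Unlike the loop above, it emits a line for every distinct pair, including a pair that only occurs as the very first token.

```python
import os
import tempfile
from collections import Counter, defaultdict


def context_vectors(tagged_tokens, tags):
    """Return one line per distinct (token, tag) pair.

    Each line holds the counts of the tags seen to the left, the counts of
    the tags seen to the right, and the pair's own tag as the class label.
    """
    left = defaultdict(Counter)
    right = defaultdict(Counter)
    for i, item in enumerate(tagged_tokens):
        if i > 0:
            left[item][tagged_tokens[i - 1][1]] += 1
        if i < len(tagged_tokens) - 1:
            right[item][tagged_tokens[i + 1][1]] += 1
    lines = []
    # dict.fromkeys drops duplicates and keeps the order of first occurrence.
    for item in dict.fromkeys(tagged_tokens):
        features = [left[item][t] for t in tags] + [right[item][t] for t in tags]
        lines.append(" ".join(str(x) for x in features) + " " + item[1])
    return lines


def write_instances(lines, path):
    """Write one instance per line to a plain text file."""
    with open(path, "w", encoding="utf-8") as handle:
        handle.write("\n".join(lines) + "\n")


# Toy input standing in for posTokensB and tags (hypothetical values).
toy = [("the", "D"), ("tortoise", "N"), ("sleeps", "V"), (".", ".")]
toy_tags = [".", "D", "N", "V"]

toy_lines = context_vectors(toy, toy_tags)
out_path = os.path.join(tempfile.gettempdir(), "timbl_test.txt")
write_instances(toy_lines, out_path)
print("\n".join(toy_lines))
```

With the notebook's own variables, the call would be `context_vectors(posTokensB, tags)`. Assuming the `timbl` command-line tool is installed and the training vectors were saved the same way, a run along the lines of `timbl -f train.txt -t test.txt` should train on the first file and classify the instances in the second; the two file names here are placeholders.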