These two files are very similar. One file has four extra words in it. Identify those four words.
Inspection revealed that those files have one word per line.
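The one-word-per-line observation can also be checked programmatically rather than by eye. The following is a minimal sketch, not part of the original notebook: the helper name `has_one_word_per_line` and the demo data are hypothetical, and the check assumes a "word" is any whitespace-free token.

```python
import os
import tempfile


def has_one_word_per_line(filename):
    """Return True if every non-blank line contains exactly one token."""
    with open(filename) as f:
        return all(len(line.split()) == 1 for line in f if line.strip())


# Demo on a throwaway file (hypothetical data, not the puzzle files).
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("apple\nbanana\ncherry\n")
    demo_path = tmp.name

print(has_one_word_per_line(demo_path))  # True
os.remove(demo_path)
```

If the check holds for both files, loading each one into a set of stripped lines loses nothing except duplicates and ordering.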
In [1]:
def get_words_set(filename):
    # Read one word per line, stripping the newline and surrounding whitespace.
    with open(filename) as f:
        return {line.strip() for line in f}
In [2]:
a = get_words_set('113809of.fic')
b = get_words_set('113809of.rev.2.fic')
In [3]:
len(a), len(b)
Out[3]:
In [4]:
a - b
Out[4]:
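Note that `a - b` only yields words that are in `a` but missing from `b`; if the extra words are in the other file, the result is an empty set and `b - a` is needed instead. A minimal sketch that checks both directions follows. The helper name `extra_words` and the demo sets are hypothetical stand-ins, not the contents of the puzzle files.

```python
def extra_words(words_a, words_b):
    """Return (words only in a, words only in b) as two sets."""
    set_a, set_b = set(words_a), set(words_b)
    return set_a - set_b, set_b - set_a


# Hypothetical stand-in data; the real sets come from get_words_set().
a_demo = {"alpha", "beta", "gamma"}
b_demo = {"alpha", "beta", "gamma", "delta", "epsilon"}

only_a, only_b = extra_words(a_demo, b_demo)
print(sorted(only_a))  # []
print(sorted(only_b))  # ['delta', 'epsilon']
```

The symmetric difference `a ^ b` gives the same words in a single set when it does not matter which file they came from.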