Write a function that extracts all possible word combinations of length $2$ (bigrams) from the same file, shore_leave.txt. Note that the last word of one sentence and the first word of the next one do not form a valid combination.
Splitting the lines into sentences can be easier with the .split() method: its argument is the separator that you intend to use.
Name the function bigramize(filename); it should take the name of the file as input and return the list of bigrams.
In [ ]:
def bigramize(filename):
    pass
Test your function:
In [ ]:
bigrams = bigramize("shore_leave.txt")
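One possible way to approach the sentence-boundary rule is sketched below. It is not a reference solution: the helper name bigrams_from_text is hypothetical, it works on a string rather than a file so that it can be tried without shore_leave.txt, and it assumes that a period is the only sentence separator and that words should be lowercased with surrounding punctuation trimmed.

```python
import string


def bigrams_from_text(text):
    """Return (word, next_word) pairs that never cross a sentence boundary."""
    bigrams = []
    # Assumption: a period is the only sentence separator.
    for sentence in text.split("."):
        # Split on whitespace, trim surrounding punctuation, lowercase.
        words = [w.strip(string.punctuation).lower() for w in sentence.split()]
        # Drop tokens that were pure punctuation and are now empty.
        words = [w for w in words if w]
        # Pair each word with its successor inside the same sentence only.
        bigrams.extend(zip(words, words[1:]))
    return bigrams


sample = "The ship landed. The crew rested."
print(bigrams_from_text(sample))
```

In this sample the pair ("landed", "the") is absent because the two words belong to different sentences. A file-based bigramize(filename) could read the whole file into a string and pass it to a helper like this one.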
In [ ]:
def ngramize(filename, n):
    pass
Test your function.
In [ ]:
ngrams = ngramize("shore_leave.txt", 3)
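Judging by its signature, ngramize(filename, n) presumably generalizes the bigram task to combinations of length n; that reading is an assumption. Under it, the core step for a single sentence could look like the sketch below, where ngrams_from_words is a hypothetical helper that takes an already tokenized sentence.

```python
def ngrams_from_words(words, n):
    """Return all runs of n consecutive words as tuples."""
    # A sentence shorter than n yields an empty range, hence no n-grams.
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]


print(ngrams_from_words(["the", "crew", "rested", "well"], 3))
```

Applying such a helper sentence by sentence keeps n-grams from spanning two sentences, in the same way as for bigrams.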
Write a function that generates text based on the list of bigrams.
Its signature is generate(bigrams, word=None, maxlen=20), where bigrams is a list of bigrams, word is the first word in the generated sentence, and maxlen is the maximum length of the resulting sentence.
The sentence should reach length maxlen when possible. If some word has no continuations, just return the current sequence.
In [ ]:
def generate(bigrams, word=None, maxlen=20):
    pass
Test your function.
In [ ]:
print(generate(bigrams))
print(generate(bigrams, "flowers")) # shouldn't rewrite the word
print(generate(bigrams, "sequential", 10)) # should rewrite the word
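The generation step can be sketched roughly as follows. This is one possible reading of the task, not a reference solution: generate_sketch is a hypothetical name, the result is assumed to be a space-joined string, maxlen is assumed to count words, and a start word that is missing or never begins a bigram is assumed to be replaced by a random known word, which is how the comments in the test cell above seem to intend it.

```python
import random


def generate_sketch(bigrams, word=None, maxlen=20):
    """Build a sentence by repeatedly picking a random follower of the last word."""
    # Map each word to the list of words observed right after it.
    followers = {}
    for first, second in bigrams:
        followers.setdefault(first, []).append(second)
    if not followers:
        return ""
    # Assumption: an absent or unknown start word is replaced by a random one.
    if word not in followers:
        word = random.choice(list(followers))
    sentence = [word]
    # Stop at maxlen words or when the last word has no continuation.
    while len(sentence) < maxlen and sentence[-1] in followers:
        sentence.append(random.choice(followers[sentence[-1]]))
    return " ".join(sentence)


toy_bigrams = [("the", "crew"), ("crew", "rested"), ("rested", "well")]
print(generate_sketch(toy_bigrams, "the"))
print(generate_sketch(toy_bigrams, "sequential", 10))
```

Keeping duplicate followers in each list means that frequent continuations are picked more often, which is a simple way to respect the bigram counts.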
In [ ]: