1. Make a list of all files in the current directory worksheets, but remove the file extensions and the leading numbers from the filenames.
In [1]:
import os
import re
import glob
pat = re.compile(r'^\d+_')  # leading digits followed by an underscore
file_list = []
for f in glob.glob('../worksheet/*'):
    path, filename = os.path.split(f)
    name, ext = os.path.splitext(filename)
    short_name = pat.sub('', name)
    file_list.append(short_name)
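The same name-cleaning steps can be tried without touching the file system. The following is a minimal sketch; the paths in `sample_paths` and the helper name `short_name` are made up for illustration and are not part of the worksheet.

```python
import os
import re

# Hypothetical paths, used only for illustration.
sample_paths = [
    '../worksheet/01_intro.ipynb',
    '../worksheet/12_regex_basics.md',
    '../worksheet/README.txt',
]

leading_number = re.compile(r'^\d+_')


def short_name(path):
    """Return the filename without directory, extension, or leading 'NN_' prefix."""
    filename = os.path.basename(path)
    name, _ext = os.path.splitext(filename)
    return leading_number.sub('', name)


short_names = [short_name(p) for p in sample_paths]
print(short_names)  # ['intro', 'regex_basics', 'README']
```

Names without a leading number, such as `README.txt`, pass through unchanged apart from losing their extension.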
2. Count how often the word markdown occurs in all files in the current directory worksheets.
In [2]:
count = 0
for f in glob.glob('../worksheet/*'):
    with open(f) as fin:
        text = fin.read()
    count += text.count('markdown')
count
Out[2]:
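Note that `str.count` counts substrings and is case-sensitive, so it also counts `markdown` inside longer tokens and misses `Markdown`. Whether that is the intended meaning of "the word markdown" is a judgment call; the sketch below, on a made-up sample string, shows how the three readings differ.

```python
import re

# Made-up sample text, used only for illustration.
sample = 'Markdown cells use markdown syntax; see markdown2 and the markdown docs.'

# Substring count, case-sensitive (what str.count does).
substring_hits = sample.count('markdown')

# Whole-word count, case-sensitive: \b requires a word boundary on both sides.
whole_word_hits = len(re.findall(r'\bmarkdown\b', sample))

# Whole-word count, ignoring case.
any_case_hits = len(re.findall(r'\bmarkdown\b', sample, flags=re.IGNORECASE))

print(substring_hits, whole_word_hits, any_case_hits)  # 3 2 3
```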
3. Write the following string to a file called jabberwocky.txt.
Twas brillig, and the slithy toves
Did gyre and gimble in the wabe:
All mimsy were the borogoves,
And the mome raths outgrabe.
Now read in the file line by line and save only the lines that contain two or more words starting with the same letter into a new file called truncated_jabberwocky.txt.
In [3]:
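s = '''Twas brillig, and the slithy toves
Did gyre and gimble in the wabe:
All mimsy were the borogoves,
And the mome raths outgrabe.'''
# Write the string to disk so the later cells can read it back.
with open('jabberwocky.txt', 'w') as fout:
    fout.write(s)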
In [4]:
pat = re.compile(r'.*')
In [5]:
with open('jabberwocky.txt') as fin:
    with open('truncated_jabberwocky.txt', 'w') as fout:
        for line in fin:
            # First letter of every word, lower-cased so 'Twas' and 'the' agree.
            start_letters = [word[0] for word in line.lower().strip().split()]
            counts = [start_letters.count(char) >= 2 for char in set(start_letters)]
            if any(counts):
                fout.write(line)
In [6]:
cat truncated_jabberwocky.txt
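The line filter can also be checked in memory before writing any files. This is a sketch of the same test using `collections.Counter`; the helper name `has_repeated_initial` is an assumption introduced here for illustration.

```python
from collections import Counter


def has_repeated_initial(line):
    """True if at least two words in the line start with the same letter."""
    initials = [word[0] for word in line.lower().split()]
    return any(n >= 2 for n in Counter(initials).values())


poem = [
    'Twas brillig, and the slithy toves',
    'Did gyre and gimble in the wabe:',
    'All mimsy were the borogoves,',
    'And the mome raths outgrabe.',
]

kept = [line for line in poem if has_repeated_initial(line)]
for line in kept:
    print(line)
```

With the four lines of the verse, only the first two pass the filter: the first has three words starting with t, the second has two starting with g.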
4. Read the file jabberwocky.txt again. Using a regular expression, replace all words containing two or more vowels with the reversed word - e.g. brillig would become gillirb.
In [7]:
two_vowel_word = re.compile(r'\b(\S*[a|e|i|o|u]\S*[a|e|i|o|u]\S*)\b')
In [8]:
two_vowel_word = re.compile(r'\b(\S*[aeiou]\S*[aeiou]\S*)\b')
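The two patterns above differ only in the `|` characters inside the brackets. Inside a character class, `|` is not alternation; it is a literal pipe, so `[a|e|i|o|u]` matches the five vowels and also `|`. A small sketch on a made-up test string:

```python
import re

with_pipes = re.compile(r'[a|e|i|o|u]')
plain = re.compile(r'[aeiou]')

# Made-up test string containing a literal pipe.
test = 'a|b'

print(with_pipes.findall(test))  # ['a', '|']
print(plain.findall(test))       # ['a']
```

On the Jabberwocky text both patterns give the same matches because the text contains no pipes, but the second form states the intent correctly.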
In [9]:
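with open('jabberwocky.txt') as fin:
    text = fin.read()
# Reverse each match in place. sub() rewrites only the matched span, whereas
# str.replace() would also rewrite the same letters inside other words.
text = two_vowel_word.sub(lambda m: m.group(1)[::-1], text)
print(text)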
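One thing to watch when reversing matches: calling `str.replace` once per word found by `findall` rewrites every occurrence of those letters, including occurrences inside longer words. Passing a callback to `sub` reverses exactly the span that matched. The sketch below compares the two on a made-up phrase; it is an illustration under that assumption, and the verse itself happens not to trigger the problem.

```python
import re

two_vowel_word = re.compile(r'\b(\S*[aeiou]\S*[aeiou]\S*)\b')

# Made-up phrase in which one matching word is contained in another.
phrase = 'toves and stoves'

# Approach 1: find the words, then replace each one everywhere in the text.
looped = phrase
for word in two_vowel_word.findall(phrase):
    looped = looped.replace(word, word[::-1])

# Approach 2: let the regex engine substitute each match where it occurs.
substituted = two_vowel_word.sub(lambda m: m.group(1)[::-1], phrase)

print(looped)       # sevot and ssevot
print(substituted)  # sevot and sevots
```

In the first approach, replacing `toves` also alters the inside of `stoves`, so the later replacement for `stoves` no longer finds anything to reverse.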