Chapter 9: Word play



This notebook is based on "Think Python, 2Ed" by Allen B. Downey
https://greenteapress.com/wp/think-python-2e/


Reading word lists

  • The built-in function open opens a file (specified as the argument) and returns a file object

In [4]:
input_file = open( 'data/short-words.txt' )
print( input_file )


<_io.TextIOWrapper name='data/short-words.txt' mode='r' encoding='UTF-8'>
  • The book says fin is an acceptable name, but I opt for a more descriptive name
  • There are a number of methods for reading and writing files, including:
    • read( size ) Reads up to size characters (the file above is opened in text mode). If size is omitted or negative, the entire file is read and returned. Returns an empty string if the end of the file (EOF) has been reached.
    • readline() Reads a single line from the file
    • write( a_string ) Writes a string to the file
    • close() Closes the file object and frees up any system resources
  • You can also use a for loop to read each line of the file
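  • As a minimal sketch of the methods listed above, the cell below writes a small throwaway file and reads it back. The temporary directory, the sample-words.txt name, and the two words are assumptions made so the example runs anywhere. The with statement is used here as one way to have the file closed automatically.

```python
import os
import tempfile

# Work inside a temporary directory that is deleted when the block ends.
with tempfile.TemporaryDirectory() as temp_dir:
    # Hypothetical file name, chosen only for this sketch.
    path = os.path.join( temp_dir, 'sample-words.txt' )

    # write() does not add a newline, so each one is written explicitly.
    output_file = open( path, 'w' )
    output_file.write( 'abroad\n' )
    output_file.write( 'chapter\n' )
    output_file.close()

    # The with statement closes input_file when its block ends.
    with open( path ) as input_file:
        first_line = input_file.readline()  # one line, newline included
        first_word = first_line.strip()     # surrounding whitespace removed
        rest = input_file.read()            # everything that is left
        at_end = input_file.read()          # empty string: EOF was reached

print( repr( first_line ), repr( first_word ), repr( rest ), repr( at_end ) )
```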

In [6]:
for line in input_file:
    word = line.strip()
    print( word )


abroad
battlefield
chapter
deliver
glockenspiel
institutional
  • The strip method removes whitespace at the beginning and end of a string
  • Most of the exercises in this chapter have something in common
  • They all involve searching a string for specific characters

In [8]:
def has_no_e( word ):
    result = True
    for letter in word:
        if( 'e' == letter ):
            result = False
    return result

input_file = open( 'data/short-words.txt' )
for line in input_file:
    word = line.strip()
    if( has_no_e( word ) ):
        print( 'No `e`: ', word )


No `e`:  abroad
No `e`:  institutional
  • The for loop traverses each letter in the word looking for an e
  • If you look closely at the uses_all and uses_only functions in the book, you will see that they solve the same problem: uses_all( word, required ) is simply uses_only( required, word ) with the arguments swapped
  • In computer science, we frequently encounter problems that are essentially the same as ones we have already solved, but are just worded differently
  • When you find one (called problem recognition), you can apply a previously developed solution
  • How much work it takes to apply depends on how general your solution is
  • This is an essential skill for problem-solving in general and not just programming
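  • A short sketch of that kind of reuse, assuming the parameter names available and required (they are illustrative). Once uses_only exists, uses_all needs no new loop; it only swaps the roles of the two strings.

```python
def uses_only( word, available ):
    """Return True if every letter of word appears in available."""
    for letter in word:
        if( letter not in available ):
            return False
    return True

def uses_all( word, required ):
    """Return True if every letter of required appears in word."""
    # Same search as uses_only, with the two strings trading places.
    return uses_only( required, word )

print( uses_only( 'banana', 'abn' ) )  # every letter of banana is in abn
print( uses_only( 'banana', 'ab' ) )   # n is not in ab
print( uses_all( 'banana', 'abn' ) )   # a, b and n all occur in banana
print( uses_all( 'banana', 'abx' ) )   # x does not occur in banana
```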

Looping with indices

  • The previous code did not need the indices of the characters, so the simple for ... in loop was enough
  • There are a number of ways to traverse a string while maintaining a current index
    1. Use a for loop across the range of the length of the string
    2. Use recursion
    3. Use a while loop and maintain the current index
  • I recommend the first option as it lets the for loop maintain the index
  • Recursion is more complex than necessary for this problem
  • A while loop can be used, but isn't as well suited since we know exactly how many times we need to run through the loop
  • Examples of all three options are below

In [3]:
fruit = 'banana'

# For loop
for i in range( len( fruit ) ):
    print( 'For: [',i,']=[',fruit[i],']' )

# Recursive function
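def recurse_through_string( word, i ):
    # Index the parameter word (not the global fruit) and stop once i
    # runs past the last character, which also handles an empty word
    if( i < len( word ) ):
        print( 'Recursive: [',i,']=[',word[i],']' )
        recurse_through_string( word, i + 1 )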

recurse_through_string( fruit, 0 )
        
# While loop
i = 0
while( i < len( fruit ) ):
    print( 'While: [',i,']=[',fruit[i],']' )
    i = i + 1


For: [ 0 ]=[ b ]
For: [ 1 ]=[ a ]
For: [ 2 ]=[ n ]
For: [ 3 ]=[ a ]
For: [ 4 ]=[ n ]
For: [ 5 ]=[ a ]
Recursive: [ 0 ]=[ b ]
Recursive: [ 1 ]=[ a ]
Recursive: [ 2 ]=[ n ]
Recursive: [ 3 ]=[ a ]
Recursive: [ 4 ]=[ n ]
Recursive: [ 5 ]=[ a ]
While: [ 0 ]=[ b ]
While: [ 1 ]=[ a ]
While: [ 2 ]=[ n ]
While: [ 3 ]=[ a ]
While: [ 4 ]=[ n ]
While: [ 5 ]=[ a ]
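
  • A further option, not among the three listed above, is the built-in enumerate function, which hands back the index and the character together. This is only a sketch of the same traversal. The for loop still maintains the index, so it fits the recommendation above.

```python
fruit = 'banana'

# enumerate yields (index, character) pairs, starting at index 0
for i, letter in enumerate( fruit ):
    print( 'Enumerate: [',i,']=[',letter,']' )
```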

Debugging

  • Testing is hard
  • The programs discussed in this chapter are relatively easy to test since you can check the results by hand
  • There are ways to make testing easier and more effective
  • One is to ensure you have different variations of a test
  • For example, for the has_no_e function, test words that have an e at the beginning, in the middle, and at the end. Test long and short words (including the empty string).
  • Often you will come across special cases (like the empty string) that can throw your program off if you don't have a robust solution
  • Another option is finding large sets of data (like the words list file) against which you can test your program
  • However, if your program requires you to manually inspect the tests for correctness, you are always at risk of missing something
  • The best option is automated testing
  • For example, wrapping your tests in conditionals that only print out if the test fails is a good start
  • In later courses, I will discuss libraries that make automated testing easier
  • Remember that although it feels like more work to write tests, it saves quite a bit of time in the long run
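  • A minimal sketch of tests wrapped in conditionals that print only on failure, assuming a helper named check (hypothetical, not from the book) and a has_no_e equivalent to the one defined earlier. The test words are chosen to put the e at the beginning, in the middle, and at the end, and to include the empty string.

```python
def has_no_e( word ):
    """Return True if word contains no lowercase e."""
    for letter in word:
        if( 'e' == letter ):
            return False
    return True

def check( word, expected ):
    """Hypothetical helper: return 0 on a pass; print and return 1 on a failure."""
    actual = has_no_e( word )
    if( actual != expected ):
        print( 'FAILED:', repr( word ), 'gave', actual, 'but expected', expected )
        return 1
    return 0

failures = 0
failures = failures + check( 'echo', False )          # e at the beginning
failures = failures + check( 'bread', False )         # e in the middle
failures = failures + check( 'time', False )          # e at the end
failures = failures + check( 'abroad', True )         # short word, no e
failures = failures + check( 'institutional', True )  # long word, no e
failures = failures + check( '', True )               # empty string

print( failures, 'test(s) failed' )
```

  • Nothing is printed for a passing test, so any output other than the final count points straight at a problem.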

Exercises

  • Write a program that reads words.txt and prints only the words with more than 20 characters (not counting whitespace). (Ex. 9.1 on pg. 84)
  • Generalize the has_no_e function to a function called avoids that takes a word and a string of forbidden letters. It should return True if the word does not contain any of the forbidden letters and False if it does. (Ex. 9.3 on pg. 84)
  • Write a function called uses_only that takes a word and a string of letters, and returns True if the word contains only letters from that string. (Ex. 9.4 on pg. 84)