5. Squared Wheel and Other Reinventions

We often hear discussions about writing Python programs in the "most Pythonic way." Sometimes it's more of a philosophy; quite often, however, there is a more concrete message: "Do not reinvent the wheel." Indeed, Python ships with a large standard library, so if you need a common piece of functionality, it is likely already implemented. Many common problems can be solved in no more than a dozen lines of code. This is the result of Python's "batteries included" approach to the design of the language and its libraries.

For more information, see the Python tutorial's brief tour of the standard library: https://docs.python.org/3/tutorial/stdlib.html
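As a small illustration of "batteries included," here is a minimal sketch that counts word frequencies with collections.Counter from the standard library. The sample sentence and the variable names are assumptions chosen only for illustration.

```python
from collections import Counter

# Hypothetical sample text, chosen only for illustration.
text = 'the quick brown fox jumped over the lazy dog'

# Counter tallies how often each word occurs, so no manual
# dictionary bookkeeping is needed.
word_counts = Counter(text.split())

# Show the single most frequent word together with its count.
print(word_counts.most_common(1))
```

Writing the same tally by hand would take a loop and a dictionary with explicit handling of missing keys.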

To illustrate some of these ideas, let's consider a couple of different computational tasks and look at different ways to write a correct program for each.

Example: String concatenation. Let's try to create a string from the words in a list. A common style in other programming languages, like C/C++ or Java, is the "scalar loop form."


In [ ]:
# Task: Concatenate a list of strings into a single string
# delimited by spaces.

list_of_words = ['the', 'quick', 'brown', 'fox', 'jumped', 'over', 'the', 'lazy', 'dog']

i = 0 # A counter to maintain the current position in the list
new_string = '' # String to hold the output
while i < len(list_of_words): # Iterate over words
    new_string += list_of_words[i]
    i += 1
    # Add a space to join the words together if it's not the last word
    if i < len(list_of_words):
        new_string += ' '

print("The resulting string is '" + new_string + "'.")

This "pattern" of code is sometimes referred to as a classic "procedural" approach.

Now let's consider a more Pythonic approach.


In [ ]:
list_of_words = ['the', 'quick', 'brown', 'fox', 'jumped', 'over', 'the', 'lazy', 'dog']

i = 0 # Current position in the list
new_string = '' # String to hold the output

for word in list_of_words: # Iterate over words
    new_string += word
    i += 1
    # Add a space to join the words together if it's not the last word
    if i < len(list_of_words):
        new_string += ' '

print("The resulting string is '" + new_string + "'.")

A little cleaner, but not too much. The main difference is to (largely) replace the counter-based while-loop with a more idiomatic for-loop, using a syntax for iterating over collections that mimics mathematical notation. (That is, "for each word in list_of_words ...") However, we still need to maintain a counter to omit the last space. But we can do better!


In [ ]:
list_of_words = ['the', 'quick', 'brown', 'fox', 'jumped', 'over', 'the', 'lazy', 'dog']

# Create an empty string
new_string = ''
# Iterate through all words and enumerate them
for i, word in enumerate(list_of_words):
    new_string += word
    # Add a space to join the words together if it's not the last word
    if i < len(list_of_words)-1:
        new_string += ' '

print("The resulting string is '" + new_string + "'.")

The counter always increases "in sync" with iteration over the list of words. The enumerate() function captures this pattern succinctly.
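As a side note, enumerate() accepts an optional start argument. The sketch below assumes a shortened word list and uses start=1, so the counter matches the "number of words seen so far" and the last-word test compares directly against len().

```python
# Shortened list, used here only for illustration.
list_of_words = ['the', 'quick', 'brown', 'fox']

new_string = ''
# With start=1, i counts 1, 2, 3, ..., so the last word has
# i == len(list_of_words).
for i, word in enumerate(list_of_words, start=1):
    new_string += word
    # Add a space unless this is the last word
    if i < len(list_of_words):
        new_string += ' '

print("The resulting string is '" + new_string + "'.")
```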

Pretty good, but we can do even better.


In [ ]:
list_of_words = ['the', 'quick', 'brown', 'fox', 'jumped', 'over', 'the', 'lazy', 'dog']
new_string = ' '.join(list_of_words)
print("The resulting string is '" + new_string + "'.")

A single line of code just solved the whole problem! That's the power of the language and its full-featured library.
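Two details of join() are worth a quick sketch: the separator can be any string, and every element must already be a string. The lists below are assumptions made up for illustration.

```python
# Shortened list, used here only for illustration.
list_of_words = ['the', 'quick', 'brown', 'fox']

# Any string can act as the separator.
comma_separated = ', '.join(list_of_words)

# join() only accepts strings, so convert other types first.
numbers = [1, 2, 3]
number_string = ' '.join(str(n) for n in numbers)

print(comma_separated)
print(number_string)
```

Passing the list of integers to join() directly would raise a TypeError, which is why the conversion with str() is needed.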

Example: Computing the mean. Now let's look at another problem where we compute the mean of all the elements in an array. Then we will calculate the distance from the mean for every element.


In [ ]:
array = [1, 2, 3, 4, 5, 6]

mean = 0
for i in range(len(array)):
    mean += array[i]
mean /= len(array)

dist = []
for i in range(len(array)):
    dist += [array[i] - mean]

print("The mean of the array", array, "is", mean, "and the distances are", dist)

We can at least compute the mean with less code by using the built-in function sum().


In [ ]:
array = [1, 2, 3, 4, 5, 6]

mean = sum(array) / len(array)

dist = []
for i in range(len(array)):
    dist += [array[i] - mean]

print("The mean of the array", array, "is", mean, "and the distances are", dist)

Now we can try to compute distances in a more Pythonic way, again using the more idiomatic for-loop syntax for iterating over elements of a collection:


In [ ]:
array = [1, 2, 3, 4, 5, 6]

mean = sum(array) / len(array)

dist = []
for element in array:
    dist += [element - mean]

print("The mean of the array", array, "is", mean, "and the distances are", dist)

Finally, we can make it even more compact with list comprehensions, which are designed for "tiny for loops," that is, for loops whose iterations are independent and whose bodies are simple or small functions.


In [ ]:
array = [1, 2, 3, 4, 5, 6]

mean = sum(array) / len(array)
dist = [element - mean for element in array]

print("The mean of the array", array, "is", mean, "and the distances are", dist)
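In the same "do not reinvent the wheel" spirit, the standard library also provides a ready-made mean in the statistics module. The following is a minimal sketch of that alternative, not a replacement for the version above; the name array_mean is an assumption introduced to avoid shadowing the imported function.

```python
from statistics import mean

array = [1, 2, 3, 4, 5, 6]

# statistics.mean computes the arithmetic mean of the data.
array_mean = mean(array)
dist = [element - array_mean for element in array]

print("The mean of the array", array, "is", array_mean, "and the distances are", dist)
```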

Example: Lists to dictionaries. Now let's try to create a new dictionary from two lists.

Suppose we have two lists of first and last names. In this case, the lists are aligned: there is a one-to-one correspondence between elements of one list and the other. Further suppose our task is to create a new dictionary that would allow us to quickly look up the first name, given the last name.


In [ ]:
first_names = ['Leonard', 'Sheldon', 'Howard', 'Rajesh']
last_names = ['Hofstadter', 'Cooper', 'Wolowitz', 'Koothrappali']

name_dict = {}
for name_ind in range(len(last_names)):
    name_dict[last_names[name_ind]] = first_names[name_ind]
print("Name dictionary is", name_dict)

Here is the same task done in a more Pythonic way. Simultaneously iterating over two collections that have a one-to-one correspondence is a pattern sometimes called "zipper iteration," which Python handles nicely via its zip() function.


In [ ]:
first_names = ['Leonard', 'Sheldon', 'Howard', 'Rajesh']
last_names = ['Hofstadter', 'Cooper', 'Wolowitz', 'Koothrappali']

name_dict = dict(zip(last_names, first_names))
print("Name dictionary is", name_dict)
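One caveat about zip(): it stops as soon as the shortest input is exhausted. The sketch below deliberately makes last_names one entry short (an assumption for illustration) and builds the dictionary with a dictionary comprehension, which is equivalent to dict(zip(...)) here.

```python
first_names = ['Leonard', 'Sheldon', 'Howard', 'Rajesh']
# Deliberately one entry short, for illustration only.
last_names = ['Hofstadter', 'Cooper', 'Wolowitz']

# zip() stops at the shorter input, so 'Rajesh' is silently dropped.
name_dict = {last: first for last, first in zip(last_names, first_names)}
print("Name dictionary is", name_dict)
```

If the lists are supposed to be aligned, comparing their lengths before zipping is a cheap way to catch such a mismatch.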

Exercise. Now, enlightened by all this knowledge, let's try to write a function that takes a string, keeps only the words that contain the letter 'o', and returns a new string made of these words. In Python, it's truly a single-line function :)


In [ ]:
def pick_o(s):
    pass

In [ ]:
s = 'the quick brown fox jumped over the lazy dog'
true_string = 'brown fox over dog'
new_string = pick_o(s)
print("pick_o('{}') -> '{}' [True: '{}']".format(s, new_string, true_string))
assert new_string == true_string