In [ ]:

    
#---------------------------------------------------#
# PYTHON EXERCISES
# September 12, 2016 #
# EECS 445: Machine Learning #
# Author: Valliappa Chockalingam (valli@umich.edu) 
# --------------------------------------------------#

Question 1: Numbers and Data Structures

Example: If we list all the natural numbers below 10 that are multiples of 3 or 5, we get 3, 5, 6 and 9. The sum of these multiples is 23.
Question: Find the sum of all the multiples of 3 or 5 below 1000 in 3 ways.
Hint 1: Loop through all the possible numbers, i.e., $x \in \mathbb{N} \wedge x < 1000$ and simply add up the multiples as you go through them. Use two loops first and then try making your code more concise.
Bonus: Use the inclusion-exclusion principle. Write a function that calculates the sum of the first $n$ numbers in an arithmetic series. Hint: $S_n = \frac{n}{2}(u_1 + u_n) = \frac{n}{2}(2u_1 + (n - 1) \cdot d)$



In [415]:

    
# Open Ended, No Skeleton Code for this question.

Question 2: Strings and Data Structures

Note: You are free to use any functions and libraries that come with Python. The aim is not to necessarily implement code from scratch, but to get familiar with Python (specifically the syntax and data structures) and, to some extent, write concise readable working code.

(a): Given a non-empty string like "Code" return a string like "CCoCodCode".

(b): Given strings a, b and c, write a function that will replace all occurences of a in b with c.

(c): Given a string s that has been Caeser enciphered with a numeric shift n. Return the deciphered string. For example, s = "vjku ku c eqfg" and n = -2, returns "this is a code" Note: You can assume the string is all in lowercase without any special characters.

(d): Given a string s, return whether s is a palindrome, i.e, it is spelt the same when read from either direction. Note: Ignore case, special characters and spacing.

(e): Using the higher order function filter(), define a function filter_long_words() that takes a list of words and an integer n and returns the list of words that are longer than n (in the same order).

(f): Given a string s and a number n, return a list of size n consisting of tuples with two elements, the first being the correct string in the look-and-say-sequence and the second being a dictionary of character: count pairs that is sorted (Hint: Use Ordered Dictionaries and sorted()). For example, if s = "1" and n = 5, the look-and-say-sequence is as follows:
```
  1 ("1", {"1" : 1})
  11 ("11", {"1" : 2})
  21 ("21", {"1" : 1, "2" : 1})
  1211 ("1211", {"1" : 3, "2" : 1})
  111221 ("111221", {"1" : 4, "2" : 2})
  correct output: [("1", {"1" : 1}), ("11", {"1" : 2}), ("21", {"1" : 1, "2" : 1}), 
  ("1211", {"1" : 3, "2" : 1}), ("111221", {"1" : 4, "2" : 2})]
```



In [64]:

    
# Implement part (a) below
def string_repeater(s):
    pass

# Implement part (b) below
def string_occurence_remover(a, b, c):
    pass

# Implement part (c) below
def caeser_decipher(s, n):
    pass

# Implement part (d) below
def is_palindrome(s):
    pass

# Implement part (e) below
def filter_long_words(l, n):
    pass

# Implement part (f) below
def look_and_say(s, n):
    pass



In [67]:

    
# Simple Tests for part (a)
assert(string_repeater("Code") == "CCoCodCode")
assert(string_repeater("EECS445") == "EEEEECEECSEECS4EECS44EECS445")

# Simple Tests for part (b)
assert(string_occurence_remover("Boring", "PythonIsBoring", "Fun") == "PythonIsFun")
assert(string_occurence_remover("12", "Today is September 12 and 12 is my favorite number.", "9") == \
       "Today is September 9 and 9 is my favorite number.")

# Simple Tests for part (c)
assert(caeser_decipher("vjku ku eqfg", -2) == "this is code")
assert(caeser_decipher("h khjd bnlotsdqr", 1) == "i like computers")

# Simple Tests for part (d)
assert(is_palindrome("Rats live on no evil star."))
assert(is_palindrome("On a clover, if alive, erupts a vast pure evil; a fire volcano"))
assert(not is_palindrome("Hello, this is Jupyter Notebook speaking."))
assert(not is_palindrome("I am currently in a hands-on lecture."))

# Simple Tests for part (e)
assert(filter_long_words(['a', '', '0', 'a0', 'a0b02030', 'ee', 'cs', 'eecs', 'eeccss'], 2) == ['a0b02030', 'eecs', 'eeccss'])
assert(filter_long_words(['1', '2'], 0) == ['1', '2'])

# Simple Test for part (f)
from collections import OrderedDict
assert(look_and_say("aabbcc", 5) == [('aabbcc', OrderedDict([('a', 2), ('b', 2), ('c', 2)])),
 ('2a2b2c', OrderedDict([('2', 3), ('a', 1), ('b', 1), ('c', 1)])),
 ('121a121b121c',
  OrderedDict([('1', 6), ('2', 3), ('a', 1), ('b', 1), ('c', 1)])),
 ('1112111a1112111b1112111c',
  OrderedDict([('1', 18), ('2', 3), ('a', 1), ('b', 1), ('c', 1)])),
 ('3112311a3112311b3112311c',
  OrderedDict([('1', 12), ('2', 3), ('3', 6), ('a', 1), ('b', 1), ('c', 1)]))])

Question 3: Classes and Interactive I/O

(a) Define a class which has at least two methods, getString: to get a string from console input and printString: to print the string in upper case. Also write a simple test to check the functionality class methods.



In [416]:

    
# Open Ended. No Skeleton Code here.

(b) Write a program able to play the "Guess the number"-game, where the number to be guessed is randomly chosen between 1 and 20.

(Source: http://inventwithpython.com) This is how it should work when run in a terminal:

Hello! What is your name?
Valli
Well, Valli, I am thinking of a number between 1 and 20.
Take a guess.
10
Your guess is too low.
Take a guess.
15
Your guess is too low.
Take a guess.
18
Good job, Valli! You guessed my number in 3 guesses!



In [102]:

    
import random
class GuessTheNumber(object):
    # Create a constructor here (__init__ function) that takes two numbers, a minimum and a maximum for 
    # the range that guesses can take. Save these in variable min_guess and max_guess. Create and 
    # initialize a Boolean called incorrect to be true. (Note: Python booleans use capitalization, >T<rue or >F<alse)
    
    def play(self):
        # Change the following line of code to get input from the user and save it into a string variable name.
        print("Hello! What is your name? ")
        print("Well, " + name + ", I am thinking of a number between " + 
              str(self.min_guess) + " and " + str(self.max_guess) + ".\n")
        answer = random.randrange(self.min_guess, self.max_guess + 1)
        # Write the main loop to collect guesses and check whether they are correct. Also remember to save the count!
        # Additionally, if the guess is out of range or input is unexpected (like type mismatch or non-numeric input), 
        # simply print an error message and break from the loop.



In [ ]:

    
# Test out your game!
g = GuessTheNumber(1, 20)
g.play()

Question 4: Pandas and Data Exploration



In [6]:

    
# Generate some Data for analysis
from sklearn.datasets import make_classification
X, y = make_classification(1000, n_features=5, n_informative=2, 
                           n_redundant=2, n_classes=2, random_state=0)



In [417]:

    
# (a) Get a glimpse of the data by making a Pandas DataFrame from the data and then printing the first few and last
# few rows.



In [418]:

    
# (b) Plot a boxplot of each column to visualize the distribution of the data column values.



In [419]:

    
# (c) Try using the describe() function of the DataFrame.



In [420]:

    
# (d) Install Seaborn if it is not already installed and import it below. Then, perform a pairwise plot using the data.



In [421]:

    
# (e) Now try Seaborn's correlation plot (Heatmap).



In [422]:

    
# (Optional) Using the first 70% of the data as a training set and the last 30% as a test set, construct a classifier
# and see how well it performs. You will be certainly able to do this at the end of the course!

Question 5: Numpy Exercises



In [423]:

    
# (a) Write a function that takes in a tuple and a string that can either 
# be 'zero', 'one' or 'gaussian' and correspondingly return a NumPy array that contains those elements. For 'gaussian', 
# assume sampling with mean = 0, std = 1.



In [424]:

    
# (b) Write a function that returns a n x n identity matrix with n as a parameter.



In [425]:

    
# (c) Write a function that normalizes a matrix to [0, 1] and returns the normalized matrix.



In [426]:

    
# (d) Write code that creates a NumPy array and makes it immutable.



In [427]:

    
# (e) Write a function that finds the closest value to a given scalar s.



In [428]:

    
# (f) Write a function that subtracts the mean of each row from a matrix and returns it.



In [429]:

    
# (g) Write a function that sorts an array by the nth column and returns the sorted array.

Question 6: Numpy + First ML algorithm!



In [3]:

    
# (a) Write a function that implements Ordinary Least Squares given an input matrix X and a vector of targets y. 
# We will go over the method in the forthcoming lecture, but the equation is given in 
# https://en.wikipedia.org/wiki/Linear_regression#Estimation_methods 
# Note: Use NumPy here, but do NOT make use of library functions that do this for you.



In [ ]:



In [370]:

    
from sklearn.datasets import make_regression
X, y = make_regression(1000, n_features=1, noise=20, random_state=0)



In [371]:

    
# (b) Run your function on the above data and plot the data as well as the decision boundary (trendline) 
# generated by your classifer using matplotlib.



In [ ]: