Python Crash Course

Master in Data Science - Sapienza University

Homework 2: Python Challenges

A.A. 2017/18

Tutor: Francesco Fabbri

Instructions

So guys, here we are! You're finally facing your first REAL homework. Are you ready to fight? We're going to apply all the Pythonic stuff we have seen so far AND EVEN MORE...

Simple rules:

Don't touch the instructions; you just have to fill in the blank rows.

This is meant to be an exercise for improving your Pythonic skills in a spirit of collaboration, so of course you can help your classmates and, obviously, get a lot of help from all the others as well (as the proverb says: "I get help from you and then you help me", right?!...)

RULE OF THUMB for you during the homework:
- 1st Step: try to solve the problem alone
- 2nd Step: randomly google the answer
- 3rd Step: ask your colleagues
- 4th Step: scream and complain about life
- 5th Step: ask the Tutors

And the Prize? The Beer? The glory?!

Guys, life is hard... and in this Master it's even worse... Soooo, since you seem so smart, I want to test you before all the courses start.

But not now.

You have to come prepared for the challenge, so for now solve these first 6 exercises; then it will be time for FIGHTING and (for one of you) DRINKING.

Warm-up...

1. 12! is equal to...



In [19]:

    
# multiply together the integers 1..12
fact = 1
for x in range(1, 13):
    fact *= x
fact









    Out[19]:





479001600
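As a quick cross-check of the loop above, the same value can be obtained without writing the loop by hand. This is only a sketch: the names `loop_free` and `library` are made up for illustration, while `math.factorial` and `functools.reduce` are part of the standard library.

```python
import math
from functools import reduce
from operator import mul

# Fold multiplication over 1..12, starting from the neutral element 1.
loop_free = reduce(mul, range(1, 13), 1)

# The standard library already ships a factorial function.
library = math.factorial(12)

print(loop_free == library)
```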

2. More math...

Write a program that finds all the numbers between 0 and 1000 (both included) that are divisible by 7 but are not multiples of 5. Print the numbers in a comma-separated sequence on a single line. (range and CFS)



In [25]:

    
','.join([str(x) for x in range(1001) if (x%7 == 0) and (x%5 != 0)])









    Out[25]:





'7,14,21,28,42,49,56,63,77,84,91,98,112,119,126,133,147,154,161,168,182,189,196,203,217,224,231,238,252,259,266,273,287,294,301,308,322,329,336,343,357,364,371,378,392,399,406,413,427,434,441,448,462,469,476,483,497,504,511,518,532,539,546,553,567,574,581,588,602,609,616,623,637,644,651,658,672,679,686,693,707,714,721,728,742,749,756,763,777,784,791,798,812,819,826,833,847,854,861,868,882,889,896,903,917,924,931,938,952,959,966,973,987,994'
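One possible variation, sketched here under the assumption that skipping 0 is fine (0 is a multiple of 5, so it is excluded anyway): let `range` step through the multiples of 7 directly and filter only on the multiples of 5. The names `numbers` and `line` are illustrative.

```python
# Visit only the multiples of 7 in [7, 1000], then drop the multiples of 5.
numbers = [x for x in range(7, 1001, 7) if x % 5 != 0]

# Join the numbers into one comma-separated line.
line = ','.join(map(str, numbers))
print(line)
```

This checks about one seventh of the candidates compared with scanning every integer up to 1000.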

2. Count capital letters

In this exercise you're going to deal with YOUR DATA. The list below stores your favorite TV series but, as you can see, there is something weird: there are too many CaPITal LeTTErs. Your task is to count the capital letters in each string and then print the total number of capital letters in the whole list.



In [28]:

    
tv_series = ['Game of THRroneS',
             'big bang tHeOrY',
             'MR robot',
             'WesTWorlD',
             'fIRefLy',
             "i haven't",
             'HOW I MET your mothER',
             'friENds',
             'bRon broen',
             'gossip girl',
             'prISon break',
             'breaking BAD']



In [39]:

    
# write here your code
import re
pattern = re.compile("[A-Z]")       # matches one ASCII capital letter
capitals = 0
# glue all the titles together and test one character at a time
for c in ''.join(tv_series):
    if pattern.match(c):
        capitals+=1
capitals









    Out[39]:





34
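The same count can also be sketched without regular expressions, using `str.isupper`. Note the assumption: `isupper` also accepts non-ASCII capitals, which `[A-Z]` does not, so the two approaches agree only on plain ASCII titles like the ones here. The helper name `count_capitals` is hypothetical.

```python
def count_capitals(strings):
    """Count the uppercase characters across all the given strings."""
    return sum(1 for text in strings for char in text if char.isupper())

# Two titles taken from the list above: 5 capitals + 2 capitals.
print(count_capitals(['Game of THRroneS', 'MR robot']))
```

Calling `count_capitals(tv_series)` on the full list should give the same total as the cell above.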

3. A remark

Using the list above, create a dictionary whose keys are unique IDs and whose values are the TV series. Keep these 2 constraints in mind:

The IDs must follow the alphabetical order of the titles, i.e. 0: first_title_in_alphabetical_order and so on...

Clean up the mess of capital letters: we want them only at the start of each word ("prISon break" should become "Prison Break")



In [100]:

    
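# write here your code
# str.title() capitalizes the first letter of every word and lowercases the rest;
# it also treats the apostrophe as a word boundary, hence "I Haven'T"
titled = sorted(t.title() for t in tv_series)
series_dict = dict(enumerate(titled))   # IDs follow the alphabetical order
series_dict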









    Out[100]:





{0: 'Big Bang Theory',
 1: 'Breaking Bad',
 2: 'Bron Broen',
 3: 'Firefly',
 4: 'Friends',
 5: 'Game Of Thrrones',
 6: 'Gossip Girl',
 7: 'How I Met Your Mother',
 8: "I Haven'T",
 9: 'Mr Robot',
 10: 'Prison Break',
 11: 'Westworld'}
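The "I Haven'T" entry comes from `str.title()`, which starts a new word after the apostrophe. A possible way around it, sketched here as an alternative and not as the expected solution, is `string.capwords`, which splits on whitespace only. The helper name `titles_by_id` is hypothetical.

```python
import string

def titles_by_id(titles):
    """Map 0, 1, 2, ... to the cleaned titles in alphabetical order."""
    # capwords capitalizes each whitespace-separated word and lowercases the rest.
    cleaned = sorted(string.capwords(title) for title in titles)
    return dict(enumerate(cleaned))

print(titles_by_id(["i haven't", 'prISon break', 'MR robot']))
```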

4. Dictionary to its maximum

Swap the keys and the values in the dictionary built before.



In [60]:

    
# write here your code
series_dict = dict(zip(series_dict.values(),series_dict.keys()))
series_dict









    Out[60]:





{'Big Bang Theory': 5,
 'Breaking Bad': 6,
 'Bron Broen': 4,
 'Firefly': 7,
 'Friends': 8,
 'Game Of Thrrones': 0,
 'Gossip Girl': 9,
 'How I Met Your Mother': 1,
 "I Haven'T": 10,
 'Mr Robot': 2,
 'Prison Break': 11,
 'Westworld': 3}

Did you do it in one line of code? If not, try now!



In [61]:

    
# write here your code
print(dict(zip(series_dict.values(),series_dict.keys())))









    



{0: 'Game Of Thrrones', 1: 'How I Met Your Mother', 2: 'Mr Robot', 3: 'Westworld', 4: 'Bron Broen', 5: 'Big Bang Theory', 6: 'Breaking Bad', 7: 'Firefly', 8: 'Friends', 9: 'Gossip Girl', 10: "I Haven'T", 11: 'Prison Break'}
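Another common one-liner for the swap is a dictionary comprehension over `items()`. The sketch below uses a small made-up dictionary and a hypothetical helper name `invert`; it assumes the values are hashable and unique, otherwise later keys silently overwrite earlier ones.

```python
def invert(mapping):
    """Return a new dict with keys and values swapped."""
    return {value: key for key, value in mapping.items()}

small = {0: 'Firefly', 1: 'Friends'}
print(invert(small))

# Inverting twice gives back the original mapping when the values are unique.
print(invert(invert(small)) == small)
```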

4. Other boring math

Let's talk about our beloved exams. Given the exams and CFU below, can you compute the weighted mean of the grades? Do it and print the result.

Description of the data:

exams[1] = $(title_1, grade_1)$

cfu[1] = $CFU_1$
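
For reference, with the notation above and writing $n$ for the number of exams (a symbol introduced here only for illustration), the weighted mean can be written as:

$$\text{mean} = \frac{\sum_{i=1}^{n} CFU_i \cdot grade_i}{\sum_{i=1}^{n} CFU_i}$$

Each grade counts in proportion to the CFU of its exam, so a 9-CFU exam weighs one and a half times as much as a 6-CFU one.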



In [4]:

    
exams = [('BIOINFORMATICS', 29),
         ('DATA MANAGEMENT FOR DATA SCIENCE', 30),
         ('DIGITAL EPIDEMIOLOGY', 26),
         ('NETWORKING FOR BIG DATA AND LABORATORY',28),
         ('QUANTITATIVE MODELS FOR ECONOMIC ANALYSIS AND MANAGEMENT','30 e lode'),
         ('DATA MINING TECHNOLOGY FOR BUSINESS AND SOCIETY', 30),
         ('STATISTICAL LEARNING',30),
         ('ALGORITHMIC METHODS OF DATA MINING AND LABORATORY',30),
         ('FUNDAMENTALS OF DATA SCIENCE AND LABORATORY', 29)]

cfu = sum([6,6,6,9,6,6,6,9,9])



In [5]:

    
cfu_ex = [6,6,6,9,6,6,6,9,9]



In [8]:

    
# write here your code
grades_sum = 0
mean = 0
for (title, grade), w in zip(exams, cfu_ex):
    try:
        grade = int(grade)
    except ValueError:
        if grade != "30 e lode":
            continue          # skip grades we cannot interpret
        grade = 30            # "30 e lode" (cum laude) counts as 30
    mean += grade/cfu*w       # each exam weighs its CFU over the total CFU
    grades_sum += grade

print(mean)
print(grades_sum)









    



29.095238095238095
262

5. Palindromic numbers

Write a script that finds all the palindromic numbers in the range [0,N] (bounds included). Print the numbers in a comma-separated sequence on a single line.

What is N? From the previous exercise: N = (Total number of CFU) x (Sum of all the grades)

(details: https://en.wikipedia.org/wiki/Palindromic_number)



In [27]:

    
N = cfu * grades_sum
# a number is palindromic when its decimal string reads the same reversed
palindromic = ','.join(str(x) for x in range(N+1) if str(x) == str(x)[::-1])
print(palindromic)









    



0,1,2,3,4,5,6,7,8,9,11,22,33,44,55,66,77,88,99,101,111,121,131,141,151,161,171,181,191,202,212,222,232,242,252,262,272,282,292,303,313,323,333,343,353,363,373,383,393,404,414,424,434,444,454,464,474,484,494,505,515,525,535,545,555,565,575,585,595,606,616,626,636,646,656,666,676,686,696,707,717,727,737,747,757,767,777,787,797,808,818,828,838,848,858,868,878,888,898,909,919,929,939,949,959,969,979,989,999,1001,1111,1221,1331,1441,1551,1661,1771,1881,1991,2002,2112,2222,2332,2442,2552,2662,2772,2882,2992,3003,3113,3223,3333,3443,3553,3663,3773,3883,3993,4004,4114,4224,4334,4444,4554,4664,4774,4884,4994,5005,5115,5225,5335,5445,5555,5665,5775,5885,5995,6006,6116,6226,6336,6446,6556,6666,6776,6886,6996,7007,7117,7227,7337,7447,7557,7667,7777,7887,7997,8008,8118,8228,8338,8448,8558,8668,8778,8888,8998,9009,9119,9229,9339,9449,9559,9669,9779,9889,9999,10001,10101,10201,10301,10401,10501,10601,10701,10801,10901,11011,11111,11211,11311,11411,11511,11611,11711,11811,11911,12021,12121,12221,12321,12421,12521,12621,12721,12821,12921,13031,13131,13231,13331,13431,13531,13631,13731,13831,13931,14041,14141,14241,14341,14441,14541,14641,14741,14841,14941,15051,15151,15251,15351,15451,15551,15651,15751,15851,15951,16061,16161,16261,16361,16461

6. StackOverflow

Let's start using your new best friend. Now I'm going to give you some other tasks, slightly more difficult, BUT this time, just by googling, you will easily find the answer on www.stackoverflow.com. You can use the code you find there to solve the exercise, BUT you have to understand the solution and COMMENT the code, showing me that you understood the thinking process behind it.

6. A

Show me an example of how to PROPERLY use the try/except statement



In [34]:

    
# write here your code
from random import randint
while True:
    try:
        randint(0, 9) / randint(0, 9) # execute some random division
    except ZeroDivisionError:         # until a division by zero occurs
        break
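
A second sketch of the same idea, this time with an `else` branch: only the call that can fail sits inside `try`, the `except` clause names the exceptions it expects, and `else` runs only when nothing was raised. The helper `parse_grade` is hypothetical and not part of the homework.

```python
def parse_grade(raw):
    """Return raw as an int, or None when it is not a plain integer."""
    try:
        value = int(raw)               # the only line that can fail
    except (TypeError, ValueError):    # catch only the errors we expect
        return None
    else:
        return value                   # reached only if int() succeeded

print(parse_grade('29'))
print(parse_grade('30 e lode'))
```

Catching specific exceptions instead of using a bare `except:` keeps unrelated bugs visible.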

6. B

Given the list of words below, copy it into a variable, then explain and provide code for obtaining a Bag of Words from it. (Hint: use dictionaries and loops)

['theory', 'of', 'bron', 'firefly', 'thrones', 'break', 'bad', 'mother', 'firefly', "haven't", 'prison', 'big', 'friends', 'girl', 'westworld', 'bad', "haven't", 'gossip', 'thrones', 'your', 'big', 'how', 'friends', 'theory', 'your', 'bron', 'bad', 'bad', 'breaking', 'met', 'breaking', 'breaking', 'game', 'bron', 'your', 'breaking', 'met', 'bang', 'how', 'mother', 'bad', 'theory', 'how', 'i', 'friends', "haven't", 'of', 'of', 'gossip', 'i', 'robot', 'of', 'prison', 'bad', 'friends', 'friends', 'i', 'robot', 'bang', 'mother', 'bang', 'i', 'of', 'bad', 'friends', 'theory', 'i', 'friends', 'thrones', 'prison', 'theory', 'theory', 'big', 'of', 'bang', 'how', 'thrones', 'bang', 'theory', 'friends', 'game', 'bang', 'mother', 'broen', 'bad', 'game', 'break', 'break', 'bang', 'big', 'gossip', 'robot', 'met', 'i', 'game', 'your', 'met', 'bad', 'firefly', 'your']



In [44]:

    
# write here your code
corpus = ['theory', 'of', 'bron', 'firefly', 'thrones', 'break', 'bad', 'mother', 'firefly', "haven't", 'prison', 'big', 'friends', 'girl', 'westworld', 'bad', "haven't", 'gossip', 'thrones', 'your', 'big', 'how', 'friends', 'theory', 'your', 'bron', 'bad', 'bad', 'breaking', 'met', 'breaking', 'breaking', 'game', 'bron', 'your', 'breaking', 'met', 'bang', 'how', 'mother', 'bad', 'theory', 'how', 'i', 'friends', "haven't", 'of', 'of', 'gossip', 'i', 'robot', 'of', 'prison', 'bad', 'friends', 'friends', 'i', 'robot', 'bang', 'mother', 'bang', 'i', 'of', 'bad', 'friends', 'theory', 'i', 'friends', 'thrones', 'prison', 'theory', 'theory', 'big', 'of', 'bang', 'how', 'thrones', 'bang', 'theory', 'friends', 'game', 'bang', 'mother', 'broen', 'bad', 'game', 'break', 'break', 'bang', 'big', 'gossip', 'robot', 'met', 'i', 'game', 'your', 'met', 'bad', 'firefly', 'your']
BoW = {}
for word in corpus:
    # a word seen for the first time starts from 0; every occurrence adds 1
    BoW[word] = BoW.get(word, 0) + 1

BoW









    Out[44]:





{'bad': 9,
 'bang': 7,
 'big': 4,
 'break': 3,
 'breaking': 4,
 'broen': 1,
 'bron': 3,
 'firefly': 3,
 'friends': 8,
 'game': 4,
 'girl': 1,
 'gossip': 3,
 "haven't": 3,
 'how': 4,
 'i': 6,
 'met': 4,
 'mother': 4,
 'of': 6,
 'prison': 3,
 'robot': 3,
 'theory': 7,
 'thrones': 4,
 'westworld': 1,
 'your': 5}
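
Outside the "dictionaries and loops" constraint of the hint, the standard library offers `collections.Counter`, which builds the same word-to-count mapping in one call. The sketch below uses a short made-up `sample` list instead of the full corpus.

```python
from collections import Counter

sample = ['bad', 'big', 'bad', 'bang', 'big', 'bad']

# Counter is a dict subclass mapping each word to its number of occurrences.
bow = Counter(sample)

print(bow['bad'])           # count of a word that is in the sample
print(bow['westworld'])     # missing words count as 0 instead of raising
print(bow.most_common(1))   # the most frequent word with its count
```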

6. C

And now, write some code that computes the first 10 Fibonacci numbers

(details: https://en.wikipedia.org/wiki/Fibonacci_number)



In [57]:

    
# write here your code
fibonacci = []
for x in range(10):
    if x<2:
        fibonacci.append(1)
    else:
        fibonacci.append(fibonacci[x-1] + fibonacci[x-2])
    
fibonacci









    Out[57]:





[1, 1, 2, 3, 5, 8, 13, 21, 34, 55]
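
The same sequence can also be sketched with a generator, which keeps only the last two values in memory instead of indexing into a list. Like the cell above, it assumes the convention that the sequence starts 1, 1 (some definitions start from 0); the names `fib` and `first_ten` are illustrative.

```python
from itertools import islice

def fib():
    """Yield the Fibonacci numbers 1, 1, 2, 3, 5, ... without end."""
    a, b = 1, 1
    while True:
        yield a
        a, b = b, a + b   # slide the two-value window forward

# Take just the first 10 values from the infinite generator.
first_ten = list(islice(fib(), 10))
print(first_ten)
```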



In [ ]: