Learning Python, Numpy and Deep Learning

Alex Brie, 29/06/2017

Chapter 1: my first Jupyter Notebook

First, baby steps:

For the first self-taught lesson, I just wanted to understand how to use the Jupyter Notebook software, how to open a dataset (CSV), and how to apply basic operations to it (filtering, basic NumPy methods).

For the dataset, I didn't want just any dataset but a government one. So I'm using a CSV from data.gov.ro that contains the number of vaccines administered to children in Romanian cities in the first trimester of 2017.

Conclusion (later edit)

The good part: I'm able to use Jupyter with autocomplete (after installing readline), look at documentation in a separate terminal using pydoc, create new cells, run them, open a CSV, convert it into a NumPy array, and then do basic operations on it, such as filtering.

The bad part: I didn't do anything that you can't do in 10 seconds by simply opening the aforementioned csv in Excel. Plus, my filtering probably sucks. But it's a start.



In [39]:

    
import pandas as pd
import numpy as np



In [40]:

    
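# Short column names: J = county, L = town, V = vaccine, N = quantity, A = year
# (they match judete, orase, vaccinuri, cantitati, ani used further down)
datas = pd.read_csv('copii.csv', sep=';', names=["J", "L", "V", "N", "A"])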



In [41]:

    
print(datas[0:10])
# print(datas.columns)



      J        L                   V   N     A
0  Alba    Abrud  BCG ( Alt produs )  10  2017
1  Alba    Abrud            Diftavax  23  2017
2  Alba    Abrud           Engerix B   5  2017
3  Alba    Abrud               Euvax   4  2017
4  Alba    Abrud            Hexacima  25  2017
5  Alba    Abrud       Infanrix hexa   3  2017
6  Alba    Abrud         M-M-RVAXPRO  26  2017
7  Alba    Abrud            Tetraxim  16  2017
8  Alba  Acmariu            Diftavax   2  2017
9  Alba  Acmariu         M-M-RVAXPRO   1  2017



In [42]:

    
np_datas = np.array(datas)



In [43]:

    
judete = np_datas[:, 0]     # counties
orase = np_datas[:, 1]      # towns
vaccinuri = np_datas[:, 2]  # vaccine names
cantitati = np_datas[:, 3]  # quantities
ani = np_datas[:, 4]        # years

Test a filter by printing its result



In [44]:

    
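# Towns in Prahova county (one entry per row, so names repeat)
print(orase[judete == "Prahova"])
# Counties that have a row for a town named Busteni
print(judete[orase == "Busteni"])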



['Adunati' 'Adunati' 'Adunati' ..., 'Zamfira' 'Zamfira' 'Zanoaga']
['Dolj' 'Dolj' 'Prahova' 'Prahova' 'Prahova' 'Prahova' 'Prahova' 'Prahova'
 'Prahova' 'Prahova']
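The first filter returns each town once per vaccine row, which is why the names repeat. A minimal sketch of collapsing such a result to distinct values with `np.unique`, assuming a tiny hand-made sample (the arrays below are hypothetical, not the real copii.csv data):

```python
import numpy as np

# Hypothetical sample mimicking the county and town columns (not the real data)
judete = np.array(["Prahova", "Prahova", "Prahova", "Dolj"], dtype=object)
orase = np.array(["Busteni", "Busteni", "Adunati", "Busteni"], dtype=object)

# The boolean mask keeps one entry per matching row, so towns repeat
towns_in_prahova = orase[judete == "Prahova"]

# np.unique returns the sorted distinct values
unique_towns = np.unique(towns_in_prahova)
print(unique_towns)
```
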

Build composite key columns that let us select any county/town combination, or any county/town/vaccine combination



In [45]:

    
jud_or = judete + "_" + orase
jud_or_vac = judete + "_" + orase + "_" + vaccinuri



In [46]:

    
print(np.sum(cantitati[jud_or=="Prahova_Busteni"]))
print(np.sum(cantitati[jud_or_vac=="Bucuresti_Sector 2_Hexacima"]))
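The same selection can probably be done without building concatenated string keys, by combining two boolean masks with `&`. A minimal sketch, assuming a small hypothetical sample in place of the real columns:

```python
import numpy as np

# Hypothetical sample rows (not the real data)
judete = np.array(["Prahova", "Prahova", "Dolj"], dtype=object)
orase = np.array(["Busteni", "Busteni", "Busteni"], dtype=object)
cantitati = np.array([18, 2, 7], dtype=object)

# Combine two boolean masks with & instead of building a "county_town" key.
# The parentheses matter: & binds tighter than ==.
mask = (judete == "Prahova") & (orase == "Busteni")
total = np.sum(cantitati[mask])
print(total)
```

This avoids accidental collisions if a county or town name ever contains the "_" separator.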

Demo: extract the vaccine names and quantities for a given county_town combination



In [47]:

    
# Compute the mask once, then stack names and quantities as two columns
mask = jud_or == "Prahova_Busteni"
na = np.array([vaccinuri[mask], cantitati[mask]]).T



In [48]:

    
print(na)



[['BCG ( Alt produs )' 18]
 ['Diftavax' 2]
 ['Euvax' 10]
 ['Hexacima' 33]
 ['M-M-RVAXPRO' 39]
 ['Priorix' 1]
 ['Tetraxim' 21]
 ['VVR' 1]]



In [49]:

    
print(np.sum(na[:, -1]))
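For comparison, here is a rough sketch of the same per-town breakdown done directly in pandas, without converting to a NumPy array. The DataFrame below is a hypothetical sample that reuses the J/L/V/N column names; it is not the real file.

```python
import pandas as pd

# Hypothetical sample with the same short column names (not the real data)
datas = pd.DataFrame({
    "J": ["Prahova", "Prahova", "Prahova", "Dolj"],
    "L": ["Busteni", "Busteni", "Adunati", "Busteni"],
    "V": ["Hexacima", "Tetraxim", "Hexacima", "Hexacima"],
    "N": [33, 21, 5, 7],
})

# Keep only the rows for one county/town pair
subset = datas[(datas["J"] == "Prahova") & (datas["L"] == "Busteni")]

# Quantity per vaccine, and the overall total for that town
per_vaccine = subset.groupby("V")["N"].sum()
total = int(subset["N"].sum())
print(per_vaccine)
print(total)
```
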

Identify the most common vaccine (the one with the largest quantity) in Prahova_Busteni



In [50]:

    
max_vac = np.max(na[:,-1])
vaccine_name = na[na[:,-1]==max_vac][0][0]
print(vaccine_name)



M-M-RVAXPRO

Now for the least common vaccine. Priorix and VVR tie at 1 each; the filter keeps the first match, so it prints Priorix.



In [51]:

    
min_vac = np.min(na[:,-1])
vaccine_name = na[na[:,-1]==min_vac][0][0]
print(vaccine_name)



Priorix
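An alternative to filtering on the max/min value is to ask NumPy for the position of the extreme directly with `np.argmax` and `np.argmin`. A minimal sketch, using the names and quantities copied from the Prahova_Busteni output above (the variable names are mine):

```python
import numpy as np

# Values copied from the Prahova_Busteni output above
names = np.array(["BCG ( Alt produs )", "Diftavax", "Euvax", "Hexacima",
                  "M-M-RVAXPRO", "Priorix", "Tetraxim", "VVR"])
counts = np.array([18, 2, 10, 33, 39, 1, 21, 1])

# argmax/argmin return the index of the first extreme value
most_common = names[np.argmax(counts)]
least_common = names[np.argmin(counts)]

# To see every vaccine tied for the minimum, filter on the minimum value
all_least = names[counts == counts.min()]

print(most_common)
print(least_common)
print(all_least)
```

Keeping names and counts in separate arrays also keeps the counts numeric, instead of mixing strings and integers in one object array.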


