In [30]:

    
from IPython.display import Image
Image(url='http://www.pythonbootcamp.info/_/rsrc/1280771545584/home/Screen%20shot%202010-08-02%20at%2010.51.14%20AM.png', embed=True)









    Out[30]:





<IPython.core.display.Image object>

Introduction

This course is a very gentle introduction to the Python scientific ecosystem. We will try to cover as much as possible within two hours. It is essentially a warm-up before the upcoming machine learning bootcamps, for those who would like to adopt Python as their daily data hacking toolbox.

What is in this course

an introduction to Python syntax,
an overview of the scientific libraries,
using the IPython Notebook,
plotting within the notebook,
and a basic introduction to the machine learning library scikit-learn.

What this course is not about

an introduction to object-oriented programming
detailed expertise in scientific Python libraries
a machine learning practical course (come to Thursday's bootcamp for that)

References

An excellent python tutorial

IPython Notebook

Python syntax

A comment is any text preceded by a #, up to the end of the line.



In [31]:

    
# Here is a comment

Numbers

Variables are declared without mentioning their type




In [32]:

    
i = 5  # i is an integer
x = 2.0  # x is a float

Caveats



In [33]:

    
5. / 2









    Out[33]:





2.5



In [34]:

    
5 / 2









    Out[34]:





2



In [35]:

    
5. // 2  # floor division operator









    Out[35]:





2.0
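The caveat in the three cells above is that `/` between two integers truncates on the Python 2 interpreter this notebook appears to run on, while it returns 2.5 on Python 3. The sketch below shows operations that behave the same on both; the variable names are only illustrative. On Python 2, putting `from __future__ import division` at the top of a module is another way to get true division everywhere.

```python
numerator, denominator = 5, 2

# Converting one operand to float gives true division on Python 2 and 3.
true_quotient = float(numerator) / denominator

# Floor division discards the fractional part and keeps the operand type.
int_floor = numerator // denominator
float_floor = float(numerator) // denominator

# divmod returns the floor quotient and the remainder in one call.
quotient, remainder = divmod(numerator, denominator)

print(true_quotient, int_floor, float_floor, quotient, remainder)
```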

Strings

Strings can be declared in two different ways (actually four...)



In [36]:

    
s1 = 'This is a string.'

s2 = '''This is a multi-line string.
        Awesome if I want to replace Microsoft Word by Python...
        or simply to document my code directly within ;)
     '''

s3 = "I'm using double quotes to delimit the string, otherwise, I'd have to \"escape\" it with a \\ ."



In [37]:

    
print(s1)
print(s2)
print(s3)

This is a string.
This is a multi-line string.
        Awesome if I want to replace Microsoft Word by Python...
        or simply to document my code directly within ;)
     
I'm using double quotes to delimit the string, otherwise, I'd have to "escape" it with a \ .
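The "four ways" presumably refer to single quotes, double quotes, and the triple-quoted form of each; that reading is an assumption, since the notebook does not spell it out. A small sketch showing that all four literals produce the same kind of value:

```python
single = 'a string'
double = "a string"
triple_single = '''a string'''
triple_double = """a string"""

# All four literals build the same value.
all_equal = single == double == triple_single == triple_double

# Mixing quote styles avoids backslash escapes.
apostrophe = "I'm fine"
quotation = 'She said "hi"'

print(all_equal, apostrophe, quotation)
```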

You can find out the type of a variable by calling the type() function, one of the functions that come out of the box with your Python interpreter. Many others are provided, as we will see. We call them the built-in functions.



In [38]:

    
print(type(i))
print(type(x))
print(type(s1))









    



<type 'int'>
<type 'float'>
<type 'str'>

Tip

Several variables can be assigned at once, which makes swapping pretty elegant.



In [39]:

    
a, b = 1, 's'
print(a, b)
a, b = b, a
print(a, b)









    



(1, 's')
('s', 1)



In [40]:

    
c = d = 'multiple assignment'



In [41]:

    
b += 10  # no b++ nor ++b in python

Indentation

Where C-like languages use curly brackets to delimit blocks of code (conditional statements, loops, etc.), Python uses indentation.



In [42]:

    
i = 0
while i < 5:
    if i == 2:
        print('Two!')
    else:
        print(i)
    i += 1
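The same loop can also be written with `for` and `range`, which removes the manual counter. This is only a sketch; the `printed` list is a hypothetical helper added so the result can be inspected afterwards.

```python
printed = []

for i in range(5):            # i takes the values 0, 1, 2, 3, 4
    if i == 2:
        message = 'Two!'
    else:
        message = str(i)
    print(message)
    printed.append(message)   # kept only so the output can be checked later
```

Every line that belongs to the `for` block shares the same indentation level, and the `if`/`else` bodies are indented one level further.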

Functions



In [43]:

    
def my_func(argument):
    """ This comment is called a docstring.
    Placed right after the function signature,
    it serves as a documentation for the function."""

    # the indentation is crucial
    print(argument)



In [44]:

    
my_func('hey you!')









    



hey you!



In [45]:

    
my_func(argument='out there in the cold')









    



out there in the cold



In [46]:

    
def my_func_2(argument='default argument'):
    print(argument)



In [47]:

    
my_func_2()









    



default argument



In [48]:

    
def my_func_3(a, b=10, c=20):
    print(sum((a, b, c)))



In [49]:

    
my_func_3(1)       # 1 + 10 + 20
my_func_3(1, 2)    # 1 + 2 + 20
my_func_3(1, 2, 3) # 1 + 2 + 3

Data structures

lists
dictionaries



In [50]:

    
my_list = [1, 2, 3]



In [51]:

    
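def my_func_4(*args):
    # args is a tuple holding every positional argument
    print(sum(args))



In [52]:

    
my_func_4(*my_list)  # unpack the list into three positional arguments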



In [53]:

    
my_func_3(*my_list)



In [54]:

    
my_dict = {'a': 1, 'b': 2, 'c': 3}



In [55]:

    
my_func_3(**my_dict)



In [56]:

    
def my_func_5(**kwargs):
    values = kwargs.values()
    print(sum(values))



In [57]:

    
my_func_5(**my_dict)

A function signature commonly follows this general pattern:



In [58]:

    
def func(pos_arg_1, pos_arg_2, named_arg_1='default arg 1', named_arg_2='default arg 2', *args, **kwargs):
    pass
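To see how a call is distributed over such a signature, here is a minimal sketch. The function `show_binding` is hypothetical; it reuses the parameter names of the cell above and simply returns what each group of parameters received.

```python
def show_binding(pos_arg_1, pos_arg_2, named_arg_1='default arg 1',
                 named_arg_2='default arg 2', *args, **kwargs):
    """Return how the arguments of a call were distributed (illustration only)."""
    return {
        'positional': (pos_arg_1, pos_arg_2),
        'named': (named_arg_1, named_arg_2),
        'args': args,
        'kwargs': kwargs,
    }

minimal = show_binding(1, 2)
full = show_binding(1, 2, 3, 4, 5, 6, flag=True)

print(minimal)
print(full)
```

Note that extra positional values fill `named_arg_1` and `named_arg_2` first; only what is left over lands in `args`, and unknown keywords land in `kwargs`.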

Data structures: most commonly used operations



In [59]:

    
my_list.append('four')
my_list  # note that the list is a heterogeneous container









    Out[59]:





[1, 2, 3, 'four']



In [60]:

    
my_list[0]









    Out[60]:





1



In [69]:

    
my_list[:2]









    Out[69]:





[1, 2]



In [70]:

    
my_list[1:3]









    Out[70]:





[2, 3]



In [79]:

    
my_list[-1]  # 7 because the extend() cell shown below (In [61]) was executed before this one









    Out[79]:





7

Ex.

Display the last two elements of the list without explicitly mentioning the total length of the list.



In [61]:

    
my_list.extend([5, 6, 7])
my_list









    Out[61]:





[1, 2, 3, 'four', 5, 6, 7]



In [71]:

    
my_list[1:5:2]









    Out[71]:





[2, 'four']
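The slice notation used in the cells above is `start:stop:step`, where `stop` is exclusive and any part may be omitted. A short sketch on a separate, hypothetical list `letters`:

```python
letters = ['a', 'b', 'c', 'd', 'e', 'f', 'g']

first_three = letters[:3]       # stop is exclusive: indices 0, 1, 2
from_third = letters[2:]        # from index 2 to the end
every_other = letters[::2]      # step of 2 over the whole list
last_three = letters[-3:]       # negative indices count from the end
reversed_copy = letters[::-1]   # a negative step walks backwards

print(first_three, from_third, every_other, last_three, reversed_copy)
```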

One very handy and expressive way to create lists is to use list comprehensions



In [81]:

    
my_list_comprehension = [i**2 for i in range(10)]
my_list_comprehension









    Out[81]:





[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
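A comprehension is shorthand for a loop that appends to a list, and it can also carry a filter. The sketch below builds the same list both ways; the variable names are illustrative.

```python
# Explicit loop: squares of the even numbers below 10.
even_squares_loop = []
for i in range(10):
    if i % 2 == 0:
        even_squares_loop.append(i**2)

# The same result as a list comprehension with a filter clause.
even_squares = [i**2 for i in range(10) if i % 2 == 0]

print(even_squares)
```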



In [62]:

    
my_dict.keys()









    Out[62]:





['a', 'c', 'b']



In [63]:

    
my_dict.values()









    Out[63]:





[1, 3, 2]



In [64]:

    
my_dict.items()









    Out[64]:





[('a', 1), ('c', 3), ('b', 2)]



In [65]:

    
my_dict.update({'c': 3})
my_dict









    Out[65]:





{'a': 1, 'b': 2, 'c': 3}



In [77]:

    
my_dict['c']









    Out[77]:





3



In [78]:

    
my_dict['f']  # indexing with a missing key raises a KeyError

KeyError: 'f'



In [78]:

    
my_dict.get('f')  # wrap it in a print to see the return value









    



None

The get() method accepts a second argument. Check it out by pressing shift + tab when the cursor is over it. What you see is actually the docstring of the method (hence the importance of writing these little pieces of documentation).
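That second argument is the value returned when the key is missing. A small sketch, redefining `my_dict` so the snippet runs on its own:

```python
my_dict = {'a': 1, 'b': 2, 'c': 3}

present = my_dict.get('a', 0)        # key exists: its value is returned
missing = my_dict.get('f')           # key absent, no default given: None
with_default = my_dict.get('f', 0)   # key absent: the default is returned

# get() never inserts the key into the dictionary.
still_absent = 'f' not in my_dict

print(present, missing, with_default, still_absent)
```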



In [82]:

    
'f' in my_dict









    Out[82]:





False

Not covered here

Dict comprehension
Sets
Tuples

Classes and objects



In [66]:

    
class MyClass:
    """ Class docstring """
    my_class_attribute = "I'm shared among all the instances."
    
    def __init__(self):
        """ Constructor """    
        self.my_instance_attribute = None



In [67]:

    
my_object = MyClass()  # constructor with no arguments
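The difference between a class attribute and an instance attribute is easiest to see by changing each one. The `Counter` class below is a hypothetical example, not part of the course material.

```python
class Counter:
    """Hypothetical example class used only for illustration."""
    step = 1                      # class attribute, shared by all instances

    def __init__(self, start=0):
        self.value = start        # instance attribute, one per object

    def increment(self):
        self.value += self.step   # looks up step on the class
        return self.value

first = Counter()
second = Counter(start=10)
first.increment()                 # first.value goes from 0 to 1

Counter.step = 5                  # changing the class attribute affects both
second.increment()                # second.value goes from 10 to 15
first.increment()                 # first.value goes from 1 to 6

print(first.value, second.value)
```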

The Python scientific environment

Numpy



In [94]:

    
import numpy as np

Meet the ndarray class



In [186]:

    
a = np.arange(20)
a, type(a)









    Out[186]:





(array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16,
        17, 18, 19]), numpy.ndarray)



In [187]:

    
a.reshape(5, 4)









    Out[187]:





array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11],
       [12, 13, 14, 15],
       [16, 17, 18, 19]])



In [188]:

    
a = a.reshape(5, 2, 2)
a









    Out[188]:





array([[[ 0,  1],
        [ 2,  3]],

       [[ 4,  5],
        [ 6,  7]],

       [[ 8,  9],
        [10, 11]],

       [[12, 13],
        [14, 15]],

       [[16, 17],
        [18, 19]]])



In [189]:

    
a.dtype









    Out[189]:





dtype('int64')



In [190]:

    
b = np.array([0.1, 0.2, 0.3])



In [191]:

    
b.dtype









    Out[191]:





dtype('float64')



In [192]:

    
b.astype(complex)  # np.complex was an alias of the built-in complex and has been removed from recent NumPy releases









    Out[192]:





array([ 0.1+0.j,  0.2+0.j,  0.3+0.j])



In [193]:

    
a.shape, b.shape









    Out[193]:





((5, 2, 2), (3,))



In [194]:

    
print(a.ndim)
print(b.ndim)

3
1

Array creation



In [195]:

    
np.zeros( (4, 6) )









    Out[195]:





array([[ 0.,  0.,  0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.,  0.,  0.]])



In [196]:

    
np.zeros_like(a)









    Out[196]:





array([[[0, 0],
        [0, 0]],

       [[0, 0],
        [0, 0]],

       [[0, 0],
        [0, 0]],

       [[0, 0],
        [0, 0]],

       [[0, 0],
        [0, 0]]])



In [197]:

    
np.ones(10)









    Out[197]:





array([ 1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.])

Basic operations



In [198]:

    
c = np.arange(10).reshape(2, -1)
c









    Out[198]:





array([[0, 1, 2, 3, 4],
       [5, 6, 7, 8, 9]])



In [199]:

    
c + 10









    Out[199]:





array([[10, 11, 12, 13, 14],
       [15, 16, 17, 18, 19]])



In [200]:

    
c**2









    Out[200]:





array([[ 0,  1,  4,  9, 16],
       [25, 36, 49, 64, 81]])



In [201]:

    
np.sin(c)









    Out[201]:





array([[ 0.        ,  0.84147098,  0.90929743,  0.14112001, -0.7568025 ],
       [-0.95892427, -0.2794155 ,  0.6569866 ,  0.98935825,  0.41211849]])



In [202]:

    
d = np.arange(10, 15)
d









    Out[202]:





array([10, 11, 12, 13, 14])



In [203]:

    
print('{} + {} = \n\n{}'.format(c, d, c+d))









    



[[0 1 2 3 4]
 [5 6 7 8 9]] + [10 11 12 13 14] = 

[[10 12 14 16 18]
 [15 17 19 21 23]]
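Adding an array of shape (5,) to one of shape (2, 5) works because NumPy broadcasts the smaller array across the rows of the larger one. The sketch below makes that explicit by comparing with a physically repeated copy; `column` is an extra, hypothetical array that shows broadcasting along the other axis.

```python
import numpy as np

c = np.arange(10).reshape(2, -1)   # shape (2, 5)
d = np.arange(10, 15)              # shape (5,)

# d is treated as if it were repeated on each of the 2 rows of c.
broadcast_sum = c + d
explicit_sum = c + np.tile(d, (2, 1))   # same result with d physically repeated

# A column of shape (2, 1) is stretched along the other axis instead.
column = np.array([[100], [200]])
column_sum = c + column

print(broadcast_sum)
print(column_sum)
```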



In [204]:

    
c > 6









    Out[204]:





array([[False, False, False, False, False],
       [False, False,  True,  True,  True]], dtype=bool)



In [205]:

    
c[c > 6]









    Out[205]:





array([7, 8, 9])
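A boolean mask can be used for more than selection: it can be counted and assigned through. A minimal sketch, working on a copy so that `c` itself is left unchanged:

```python
import numpy as np

c = np.arange(10).reshape(2, -1)

mask = c > 6              # boolean array with the same shape as c
count = int(mask.sum())   # True counts as 1, so this counts the matches

clipped = c.copy()        # work on a copy so c stays untouched
clipped[mask] = 6         # assign to every selected element at once

print(count)
print(clipped)
```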



In [211]:

    
c = c.T
c









    Out[211]:





array([[0, 5],
       [1, 6],
       [2, 7],
       [3, 8],
       [4, 9]])



In [213]:

    
c[ [1, 3, 4] ]









    Out[213]:





array([[1, 6],
       [3, 8],
       [4, 9]])



In [207]:

    
A = np.array( [[10  , 20],
               [30  , 40]] )
B = np.array( [[1   , 0.5],
               [1./3, 0.25]] )



In [208]:

    
A * B  # elementwise product









    Out[208]:





array([[ 10.,  10.],
       [ 10.,  10.]])



In [209]:

    
np.dot(A, B)  # matrix product









    Out[209]:





array([[ 16.66666667,  10.        ],
       [ 43.33333333,  25.        ]])
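On Python 3.5 or later with NumPy 1.10 or later (an assumption, since the outputs in this notebook suggest an older interpreter), the matrix product can also be written with the `@` operator. A sketch comparing it with `np.dot`:

```python
import numpy as np

A = np.array([[10, 20],
              [30, 40]])
B = np.array([[1, 0.5],
              [1. / 3, 0.25]])

elementwise = A * B    # multiplies matching entries
matrix = A @ B         # rows of A times columns of B

# For 2-D arrays, @ and np.dot compute the same product.
same_as_dot = np.allclose(matrix, np.dot(A, B))

print(elementwise)
print(matrix)
```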

Copies and views, for lists and arrays



In [229]:

    
list_1 = list(range(10))  # list() keeps this a real list on Python 3 too
list_1









    Out[229]:





[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]



In [221]:

    
list_2 = list_1



In [222]:

    
id(list_1), id(list_2)









    Out[222]:





(4387029072, 4387029072)



In [223]:

    
list_1 is list_2









    Out[223]:





True



In [224]:

    
list_1 = list_2[:]



In [225]:

    
id(list_1), id(list_2)









    Out[225]:





(4387375512, 4387029072)



In [233]:

    
# equivalently...
list_1 is list_2









    Out[233]:





False



In [227]:

    
array_1 = np.arange(10)



In [236]:

    
array_2 = array_1



In [237]:

    
array_2 is array_1









    Out[237]:





True



In [231]:

    
array_2 = array_1[:]



In [232]:

    
array_2 is array_1









    Out[232]:





False



In [235]:

    
array_2[-1] = 100
array_1









    Out[235]:





array([  0,   1,   2,   3,   4,   5,   6,   7,   8, 100])



In [238]:

    
array_2 = array_1.view()



In [239]:

    
array_2 is array_1









    Out[239]:





False



In [240]:

    
array_2.base is array_1









    Out[240]:





True

If you want a deep copy, use the .copy() method.
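The difference between a view and a copy can be checked directly. The sketch below uses `np.shares_memory`, which is available in reasonably recent NumPy releases; the variable names are illustrative.

```python
import numpy as np

array_1 = np.arange(10)
view = array_1[:]         # shares the underlying data with array_1
copy = array_1.copy()     # owns its own data

view[-1] = 100            # visible through array_1
copy[0] = -1              # not visible through array_1

shares_view = np.shares_memory(array_1, view)
shares_copy = np.shares_memory(array_1, copy)

print(array_1, shares_view, shares_copy)
```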

For a more comprehensive introduction to numpy, you can read the official tutorial.

Plotting

Scikit-learn



In [241]:

    
from sklearn import datasets



In [243]:

    
from sklearn.ensemble import AdaBoostClassifier



In [242]:

    
digits = datasets.load_digits()



In [246]:

    
X, y = digits.data, digits.target



In [247]:

    
clf = AdaBoostClassifier(n_estimators=100)



In [248]:

    
clf.fit(X, y)









    Out[248]:





AdaBoostClassifier(algorithm='SAMME.R', base_estimator=None,
          learning_rate=1.0, n_estimators=100, random_state=None)



In [250]:

    
clf.score(X, y)  # mean accuracy, measured on the same data the model was trained on









    Out[250]:





0.42682248191430161
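The score above is computed on the training data, so it says little about how the classifier behaves on unseen digits. A common remedy is to hold out part of the data. The sketch below assumes a scikit-learn release that provides `sklearn.model_selection` (0.18 or later, newer than the one this notebook seems to use); the 25% split and `random_state=0` are arbitrary choices made for illustration.

```python
from sklearn import datasets
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split

digits = datasets.load_digits()
X, y = digits.data, digits.target

# Hold out 25% of the samples; random_state only makes the split repeatable.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

clf = AdaBoostClassifier(n_estimators=100)
clf.fit(X_train, y_train)

train_score = clf.score(X_train, y_train)   # accuracy on data seen during fit
test_score = clf.score(X_test, y_test)      # accuracy on held-out data

print(train_score, test_score)
```

The exact numbers depend on the scikit-learn version and on the split, so none are quoted here.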


