Why are you here ?

== Why is Python getting famous ?

== What are the benefits of Python ?

Programming is not only about solving problems. It is also about:

  • Efficiency: How long do I need to deliver a bug free implementation ?
  • Extensibility: What happens if I or others want to modify the implementation ?

General characteristics of Python:

  • clean and simple language: Easy-to-read and intuitive code, easy-to-learn minimalistic syntax

  • expressive language: Fewer lines of code, fewer bugs, easier to maintain.

  • general-purpose language, in contrast to e.g. Matlab (numerics), PHP (websites) or R (statistics). No need to mix languages for a full application which, for example, gets data from a database and provides a web service, or is integrated into a web service.

Technical details:

  • dynamically typed: No need to define the type of variables, function arguments or return types.

  • automatic memory management: No need to explicitly allocate and deallocate memory for variables and data arrays. This avoids most memory leak bugs.

  • interpreted: No need to compile the code. The Python interpreter reads and executes the Python code directly.

The main advantage is ease of programming, minimizing the time required to develop, debug and maintain the code.

Other principles:

More facts:

  • invented in 1991 by Guido van Rossum, a fan of Monty Python's Flying Circus
  • Current versions are 2.7 and 3.4, which are not compatible with each other. This course is about 2.7, but the next one will be about 3.4.
  • Python ships with many versatile modules, e.g. for math, file system access, web service access, etc. This is often summarized as "Python comes with batteries included".

Python's fields of application:

  • text processing
  • gluing other applications together
  • web frameworks, e.g. Django, Pyramid, Flask, Plone
  • administrative tasks
  • (R replacement) data analysis, aka 'big data'
  • (matlab replacement) numerics, simulations, ...
  • (everything else)

Installing Python on Windows

Installing Python on Ubuntu/Debian Linux

Global install of Python interpreter, you need root access as super user:

$ sudo apt-get install python2.7 python2.7-dev
$ python -V
Python 2.7.3 # or similar 2.7.X
$ sudo apt-get install python-qt4
$ sudo apt-get install python-setuptools
$ easy_install pip
$ easy_install virtualenv

Create a local isolated version of Python. This keeps your laptop clean and avoids version conflicts !

$ cd $HOME
$ virtualenv python_kurs

Activate this isolated version as follows. You have to do this every time before you start using this isolated environment !

$ cd python_kurs
$ . bin/activate
(python_kurs) $

Now we install a local version of needed python packages:

(python_kurs) $ pip install pygments spyder
(python_kurs) $ spyder

Spyder startup

1. Open Spyder (Windows: see start menu, others: from command line with "$ spyder")
2. Use Spyder's explorer (top right window, may be hidden in tabs) to create a folder "python_course_examples" and select this folder.
3. Menu: Interpreters -> Open IPython console
4. Enter "pwd" in the IPython console and check that the newly created directory is printed.

Edit / Execute workflow

1. Create a new Python module within the directory created above (right click in file explorer, "new module")
2. Enter "print 42" in the code editor
3. Use F5 to execute the script in the shell, take care to choose "open in existing interpreter". You should see the output from the print statement.
Python is not just an interpreter, it has a shell for trying things out and for learning. Most examples you will see here can be executed in a shell.

Console Input / Output


In [86]:
print "hello world"  # this is a comment


hello world
Commas separate output values; a comma at the end of a print statement suppresses the newline:

In [88]:
print 3, 4,
print 5


3 4 5
Note: in Python 3.X "print" is a function, so you have to put parentheses around the arguments

In [92]:
name = raw_input("your name: ")   # strange name for input function
print "Hello", name, "!!!"


your name: uwe
Hello uwe !!!
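TAKE CARE: Python also has a function "input" which is different from "raw_input": in Python 2 "input" evaluates the entered text as a Python expression, whereas "raw_input" returns it unchanged as a string.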

Variables

VARIABLES ARE NAMES FOR OBJECTS
Other languages have concepts such as pointers, references etc. Keep the sentence above in mind, it will simplify a deeper understanding of how things work in Python.
Variables in Python are just assigned, no type declaration needed !

In [93]:
pi = 3.141
my_name = "uwe schmitt"
Nevertheless these variables have a type:

In [94]:
type(pi)


Out[94]:
float

Valid variable names

allowed are: alphabetic characters, digits and underscore "_"
disallowed: digit in first place
disallowed: names which are Python statements or keywords, such as "print"

In [95]:
# examples for valid names
a = 123       
a_123 = 2
_xyz = 2

In [96]:
aBcD = 4      # uppercase matters
abcd = 5
aBcD == abcd


Out[96]:
False
Invalid variable names:

In [97]:
1abc = 3


  File "<ipython-input-97-26b7134673f5>", line 1
    1abc = 3
       ^
SyntaxError: invalid syntax

In [98]:
print = 4


  File "<ipython-input-98-15218bea882c>", line 1
    print = 4
          ^
SyntaxError: invalid syntax
Note: most Python programmers prefer "_" separation in long variable names over 'CamelCase'. Check for readability:

In [99]:
this_is_a_long_variable_name = 3

thisIsALongVariableName = 4
For later reading: http://www.python.org/dev/peps/pep-0008/

In [100]:
# names to avoid, because they hide Python built-ins:
# id, str, type, input, list, file, ...
# solution often used in the Python community: append a "_"
id_ = 3
type_ = "int"
print_ = 0

Calculating with numbers

Python has integer numbers, floating point and even complex numbers. Basic operations as in other languages:

In [101]:
print 32 * (41 + 1) / 7


192
There are two different types of division.
Floating point division works as you would expect:

In [102]:
print 29.0 / 2


14.5
Integer division rounds down to the next smaller integer:

In [103]:
print 31 / 7


4
alternative:

In [104]:
print 31 // 7


4
Note: in Python 3.x you have to use "//" for integer division, as "/" always performs floating point division there !
Modulo, a.k.a. division remainder:

In [105]:
print 31 % 7


3
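Integer division and modulo fit together: for integers a and b (b not zero), a == (a // b) * b + a % b. The following small sketch checks this, also for a negative number, where "rounding down" means rounding towards minus infinity. It is written so that it should run unchanged under Python 2.7 and Python 3; the variable names are chosen for illustration only.

```python
# integer division and remainder belong together
a, b = 31, 7
quotient = a // b       # 4
remainder = a % b       # 3
print(quotient * b + remainder == a)

# "rounds down" also holds for negative numbers
neg_quotient = -31 // 7     # -5, not -4
neg_remainder = -31 % 7     # 4, so that -5 * 7 + 4 == -31
print(neg_quotient * 7 + neg_remainder == -31)

# the built-in divmod returns quotient and remainder at once
print(divmod(31, 7) == (quotient, remainder))
```

All three lines should print True.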
Division by zero:

In [106]:
print 7 / 0


---------------------------------------------------------------------------
ZeroDivisionError                         Traceback (most recent call last)
<ipython-input-106-ead4a91e8906> in <module>()
----> 1 print 7 / 0

ZeroDivisionError: integer division or modulo by zero
Python has classical 32 or 64 bit integers, but switches to "long integers" in case of overflow. Those "long integers" may be as large as fits into memory. No surprise = no astonishment = no incidental bug.

In [107]:
3**117


Out[107]:
66555937033867822607895549241096482953017615834735226163L

Long integers can be used for calculating pi to many digits: http://www.craig-wood.com/nick/articles/pi-machin/

More math than just addition, et al:

Load a special module which ships with Python and is one of the "included batteries":

In [108]:
import math
Functions and constants are "attached" to the math module:

In [109]:
print math.pi


3.14159265359

In [110]:
print math.cos(math.pi * 2.0)


1.0

In [111]:
print math.sqrt(121.0)


11.0

In [112]:
print math.pow(2, 4)


16.0

In [113]:
print 2**4   # alternative to math.pow, preserves type


16
To explore the content of the math module:

In [114]:
print dir(math)


['__doc__', '__name__', '__package__', 'acos', 'acosh', 'asin', 'asinh', 'atan', 'atan2', 'atanh', 'ceil', 'copysign', 'cos', 'cosh', 'degrees', 'e', 'erf', 'erfc', 'exp', 'expm1', 'fabs', 'factorial', 'floor', 'fmod', 'frexp', 'fsum', 'gamma', 'hypot', 'isinf', 'isnan', 'ldexp', 'lgamma', 'log', 'log10', 'log1p', 'modf', 'pi', 'pow', 'radians', 'sin', 'sinh', 'sqrt', 'tan', 'tanh', 'trunc']
Alternative: enter "math." in the IPython shell and press the TAB key.
To show help:

In [115]:
help(math.log1p)


Help on built-in function log1p in module math:

log1p(...)
    log1p(x)
    
    Return the natural logarithm of 1+x (base e).
    The result is computed in a way which is accurate for x near zero.

alternative kinds of import:

In [116]:
from math import sin
print sin(3)


0.14112000806
The following form should be avoided: it clutters the namespace and is error prone, e.g. if one uses the math and numpy modules in one script:

In [117]:
from math import *

In [118]:
import math as m
print m.sin(0)


0.0
PRACTICE TIME
Write scripts for answering the following questions:
1. How many grains of rice do you have on a checkerboard if you put one grain on the first field and double the number of grains from each field to the next ???
   HINT: geometric sum: 1 + a^1 + a^2 + ... + a^n = (a^(n+1)-1)/(a-1)
   An average grain of rice weighs 25 mg, how many kg do you have on the board ?
   The earth has a mass of 5.972E24 kg, put this into relation to the weight of the rice.
   Use variables for the intermediate results !
2. Calculate the area of a circle with diameter 21.0 cm.
   Calculate the diameter of a circle with area 1.0 cm^2.

Some special notes

floating point math precision is limited:

In [119]:
.25 - .2


Out[119]:
0.04999999999999999
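Because of this limited precision, testing floats for exact equality is fragile. A common workaround is to compare with a small tolerance. The sketch below illustrates the idea; the tolerance value 1e-9 is an assumption and has to be chosen to fit the problem at hand.

```python
difference = 0.25 - 0.2

# exact comparison fails because of rounding
exactly_equal = (difference == 0.05)

# comparison with an (assumed) tolerance succeeds
tolerance = 1e-9
nearly_equal = abs(difference - 0.05) < tolerance

print(exactly_equal)
print(nearly_equal)
```

The first print should give False, the second True.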
Type coercion: Python finds a common type when calculating with different types:

In [120]:
type(1 + 3)


Out[120]:
int
the decimal point indicates floating point numbers:

In [121]:
type(1.0 + 3)


Out[121]:
float

complex numbers

complex unit is "j", not "i". "j" is often used in electrical engineering

In [122]:
z = 1 + 0.8j
common operations:

In [123]:
print z, abs(z), z.imag, z.real, z.conjugate()


(1+0.8j) 1.28062484749 0.8 1.0 (1-0.8j)
same algebraic operations as for floats and ints:

In [124]:
print z + z, z * z, z**3


(2+1.6j) (0.36+1.6j) (-0.92+1.888j)
POSSIBLE SOURCE OF ERROR: pure imaginary numbers must be written with a leading number, as below, to avoid a clash with a variable named "j":

In [125]:
z = 1j
print z*z


(-1+0j)
The extra module 'cmath' provides functions such as sin and cos for complex numbers; the math module does not work here:

In [130]:
math.sin(1j)


---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-130-bf9d45ed73bb> in <module>()
----> 1 math.sin(1j)

TypeError: can't convert complex to float

In [131]:
import cmath
cmath.sin(1j)


Out[131]:
1.1752011936438014j
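As a further illustration of cmath, the following sketch checks Euler's identity exp(i*pi) = -1 numerically and takes the square root of a negative number. Due to floating point rounding the first result is only approximately -1, so it is compared with an assumed tolerance of 1e-12.

```python
import cmath
import math

z = cmath.exp(1j * math.pi)
# z is approximately -1, rounding leaves a tiny imaginary part
close_to_minus_one = abs(z + 1) < 1e-12
print(close_to_minus_one)

# cmath.sqrt accepts negative numbers, math.sqrt would raise an error
root = cmath.sqrt(-1)
print(root == 1j)
```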

Logical values, Comparison operators

Logical values represent a logical state, that is "True" or "False".
"==" is used for testing equality, do not confuse it with "=" which is assignment:

In [132]:
print 3 == 4


False

In [133]:
print 3 = 4    # would be valid in C/C++ !!!


  File "<ipython-input-133-e8f8a9176a62>", line 1
    print 3 = 4    # would be valid in C/C++ !!!
            ^
SyntaxError: invalid syntax
!= is the test for inequality:

In [134]:
print 3 != 4


True
tests for different kinds of ordering:

In [135]:
print 3 >= 4, 3<=4, 3 < 4, 3 > 4


False True True False
logical operations written in "plain text":

In [136]:
print not not True


True

In [137]:
print 3 < 4 and 3 >= 4


False

In [138]:
print 3 < 4 or 3 >= 4


True
side note: & and | are bitwise operations:

In [139]:
print 6 & 12


4

In [140]:
print 6 | 12


14

Strings represent text

Python provides four alternatives for declaring a string constant:

In [141]:
print "Python"


Python

In [142]:
print 'Python'


Python
Two variants which allow strings over multiple lines:

In [143]:
print """Python:
programming done easy"""


Python:
programming done easy

In [144]:
print '''Python
programming done easy'''


Python
programming done easy
advantage: you can use quotes inside a string without using escaping as done in C/C++:

In [145]:
print "'abc' is a string"


'abc' is a string

In [146]:
print '"abc" is a string'


"abc" is a string

In [147]:
print """those are strings: "abc" and 'abc'"""


those are strings: "abc" and 'abc'

Comparing strings


In [148]:
print "abc" == "abc"


True

In [149]:
print "abc" != "ABC"


True
Strings are ordered lexicographically (phone book ordering):

In [150]:
print "abc" < "abe"


True

In [151]:
print "abc" < "abcd"


True

Indexing: accessing single characters in a string

Indexing is used to access single characters. Counting starts with zero. Matlab users: be prepared !

In [152]:
name = "Monty_Python"
print name[0]


M
Negative indices start from the end. This is convenient as you do not need to know the length of the string:

In [153]:
print name[-1]


n

In [154]:
print name[-2]


o

In [155]:
print name[-12] == name[0]


True

Slicing: beyond indexing

An expression like "string[start:end]" is called slicing. The end is exclusive !!!

In [160]:
"01234"[1:4]


Out[160]:
'123'
Extended form: "string[start:end:stepsize]". The stepsize is also called "stride". start:end is the same as start:end:1

In [163]:
print "0123456789"[1:8:2]


1357
Default cases: in a slice "string[m:n:k]" you can omit the values m, n or k. The default for k is 1, the default for m is 0 (if the step size k is positive), and the default for n is the length of the string (if the step size k is positive).

In [164]:
name = "Monty_Python"
print name[:7]


Monty_P

In [165]:
print name[7:]


ython

In [169]:
print name[3:-2:2]


t_yh

In [170]:
print name[-2:3:-2]


otPy

In [171]:
print name[:]


Monty_Python
This slicing has two nice properties:
1. For a slice m:n the length of the sliced string is just n - m, as long as 0 <= m <= n <= length of the string. (see KISS principle)

In [172]:
len(name[3:7]) == 7 - 3


Out[172]:
True
2. concatenating slices is easy to interpret

In [173]:
name[3:7] + name[7:11] == name[3:11]


Out[173]:
True
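Both properties can be checked for all valid positions at once. The following sketch does this with two loops (for loops and "if" are introduced further below); the variable names are chosen for illustration only.

```python
name = "Monty_Python"

# splitting at any position m and gluing the parts together restores the string
all_splits_ok = True
for m in range(len(name) + 1):
    if name[:m] + name[m:] != name:
        all_splits_ok = False
print(all_splits_ok)

# the length rule len(name[m:n]) == n - m holds for 0 <= m <= n <= len(name)
length_rule_ok = True
for m in range(len(name) + 1):
    for n in range(m, len(name) + 1):
        if len(name[m:n]) != n - m:
            length_rule_ok = False
print(length_rule_ok)
```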

String methods

methods = functions "attached" to "object":

In [175]:
name = "Guido"
print name.upper()


GUIDO
Note: strings are never changed in place, all operations for changing a string return a new string ! In other words: STRINGS ARE IMMUTABLE

In [177]:
print name


Guido
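What immutability means in practice can be seen when trying to overwrite a single character. The sketch below assumes some familiarity with try/except, which is not covered in this part of the course; it only serves to catch the error so that the script keeps running.

```python
name = "Guido"

# assigning to a single character is not allowed for strings
try:
    name[0] = "g"
    changed_in_place = True
except TypeError:
    changed_in_place = False

# instead, methods return a new string and leave the original untouched
lower_name = name.lower()

print(changed_in_place)
print(name)
print(lower_name)
```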

In [178]:
# first occurrence of "ui" in "Guido":
print "Guido".find("ui")


1
PRACTICE
1. Use pen and paper to determine the values of the following expressions, then use the Python shell to validate your results:
   "abcdefghi"[2:-2]
   "abcdefghi"[3:8:2]
   "abcdefghi"[7:4:-1]
   "abcdefghi"[::2]
   "abcdefghi"[::-1]
   "abcdefghi"[:7:3]
   "abcdefghi"[2::2]
2. Look up what the "replace" method for strings does, and use it to write a script which starts with a string "123", transforms this to "one_23", then to "one_two_three_" and finally to "one_two_three" !
3. Look up the string methods "endswith" and "startswith". Do you see any benefit compared to string comparison with "==" ???
4. Enter the following snippet as a script and try to explain its output:
   number = raw_input("please input number: ")
   print number, "times 3 is", number * 3

'Calculating' with strings


In [179]:
name = "uwe"
action = "says hello"
print name +  " " + action + "!"   # "+" concatenates strings


uwe says hello!

In [180]:
# strings can be multiplied with an integer
print "123_" * 4


123_123_123_123_

In [181]:
# the position of the integer factor does not matter
# addition is something we have seen above
print 2 * "123_" + "abc"


123_123_abc

String to number conversion


In [185]:
number = raw_input("please input number: ")
print number, "times 3 is", number * 3


please input number: 1
1 times 3 is 111
float() and int() can be used to convert a string representing a number to the number's value:

In [189]:
# correct
number = raw_input("please input number: ")
print number, "times 3 is", float(number) * 3


please input number: 1
 1 times 3 is 3.0

In [190]:
print int("123") * float("3.141")


386.343

In [191]:
print int("a3")


---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-191-7d7f3234561e> in <module>()
----> 1 print int("a3")

ValueError: invalid literal for int() with base 10: 'a3'
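If the input may not be a valid number, the ValueError can be caught instead of letting the script stop. The following is a minimal sketch; the helper name to_number and its default parameter are hypothetical and not part of Python, and try/except is not covered in this part of the course.

```python
def to_number(text, default=None):
    """hypothetical helper: convert text to a float, return default if that fails"""
    try:
        return float(text)
    except ValueError:
        return default

print(to_number("3.141"))
print(to_number("a3"))
print(to_number("a3", 0.0))
```

This way a script can decide for itself how to react to invalid input, for example by asking the user again.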

Variables, interlude

We know some Python types already:

In [192]:
print type(3)


<type 'int'>

In [193]:
print type(3.0)


<type 'float'>

In [194]:
print type("")


<type 'str'>

In [195]:
print type(True)


<type 'bool'>
Note for C/C++ programmers: variables can be reassigned. In the case below the string "hi" does not have a name any more after the second assignment, and the memory it occupies will be freed automatically.

In [197]:
a = "hi"
a = a + " there"
print a


hi there
We already know that we do not need to declare a variable's type. Beyond that, the type is allowed to change during execution:

In [198]:
a = 123
print type(a), a
a = "i am a"
print type(a), a


<type 'int'> 123
<type 'str'> i am a

For loops

Simple loop counting 0 up to n-1 is done with a range(n), the upper limit of this range is exclusive, counting starts with zero.

In [199]:
for i in range(3): # aka for (int i=0; i<3; i++) in C/C++
    print i


0
1
2
Code blocks in Python do not need any braces or "begin" and "end" statements. Lines having the same indentation belong to the same block of code. Most Python programmers use 4 spaces, no tabs.

In [200]:
for i in range(3):
    print "i=",
    print i


i= 0
i= 1
i= 2

In [201]:
for i in range(3):
    print "i=",
   print i


  File "<ipython-input-201-f44eae0d3110>", line 3
    print i
           ^
IndentationError: unindent does not match any outer indentation level
The range function is similar to slicing:

In [202]:
for i in range(1, 5):
    print i


1
2
3
4

In [203]:
for i in range(1, 5, 2):
    print i


1
3
Loops can be nested by further indentation:

In [204]:
for i in range(1, 3):      
    for j in range(2, 4):  
        print i, "times", j, "is", i*j


1 times 2 is 2
1 times 3 is 3
2 times 2 is 4
2 times 3 is 6

Control Flow

If / then / else

There is no "else if" in Python, only "elif". As for the "for" statement, the lines introducing a block end with a colon:

In [208]:
number = int(raw_input("input number: "))

if number < 0:
    print "number is negative"
elif number == 0:
    print "number is zero"
else:
    print "number is positive"


input number: 3
number is positive

While statement

Python only has a "while", no "do until", or "do while" where the loop continuation is checked at the end of the loop:

In [209]:
x = 3
while x > 0:
    print x
    x -= 1


3
2
1

Breaking out of a loop


In [210]:
for i in range(10):
    print i
    if i > 1:
        print "stop"
        break


0
1
2
stop
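With "break" the missing "do ... while" loop can be emulated: an endless "while True" loop runs the body first and checks the condition at the end. The sketch below contrasts this with a plain while loop; the counter names are chosen for illustration only.

```python
# plain while: the condition is checked first, the body never runs for x = 0
x = 0
runs_while = 0
while x > 0:
    runs_while += 1
    x -= 1

# emulated "do ... while": the body runs once before the condition is checked
x = 0
runs_do_while = 0
while True:
    runs_do_while += 1
    x -= 1
    if not x > 0:
        break

print(runs_while)
print(runs_do_while)
```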

Skipping execution of rest of loop body with "continue"


In [211]:
for i in range(5):
    print i,
    if i % 2 == 0:
        print "is even"
        continue
    print "is odd"


0 is even
1 is odd
2 is even
3 is odd
4 is even
PRACTICE:
1) Write a script which asks for two numbers a, b and calculates a^2 + a^4 + .. + a^(2b). You will need to write a for loop. Try to write two different solutions, using different step sizes for the exponent.
2) Write a program which asks for two numbers, lets the user choose an operation +, -, *, /, applies this operation to the two numbers and prints the result.
3) Extension: after each computation, ask if the user wants to do another calculation, and handle this accordingly.
4) Let the user input a number and check if this is a prime number.

Python container types: lists

The range function introduced above returns something called "a list":

In [212]:
print type(range(3))


<type 'list'>
In most cases a for loop does not just count up to a limit, but "iterates" over an "object you can iterate over".

In [213]:
for i in [3, 4, 7, 2]:
    print i, "squared is", i*i


3 squared is 9
4 squared is 16
7 squared is 49
2 squared is 4
Side Note: you can iterate over strings too:

In [218]:
for ci in "Python":
    print ci, chr(ord(ci)+1)  # ord returns ascii code of character, chr is inverse


P Q
y z
t u
h i
o p
n o
The values in a list may have arbitrary type:

In [219]:
print [3, "hello", 3.12]


[3, 'hello', 3.12]
To check if an element is in a list, use "in"

In [220]:
print 3 in [1, 2, 4]


False
Good practice: "not in" is more readable than "if not ... in ...":

In [223]:
print 3 not in [1, 2, 4]


True
the "append" method adds an element to a list:

In [224]:
numbers = [1, 3, 2]
numbers.append(4)
print numbers


[1, 3, 2, 4]
Lists support slicing:

In [225]:
print numbers[1:3]


[3, 2]
"len" returns the length of a list:

In [226]:
print len(numbers)


4
[] is the empty list:

In [227]:
print type([]), len([])


<type 'list'> 0
list "algebra": adding and multiplication with lists

In [228]:
print [0, 11] * 3


[0, 11, 0, 11, 0, 11]

In [229]:
print [1, 2, 3] + [5, 6]


[1, 2, 3, 5, 6]
Note: contrary to strings, lists are mutable, so many list methods change the list in-place.

In [230]:
print numbers


[1, 3, 2, 4]

In [231]:
numbers[0] = 2
print numbers


[2, 3, 2, 4]

In [232]:
numbers.remove(2)
print numbers


[3, 2, 4]
Example of a pitfall: removing elements from a list while iterating over it skips elements:

In [233]:
li = [3, 4, 2]
# remove all even numbers:
for i in li:
    if i % 2 == 0:
        li.remove(i)
print li


[3, 2]

In [234]:
# PROBLEM above: the list is changed while iterating over it
li = [3, 4, 2]
# remove all even numbers, iterate over a copy of li:
for i in li[:]:
    if i % 2 == 0:
        li.remove(i)
print li


[3]
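An alternative to removing elements is to leave the original list untouched and collect the wanted elements in a new list. The following sketch shows this approach; the name odd_numbers is chosen for illustration only.

```python
li = [3, 4, 2]

# collect the elements we want to keep in a new list
odd_numbers = []
for i in li:
    if i % 2 != 0:
        odd_numbers.append(i)

print(odd_numbers)
print(li)   # the original list is unchanged
```

This avoids the question of what happens to a list that is modified during iteration.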
PRACTICE TIME
1. Create a list and use the IPython shell to find out how to sort the list. How can you sort in reversed order ?
2. Let the user input a number > 3. Create a list with the Fibonacci numbers up to this value. [Start with [1, 1], then iteratively append the sum of the last two elements. That is, you get a sequence 1, 1, 2, 3, 5, 8, 13, 21, ....]
3. Write a script which takes a number of values and prints those numbers concatenated without spaces. Hint: first transform [1, 3, 12] -> ["1", "3", "12"], then look up help("".join)

Container types: tuples

Tuples are defined like lists, but with parentheses instead of square brackets:

In [235]:
for i in (1, 2, 3):
    print i


1
2
3
Contrary to lists, tuples are immutable:

In [236]:
tp = (1, 2, 3)
tp.append(4)


---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-236-80b2ba7e971d> in <module>()
      1 tp = (1, 2, 3)
----> 2 tp.append(4)

AttributeError: 'tuple' object has no attribute 'append'
Although lists can contain values of different types, most Python programmers prefer tuples for grouping objects of different types.
Slicing again......

In [237]:
tp = (1, 2, 3, 4)
print tp[-3:-1]


(2, 3)

Tuple unpacking

Values grouped in a tuple can be "ungrouped":

In [238]:
tp = (1, 2, 3)
a, b, c = tp
print a * b * c


6
Python tries to interpret values separated by commas as tuples, so most tuples can be written without parentheses:

In [239]:
a, b, c = 1, 2, 3

In [240]:
x = 1,
print type(x), len(x)


<type 'tuple'> 1
() is the empty tuple:

In [241]:
print type(()), len(())


<type 'tuple'> 0
tuple unpacking is convenient for swapping and other permutations of values. No temporary variables are needed:

In [242]:
a, b = b, a
print a, b


2 1
Pairing the elements of two lists is often needed. Python provides the zip function for this:

In [243]:
a_values = [1, 3 ,5]
b_values = [2, 4, 6]
print zip(a_values, b_values)


[(1, 2), (3, 4), (5, 6)]
Tuple unpacking works in many places, e.g. in for loops:

In [244]:
for a, b in zip(a_values, b_values):
    print a + b


3
7
11
Tuple unpacking can be used when enumerating an "iterable":

In [245]:
words = ["you", "me", "and", "i"]
for i, word in enumerate(words):
    print "word", i, "is", word


word 0 is you
word 1 is me
word 2 is and
word 3 is i

Python collections: dictionaries

Dictionaries represent a mapping of keys to values. This is also known as "map", "hash map", etc. in C++ or Java. Dictionaries are defined like this:

In [246]:
age_of = { "jan": 18, "alan": 27, "universe": 13.8e9 }

This corresponds to the following table:

key        mapped to
jan        18
alan       27
universe   13.8e9
To access the value assigned to a key square brackets are used:

In [247]:
print age_of["universe"]


13800000000.0
As there is no key "god" in age_of, this does not work:

In [248]:
print age_of["god"]


---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
<ipython-input-248-a6fca65d74ed> in <module>()
----> 1 print age_of["god"]

KeyError: 'god'
But Python has a dictionary method for providing default values for undefined keys:

In [249]:
print age_of.get("god", "unknown")


unknown
"in" tests if a given key exists:

In [250]:
print "uwe" in age_of


False
Keys and values of a dictionary can be extracted as lists ...

In [251]:
print age_of.keys(), age_of.values()


['jan', 'universe', 'alan'] [18, 13800000000.0, 27]
... or as (key, value) tuples in a list:

In [252]:
print age_of.items()


[('jan', 18), ('universe', 13800000000.0), ('alan', 27)]
tuple unpacking again:

In [254]:
for who, age in age_of.items():
    print who, "is", age, "years old"


jan is 18 years old
universe is 13800000000.0 years old
alan is 27 years old
Example: counting characters

In [255]:
a = "abcdabcdfdcdcabcabcabcabcabcaaaab"
counter = dict()
for ai in a:
    if ai not in counter:
        counter[ai] = 0
    counter[ai] += 1
print counter["a"]


11
shorter:

In [259]:
a = "abcdabcdfdcdcabcabcabcabcabcaaaab"
counter = dict()
for ai in a:
    counter[ai] = counter.get(ai, 0) + 1
print counter["a"]


11
even shorter:

In [260]:
from collections import Counter
a = "abcdabcdfdcdcabcabcabcabcabcaaaab"
counter = Counter(a)
print counter["a"]


11
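A Counter offers more than plain counting. The sketch below uses its most_common method to get the most frequent characters and shows that a missing key yields 0 instead of a KeyError. It is written to run under Python 2.7 and Python 3.

```python
from collections import Counter

a = "abcdabcdfdcdcabcabcabcabcabcaaaab"
counter = Counter(a)

# the two most frequent characters with their counts
two_most_common = counter.most_common(2)
print(two_most_common)

# unlike a plain dictionary, a Counter returns 0 for unknown keys
print(counter["z"])
```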

Python collections: sets

compared to lists, sets have no order and no duplicate elements. They can be constructed from a list or a tuple:

In [261]:
a = set((1, 2, 3))
b = set([2, 3, 4])
print a, b


set([1, 2, 3]) set([2, 3, 4])
set union:

In [262]:
print a | b


set([1, 2, 3, 4])

In [264]:
print a.union(b)


set([1, 2, 3, 4])
set intersection

In [263]:
print a & b


set([2, 3])

In [265]:
print a.intersection(b)


set([2, 3])
set difference:

In [266]:
print a - b


set([1])

In [267]:
print a.difference(b)


set([1])
set() is the empty set:

In [268]:
print type(set()), len(set())


<type 'set'> 0
PRACTICE TIME:
1. Enter and try to understand:
   a = "abcdabcdfdcdcabcabcabcabcabcaaaab"
   counter = dict()
   for ai in a:
       print "ai=", ai
       counter[ai] = counter.get(ai, 0) + 1
       print "counter=", counter
2. Write a one line expression which counts the unique elements in a given list [1, 3, 1, 2, 7, 3, 2]
3. Write a script which transforms a given string by replacing a->x e->y i->z o->j u->q. Use a dictionary. Hint: first create a list, then use the string's join method !

Collections in collections

As everything in Python is an "object" and the proposed collection types group all kinds of "objects", one can deeply nest collections:

In [269]:
a = [1, [2,3], (3, 4, 5), "abc", { 3: 4 }]
for ai in a:
    print ai


1
[2, 3]
(3, 4, 5)
abc
{3: 4}

In [270]:
a[1].append(4)
print a


[1, [2, 3, 4], (3, 4, 5), 'abc', {3: 4}]

In [271]:
numbers = { "even": [2, 4, 6], "odd": [1, 3, 5] }

In [272]:
print numbers["even"][0]


2

Restrictions for collections in collections

Keys in dictionaries and items in sets must be "hashable". We will not discuss this in depth, but remember: "lists are not hashable". So the following statements fail:

In [273]:
a = { [1,2] : 2 }


---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-273-2d0cf91d660b> in <module>()
----> 1 a = { [1,2] : 2 }

TypeError: unhashable type: 'list'

In [274]:
a = set()
a.add([1, 2])


---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-274-d58d292c1821> in <module>()
      1 a = set()
----> 2 a.add([1, 2])

TypeError: unhashable type: 'list'
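Tuples of numbers or strings are hashable, so they can be used where lists fail. The following sketch uses tuples as dictionary keys and as set items; the names distance and points are chosen for illustration only.

```python
# tuples of numbers work as dictionary keys
distance = {(0, 0): 0.0, (3, 4): 5.0}
print(distance[(3, 4)])

# and as items of a set
points = set()
points.add((1, 2))
points.add((1, 2))   # adding the same tuple again changes nothing
print(len(points))
```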

List comprehensions

Python has a powerful construct which is similar to the math notation { f(x) | x \in A and p(x) }.
Create a list with values 3*0 ... 3*9:

In [275]:
values = [3 * a for a in range(10)]
print values


[0, 3, 6, 9, 12, 15, 18, 21, 24, 27]
Another example:

In [276]:
print [3*i for i in values if i % 7 == 6]


[18, 81]
This can replace many simple loops and is very readable !
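To see what a list comprehension does, it helps to write the same computation as an explicit loop. The sketch below repeats the second example from above in both forms; the names result_comprehension and result_loop are chosen for illustration only.

```python
values = [3 * a for a in range(10)]

# list comprehension
result_comprehension = [3 * i for i in values if i % 7 == 6]

# the same computation written as an explicit loop
result_loop = []
for i in values:
    if i % 7 == 6:
        result_loop.append(3 * i)

print(result_comprehension == result_loop)
```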
PRACTICE:
1. For a given list of numbers [ai] write a list comprehension for computing the numbers [2*ai+1], e.g. [2, 3, 4] -> [5, 7, 9]
2. Use a nested list comprehension for computing [[1, 2, 3], [2, 4, 6], [3, 6, 9]]

FUNCTIONS

In Python "def" declares a function:

In [277]:
def add(a, b, c):
    return a + b + c

In [278]:
print add(1, 2, 3)


6
But arguments are not typed. The function works if "+" is somehow defined:

In [279]:
print add("a", "bc", "def")


abcdef
This one fails, because 1 + "a" + 3 is not defined:

In [280]:
print add(1, "a", 3)


---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-280-1120d14149a2> in <module>()
----> 1 print add(1, "a", 3)

<ipython-input-277-d3a5ee10ec2c> in add(a, b, c)
      1 def add(a, b, c):
----> 2     return a + b + c

TypeError: unsupported operand type(s) for +: 'int' and 'str'
For mixed types:

In [281]:
def mul(a, b):
    return a * b

In [282]:
print mul(3, 4)


12

In [283]:
print mul(3, "abc")


abcabcabc

Keyword arguments

You can arrange the arguments in arbitrary order when calling a function if you use "keyword arguments":

In [284]:
add(c=1, b=2, a=3)


Out[284]:
6
You can mix this, first argument is positional argument, followed by two keyword arguments:

In [285]:
add(1, c=2, b=3)


Out[285]:
6
But this fails:

In [286]:
add(a=2, b=3, 3)


  File "<ipython-input-286-dd0fb32a1ea6>", line 1
    add(a=2, b=3, 3)
SyntaxError: non-keyword arg after keyword arg

Default arguments

Some arguments may have default values, which are used if you omit them when calling a function:

In [287]:
def greet(name, salute_with="hello"):
    print salute_with, name, "!"

greet("jan")


hello jan !

In [288]:
greet("urs", "gruezi")


gruezi urs !

Arbitrary number of arguments

If you want to declare a function which takes an arbitrary number of arguments, you can declare the function as follows:

In [289]:
# arbitrarily many arguments:
def multiply(v0, *values):
    print "got input arguments", v0, "and", values
    product = v0
    for v in values:
        product *= v
    return product
 
print multiply(2, 1, 13, 1)


got input arguments 2 and (1, 13, 1)
26
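The star also works in the other direction: when calling a function, "*" unpacks a list or tuple into separate arguments. The sketch below repeats the multiply function without its print statement, so that it runs under Python 2.7 and Python 3; the name factors is chosen for illustration only.

```python
def multiply(v0, *values):
    product = v0
    for v in values:
        product *= v
    return product

factors = [2, 1, 13, 1]

# multiply(*factors) is the same as multiply(2, 1, 13, 1)
print(multiply(*factors) == multiply(2, 1, 13, 1))
```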

Multiple return values

Thanks to tuples, Python functions can return more than one value:

In [290]:
def func(a, b, c):
    result1 = a + b * c
    result2 = a - b / c
    return result1, result2
Tuple unpacking makes it easy to grab multiple return values:

In [291]:
r1, r2 = func(1, 2, 3)
print r1, r2


7 1

Doc strings

The first string below a function declaration is called a doc string. Calling help() on this function will return this string.


In [292]:
def magic(a, b, c):
    """ this function is magic !
        it take a, b, c and returns the sum 
    """
    return a + b + c

In [293]:
help(magic)


Help on function magic in module __main__:

magic(a, b, c)
    this function is magic !
    it take a, b, c and returns the sum

Functions are "objects" like numbers, lists, ...

You can assign a function to a variable:


In [294]:
new_name = magic
print new_name(1, 2, 3)


6

And you can pass a function as an argument to a function:


In [295]:
def eval_function(fun, *a):
    return fun(*a)

print eval_function(set, (1, 2, 3, 4, 5))


set([1, 2, 3, 4, 5])
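BE CAREFUL: [] as default argument (one of the few dark corners of Python). The default list is created only once, when the function is defined, and is then shared between all calls: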

In [296]:
def add(a, li= []):
    li.append(a)
    return li

li = add(3)
li2 = add(4)
print li


[3, 4]

In [297]:
print li is li2


True
Solution:

In [298]:
def add(a, li=None):
    if li is None:
        li = []
    li.append(a)
    return li

li = add(3)
li2 = add(4)
print li


[3]

In [299]:
print li is li2


False
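The shared default value can be made visible: a function object stores its default values in the attribute __defaults__. The sketch below inspects this attribute before and after two calls of the problematic version of add; the names defaults_before and defaults_after are chosen for illustration only.

```python
def add(a, li=[]):
    li.append(a)
    return li

# copy of the stored default value before any call
defaults_before = list(add.__defaults__[0])

add(3)
add(4)

# the stored default value itself has grown
defaults_after = list(add.__defaults__[0])

print(defaults_before)
print(defaults_after)
```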
PRACTICE:
1. Write a function which takes the radius of a circle and returns the circle's area.
2. Write a function which takes two numbers m, n and returns the sum and the product of 1*1 ... m*n.
3. Modify this function so that a missing "m" is treated as m=1.
4. Write a function with arbitrary arguments a1, .. an which returns a1 * a2 + a2 * a3 + a3 * a4 + ...

CLASSES

* Objects group values and functions, which we call methods in this context.
* A class describes how an object is created and defines the methods.
* An object is also called "instance of the class".

In [307]:
# first version

class SimpleAddress(object):
    
    """This class represents a person living in a 
       city.
    """
       
    def say_hello(self, formula):
        """ greets the person """
        print formula, self.name, "from", self.city

# manual setting of attributes:
a = SimpleAddress()
a.name = "tina turner"
a.city = "zürich"

# when calling say_hello 'self' is replaced by 'a':
a.say_hello("gruezi")

# TODO: insert print statements


gruezi tina turner from zürich

In [308]:
class SimpleAddress(object):
        
    def __init__(self, name, city):
        self.name = name
        self.city = city
       
    def say_hello(self, formula):
        print formula, self.name, "from", self.city

# this 'constructs' an instance a and calls __init__ with self
# replaced by a and following arguments "tina turner", "zürich"
a = SimpleAddress("tina turner", "zürich")

# when calling say_hello 'self' is replaced by 'a':
a.say_hello("gruezi")


gruezi tina turner from zürich
* "self" is used by convention, you could use "this" or anything else
* You have to declare this parameter when defining a method
* Normally you don't pass it when calling the method on a given object

What is going on?


In [309]:
# 1st: methods are attached to the class:
print SimpleAddress.say_hello


<unbound method SimpleAddress.say_hello>

In [311]:
print SimpleAddress.__init__


<unbound method SimpleAddress.__init__>

In [312]:
# 2nd: a.say_hello("gruezi") is handled internally as follows:
SimpleAddress.say_hello(a, "gruezi")


gruezi tina turner from zürich

In [314]:
# a = SimpleAddress("tina turner", "zürich") is handled as follows:
a = SimpleAddress.__new__(SimpleAddress)
SimpleAddress.__init__(a, "tina turner", "zürich")
print a.name


tina turner
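The two mechanisms can be checked in one small, self-contained sketch. The class Greeter is a hypothetical stand-in for SimpleAddress; its method returns the text instead of printing it so that the two calls can be compared. print is written with parentheses so the snippet also runs under Python 3:

```python
class Greeter(object):

    def __init__(self, name):
        self.name = name

    def greeting(self, formula):
        # return the text instead of printing it, so results can be compared
        return formula + " " + self.name

g = Greeter("tina")
via_instance = g.greeting("hello")         # usual call, self is g
via_class = Greeter.greeting(g, "hello")   # same call with self written out

# construction split into its two steps:
g2 = Greeter.__new__(Greeter)              # creates an empty instance
Greeter.__init__(g2, "tina")               # sets its attributes

print(via_instance == via_class)
print(g2.name)
```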
PRACTICE TIME: Write a class which ...

Overloading operations

Some methods have a special meaning. We have seen __init__ above. We declare __add__ and __str__ below:

In [315]:
class Vector3D(object):
    
    def __init__(self, x, y, z):
        self.x = x
        self.y = y
        self.z = z
        
    def __add__(self, other):
        return Vector3D(self.x + other.x,
                        self.y + other.y,
                        self.z + other.z)
        
    def __str__(self):
        return "Vector(%f, %f, %f)" % (self.x, self.y, self.z)
    
v1 = Vector3D(1.0, 2.0, 3.0)
__str__ defines how an object is converted to a string:

In [316]:
print str(v1)


Vector(1.000000, 2.000000, 3.000000)
This transformation is automatically called when printing an object, so __str__ is a convenient way to return printable information about an object:

In [317]:
print v1


Vector(1.000000, 2.000000, 3.000000)
Using "+" with objects on both sides calls __add__. In the following example __add__ is called with arguments self=v1 and other=v2:

In [318]:
v2 = Vector3D(2.0, 0.0, -1.0)
v3 = v1 + v2
print v3


Vector(3.000000, 2.000000, 2.000000)
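Other special methods follow the same pattern. A minimal sketch with a made-up class Basket, which supports len(), the "in" operator and "==". print is written with parentheses so the snippet also runs under Python 3:

```python
class Basket(object):

    def __init__(self, items):
        self.items = list(items)

    def __len__(self):
        # called by len(basket)
        return len(self.items)

    def __contains__(self, item):
        # called by "item in basket"
        return item in self.items

    def __eq__(self, other):
        # called by basket == other; the order of the items is ignored
        return sorted(self.items) == sorted(other.items)

b = Basket(["apple", "pear"])
print(len(b))
print("pear" in b)
print(b == Basket(["pear", "apple"]))
```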
These were only a few examples; there are many other special methods such as __mul__, __len__, __contains__ etc.
PRACTICE TIME

1. Implement a method Vector3D.scale which takes a number and scales the vector by this number
2. Implement a method __mul__ which calculates the dot product of two vectors

Inheritance


In [319]:
class Dog(object):
    
    def __init__(self, name):
        # print "\n.. this is Dog.__init__, self is", self
        self.name = name
        
    def greet(self):
        # print "\n.. this is Dog.greet, self is", self
        print "hi", self.name
        
    def say(self):
        # print "\n.. this is Dog.say, self is", self
        print "barf"
          
d = Dog("hasso")
d.greet()
d.say()


hi hasso
barf

In [320]:
class SuperDog(Dog):
    
    def __init__(self):
        #print "\n.. this is SuperDog.__init__, self is", self
        super(SuperDog, self).__init__("fifi")
    
    def say(self): 
        #print "\n.. this is SuperDog.say, self is", self
        print "BARF !!!"
        
sd = SuperDog()
sd.greet()
sd.say()


hi fifi
BARF !!!
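In the example above SuperDog inherits greet from Dog and overrides say. The sketch below shows one more consequence: an inherited method automatically uses the overridden method of the subclass. Animal and Cat are hypothetical classes for this illustration, and print is written with parentheses so the snippet also runs under Python 3:

```python
class Animal(object):

    def __init__(self, name):
        self.name = name

    def sound(self):
        return "..."

    def describe(self):
        # describe is inherited; it calls whichever sound() the object has
        return self.name + " says " + self.sound()

class Cat(Animal):

    def sound(self):
        # overrides Animal.sound
        return "miaow"

c = Cat("tom")
print(c.describe())

# an instance of the subclass is also an instance of the base class
print(isinstance(c, Animal))
```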

File I/O

The following example opens a file for writing:

In [321]:
fp = open("text.txt", "w")
print type(fp)


<type 'file'>
So "fp" is an instance of class file. There are two ways to write to this file:

In [322]:
fp.write("line 1\n")
print >> fp, "line 2"
fp.close()
Reading the full content of a file is done by calling "read" on the file object:

In [324]:
print open("text.txt", "r").read()


line 1
line 2

Calling "readlines" on the file object returns a list containing the separate lines of the file:

In [325]:
print open("text.txt", "r").readlines()


['line 1\n', 'line 2\n']
Iterating over a file instance works:

In [326]:
for line in open("text.txt", "r"):
    print repr(line)


'line 1\n'
'line 2\n'

In [327]:
# use enumeration
for (i, line) in enumerate(open("text.txt", "r")):  # tuple unpacking
    print "line", i, "is", repr(line)


line 0 is 'line 1\n'
line 1 is 'line 2\n'
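The reading examples above never call close on the files they open. A common alternative is the "with" statement, which closes the file at the end of the block, even if an error occurs inside it. A minimal sketch; the file name and the temporary directory are only there to keep the example self-contained, and print is written with parentheses so the snippet also runs under Python 3:

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "with_example.txt")

with open(path, "w") as fp:
    fp.write("line 1\n")
    fp.write("line 2\n")

# leaving the "with" block has closed the file
print(fp.closed)

with open(path, "r") as fp:
    lines = fp.readlines()
print(lines)
```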
PRACTICE TIME: Iterate over all files with extension ".py" in a given directory and print the number of lines in each file. Provide an overall sum at the end. HINT: "import glob"

Exceptions

We already know one exception:

In [328]:
print 1/0


---------------------------------------------------------------------------
ZeroDivisionError                         Traceback (most recent call last)
<ipython-input-328-e19d6e6ac7e1> in <module>()
----> 1 print 1/0

ZeroDivisionError: integer division or modulo by zero
One can raise this exception in a controlled way:

In [329]:
raise ZeroDivisionError("this is some extra information which is printed below")


---------------------------------------------------------------------------
ZeroDivisionError                         Traceback (most recent call last)
<ipython-input-329-0913c8b046b9> in <module>()
----> 1 raise ZeroDivisionError("this is some extra information which is printed below")

ZeroDivisionError: this is some extra information which is printed below
One can catch exceptions and handle them:

In [330]:
int("abc")


---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-330-908f2fe4faa4> in <module>()
----> 1 int("abc")

ValueError: invalid literal for int() with base 10: 'abc'

In [331]:
try:
    int("abc")
except ValueError:
    print '"abc" is not a valid number'


"abc" is not a valid number
With exceptions, errors do not have to be encoded in function return values. The caller of a function can decide at which level an error is handled:

In [332]:
def divide(a, b):
    return a / b

def secure_divide(a, b):
    try:
        result = divide(a, b)
    except ZeroDivisionError:
        result = None
    return result

print secure_divide(12, 4), secure_divide(3, 0)


3 None
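What "deciding at which level an error is handled" means can be sketched as follows: the inner function does no error handling at all, and the calling function chooses to skip bad input and to remember it. Both function names are made up for this example, and print is written with parentheses so the snippet also runs under Python 3:

```python
def parse_number(text):
    # no error handling here: int() raises ValueError for bad input
    return int(text)

def sum_valid_numbers(texts):
    # this caller decides to skip bad entries and to remember them
    total = 0
    rejected = []
    for text in texts:
        try:
            total += parse_number(text)
        except ValueError:
            rejected.append(text)
    return total, rejected

total, rejected = sum_valid_numbers(["1", "abc", "41"])
print(total)
print(rejected)
```

Another caller of parse_number could just as well let the exception pass through and stop the program.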

Write your own module


In [333]:
# save the example above in a file "secure_math.py"
import secure_math
print secure_math.secure_divide(12, 4)


---------------------------------------------------------------------------
ImportError                               Traceback (most recent call last)
<ipython-input-333-9168c65284b5> in <module>()
      1 # save the example above in a file "secure_math.py"
----> 2 import secure_math
      3 print secure_math.secure_divide(12, 4)

ImportError: No module named secure_math
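The ImportError above appears because the file "secure_math.py" had not been saved when the cell was executed. The following sketch writes the file from Python and imports it afterwards. Writing the file into a temporary directory and extending sys.path are choices made only to keep this example self-contained; normally you save the file with an editor next to your script. print is written with parentheses so the snippet also runs under Python 3:

```python
import os
import sys
import tempfile

source = '''
def divide(a, b):
    return a / b

def secure_divide(a, b):
    try:
        return divide(a, b)
    except ZeroDivisionError:
        return None
'''

# write the module file into a fresh directory ...
module_dir = tempfile.mkdtemp()
with open(os.path.join(module_dir, "secure_math.py"), "w") as fp:
    fp.write(source)

# ... and tell Python to look for modules in this directory as well
sys.path.insert(0, module_dir)

import secure_math
print(secure_math.secure_divide(12, 4))
print(secure_math.secure_divide(3, 0))
```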
TOPICS NOT COVERED

* generators
* decorators
* module packages
* package installation

In [334]:
!pip install requests


Requirement already satisfied (use --upgrade to upgrade): requests in /home/uweschmitt/python_kurs/lib/python2.7/site-packages
Cleaning up...

In [337]:
import requests
print requests.get("http://telize.com/jsonip").json()


{u'ip': u'188.97.232.201'}

numpy = container types for vectors, matrices and n-th order tensors

Vectors


In [345]:
import numpy as np
x = np.array((1.0, 2, 3))

In [346]:
print x


[ 1.  2.  3.]
A numpy array has one common type for all elements:

In [347]:
print x.dtype


float64

In [348]:
# indexing and slicing
print x[0], x[::2]


1.0 [ 1.  3.]

In [349]:
# assignment:
x[::2] += 1
print x


[ 2.  2.  4.]

In [350]:
print 2 * x + x


[  6.   6.  12.]

In [351]:
print x * x


[  4.   4.  16.]

In [352]:
print np.dot(x, x)


24.0

In [353]:
print np.sin(x)


[ 0.90929743  0.90929743 -0.7568025 ]

In [354]:
print x.shape


(3,)
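The arithmetic above differs from the behaviour of plain Python lists. A short sketch comparing both, using example values chosen for this illustration; print is written with parentheses so the snippet also runs under Python 3:

```python
import numpy as np

values = [1.0, 2.0, 3.0]
arr = np.array(values)

# for a list "*" repeats the list, for an array it multiplies every element
doubled_list = 2 * values
doubled_arr = 2 * arr

# np.dot of two vectors is the sum of the element-wise products
dot = np.dot(arr, arr)
same = np.sum(arr * arr)

print(doubled_list)
print(doubled_arr)
print(dot == same)
```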

Special vector creation


In [355]:
# range with known stepsize 0.3
np.arange(0, 2, 0.3)


Out[355]:
array([ 0. ,  0.3,  0.6,  0.9,  1.2,  1.5,  1.8])

In [356]:
# range with known number of points
np.linspace(0, np.pi, 10)


Out[356]:
array([ 0.        ,  0.34906585,  0.6981317 ,  1.04719755,  1.3962634 ,
        1.74532925,  2.0943951 ,  2.44346095,  2.7925268 ,  3.14159265])

Matrices


In [369]:
mat = np.array(((1, 2, 3), (2, 3, 4)), dtype=float)

In [370]:
print mat


[[ 1.  2.  3.]
 [ 2.  3.  4.]]

In [371]:
print mat.shape


(2, 3)
"*" is element wise, not "matrix multiplication":

In [372]:
mat * mat


Out[372]:
array([[  1.,   4.,   9.],
       [  4.,   9.,  16.]])
np.dot computes the matrix-vector product (and, for two matrices, the matrix product):

In [373]:
np.dot(mat, x)


Out[373]:
array([ 18.,  26.])
Transpose of mat:

In [374]:
print mat.T


[[ 1.  2.]
 [ 2.  3.]
 [ 3.  4.]]

In [375]:
print np.dot(mat, mat.T)


[[ 14.  20.]
 [ 20.  29.]]

Special matrices


In [376]:
print np.eye(3)


[[ 1.  0.  0.]
 [ 0.  1.  0.]
 [ 0.  0.  1.]]

In [377]:
print np.zeros((2, 3))


[[ 0.  0.  0.]
 [ 0.  0.  0.]]

In [378]:
np.diag(np.arange(1, 4))


Out[378]:
array([[1, 0, 0],
       [0, 2, 0],
       [0, 0, 3]])

In [379]:
# complex matrices

In [380]:
mat = np.array(((1.0, 1+1j), (1-1j, 2.0)), dtype=complex)
print mat


[[ 1.+0.j  1.+1.j]
 [ 1.-1.j  2.+0.j]]

In [381]:
print np.dot(mat, mat.T.conj())


[[ 3.+0.j  3.+3.j]
 [ 3.-3.j  6.+0.j]]

Changing the shape of an array


In [382]:
vec = np.arange(3)
print vec


[0 1 2]

In [384]:
# create a matrix with an artificial column dimension
vec_as_matrix = vec[:, np.newaxis]
print vec_as_matrix


[[0]
 [1]
 [2]]

In [385]:
# and back:
vec_back = vec_as_matrix.squeeze()
print vec_back


[0 1 2]

In [401]:
vec = np.arange(24, dtype=int)
print vec


[ 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23]
reshape changes the shape of an array without changing its data. One dimension can be given as "-1"; numpy then computes it from the total size and the other dimensions:

In [402]:
mat = vec.reshape(3, -1)
print mat


[[ 0  1  2  3  4  5  6  7]
 [ 8  9 10 11 12 13 14 15]
 [16 17 18 19 20 21 22 23]]

In [403]:
print mat.reshape(-1, 3)


[[ 0  1  2]
 [ 3  4  5]
 [ 6  7  8]
 [ 9 10 11]
 [12 13 14]
 [15 16 17]
 [18 19 20]
 [21 22 23]]
Even tensors:

In [404]:
print mat.reshape(3, 4, -1)


[[[ 0  1]
  [ 2  3]
  [ 4  5]
  [ 6  7]]

 [[ 8  9]
  [10 11]
  [12 13]
  [14 15]]

 [[16 17]
  [18 19]
  [20 21]
  [22 23]]]
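How the "-1" is resolved can be checked with the shape attribute. A minimal sketch with the same 24 elements as above; print is written with parentheses so the snippet also runs under Python 3:

```python
import numpy as np

vec = np.arange(24)

# -1 is replaced by whatever makes the total number of elements fit
print(vec.reshape(3, -1).shape)
print(vec.reshape(2, 3, -1).shape)

# 24 elements cannot be arranged in 5 rows, so numpy refuses
try:
    vec.reshape(5, -1)
    fits = True
except ValueError:
    fits = False
print(fits)
```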

Fast indexing


In [405]:
print mat


[[ 0  1  2  3  4  5  6  7]
 [ 8  9 10 11 12 13 14 15]
 [16 17 18 19 20 21 22 23]]

In [406]:
mat % 3 == 0


Out[406]:
array([[ True, False, False,  True, False, False,  True, False],
       [False,  True, False, False,  True, False, False,  True],
       [False, False,  True, False, False,  True, False, False]], dtype=bool)

In [407]:
mat[mat % 3 == 0] = 0
print mat


[[ 0  1  2  0  4  5  0  7]
 [ 8  0 10 11  0 13 14  0]
 [16 17  0 19 20  0 22 23]]
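A comparison such as "mat % 3 == 0" yields a boolean array, often called a mask, which can be used to select, count or overwrite elements. A minimal sketch on a small made-up array; print is written with parentheses so the snippet also runs under Python 3:

```python
import numpy as np

data = np.arange(10)
mask = data % 2 == 0          # one boolean per element

evens = data[mask]            # select the elements where mask is True
n_even = mask.sum()           # True counts as 1

clipped = data.copy()         # work on a copy, data itself stays unchanged
clipped[clipped > 6] = 6      # assign to the selected elements only

print(evens)
print(n_even)
print(clipped)
```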

Type coercion:


In [408]:
print mat.dtype
mat = mat.astype(float)
print mat.dtype
print mat


int64
float64
[[  0.   1.   2.   0.   4.   5.   0.   7.]
 [  8.   0.  10.  11.   0.  13.  14.   0.]
 [ 16.  17.   0.  19.  20.   0.  22.  23.]]

numpy File I/O

Binary files


In [409]:
mat.tofile("matrix.bin")
print mat.dtype


float64

In [410]:
print np.fromfile("matrix.bin", dtype=np.float64)


[  0.   1.   2.   0.   4.   5.   0.   7.   8.   0.  10.  11.   0.  13.  14.
   0.  16.  17.   0.  19.  20.   0.  22.  23.]

In [411]:
print np.fromfile("matrix.bin", dtype=np.float64).reshape(3, -1)


[[  0.   1.   2.   0.   4.   5.   0.   7.]
 [  8.   0.  10.  11.   0.  13.  14.   0.]
 [ 16.  17.   0.  19.  20.   0.  22.  23.]]

In [412]:
# reading with a different dtype misinterprets the stored bytes:
np.fromfile("matrix.bin", dtype=complex)


Out[412]:
array([  0. +1.j,   2. +0.j,   4. +5.j,   0. +7.j,   8. +0.j,  10.+11.j,
         0.+13.j,  14. +0.j,  16.+17.j,   0.+19.j,  20. +0.j,  22.+23.j])

In [414]:
np.fromfile("matrix.bin", dtype=int)


Out[414]:
array([                  0, 4607182418800017408, 4611686018427387904,
                         0, 4616189618054758400, 4617315517961601024,
                         0, 4619567317775286272, 4620693217682128896,
                         0, 4621819117588971520, 4622382067542392832,
                         0, 4623507967449235456, 4624070917402656768,
                         0, 4625196817309499392, 4625478292286210048,
                         0, 4626041242239631360, 4626322717216342016,
                         0, 4626885667169763328, 4627167142146473984])
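The outputs above show that tofile stores only the raw numbers; shape and dtype have to be known when reading the file back. One alternative, sketched here, is np.save / np.load, which store this information in the file as well. File names and the temporary directory are assumptions made to keep the example self-contained, and print is written with parentheses so the snippet also runs under Python 3:

```python
import os
import tempfile
import numpy as np

mat = np.arange(6, dtype=np.float64).reshape(2, 3)
folder = tempfile.mkdtemp()

# tofile writes the raw numbers only: the shape is lost
raw_path = os.path.join(folder, "matrix.bin")
mat.tofile(raw_path)
raw = np.fromfile(raw_path, dtype=np.float64)
print(raw.shape)

# np.save writes a small header with shape and dtype in front of the data
npy_path = os.path.join(folder, "matrix.npy")
np.save(npy_path, mat)
loaded = np.load(npy_path)
print(loaded.shape)
print(loaded.dtype)
```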

Alternative: txt file


In [415]:
np.savetxt("matrix.txt", mat)

In [416]:
np.loadtxt("matrix.txt")


Out[416]:
array([[  0.,   1.,   2.,   0.,   4.,   5.,   0.,   7.],
       [  8.,   0.,  10.,  11.,   0.,  13.,  14.,   0.],
       [ 16.,  17.,   0.,  19.,  20.,   0.,  22.,  23.]])

In [417]:
print open("matrix.txt").read()


0.000000000000000000e+00 1.000000000000000000e+00 2.000000000000000000e+00 0.000000000000000000e+00 4.000000000000000000e+00 5.000000000000000000e+00 0.000000000000000000e+00 7.000000000000000000e+00
8.000000000000000000e+00 0.000000000000000000e+00 1.000000000000000000e+01 1.100000000000000000e+01 0.000000000000000000e+00 1.300000000000000000e+01 1.400000000000000000e+01 0.000000000000000000e+00
1.600000000000000000e+01 1.700000000000000000e+01 0.000000000000000000e+00 1.900000000000000000e+01 2.000000000000000000e+01 0.000000000000000000e+00 2.200000000000000000e+01 2.300000000000000000e+01

Plotting

matplotlib is "the" Python library for plotting. pylab uses matplotlib and provides a Matlab-like interface:

In [422]:
# this is just for this presentation !
%matplotlib inline

In [423]:
import pylab
x = np.linspace(0, 2 * np.pi, 150)
y = np.sin(x) * np.cos(x*x+1)

pylab.plot(x, y, label="y")

y2 = y * np.sin(y)
pylab.plot(x, y2, label="y2")

pylab.legend()
pylab.show()


Final example: PCA (principal component analysis)


In [426]:
# 2 x 1000 matrix with normally distributed entries:
points = np.random.randn(2, 1000)

In [427]:
import pylab
pylab.plot(points[0,:], points[1,:], ".")
pylab.show()



In [428]:
deformation_matrix = np.array(((1.5, 1.0), (0.9, 1.0)))

In [429]:
x = np.dot(deformation_matrix, points)

In [430]:
import pylab
pylab.plot(x[0,:], x[1,:], ".")
pylab.show()



In [431]:
# np.linalg.eigh computes the eigenvalue decomposition of a Hermitian matrix
# np.linalg has others like
# np.linalg.solve for solving linear equation systems
# np.linalg.svd for singular value decomposition and others

# this is PCA, assuming the rows of x have mean zero
eigvals, eigvecs = np.linalg.eigh(np.dot(x, x.T))

In [432]:
# the column vectors describe the main directions in the data:
print eigvecs


[[ 0.59546539 -0.80338096]
 [-0.80338096 -0.59546539]]

In [433]:
# plot data again
pylab.plot(x[0,:], x[1,:], ".")

# plot lines describing directions of main variance (the columns of eigvecs):
pylab.plot([-3*eigvecs[0,0], 3*eigvecs[0,0]], [-3*eigvecs[1,0], 3*eigvecs[1,0]], 'g', linewidth=2)
pylab.plot([-3*eigvecs[0,1], 3*eigvecs[0,1]], [-3*eigvecs[1,1], 3*eigvecs[1,1]], 'g', linewidth=2)

pylab.show()
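The PCA steps can also be checked numerically, without plotting. The sketch below is a variation of the example, not the exact computation from the cells above: it centres the data explicitly and divides by the number of points, and the fixed random seed is only there for reproducibility. print is written with parentheses so the snippet also runs under Python 3:

```python
import numpy as np

np.random.seed(0)                       # fixed seed, only for reproducibility
points = np.random.randn(2, 1000)
deformation_matrix = np.array(((1.5, 1.0), (0.9, 1.0)))
x = np.dot(deformation_matrix, points)

# centre every row, so the data really has mean zero
centered = x - x.mean(axis=1)[:, np.newaxis]
cov = np.dot(centered, centered.T) / centered.shape[1]

# eigh returns the eigenvalues in ascending order,
# the matching eigenvectors are the columns of eigvecs
eigvals, eigvecs = np.linalg.eigh(cov)
main_direction = eigvecs[:, -1]         # direction of largest variance

# check the defining property  cov * v = lambda * v
is_eigenvector = np.allclose(np.dot(cov, main_direction),
                             eigvals[-1] * main_direction)
print(is_eigenvector)

# the variance of the data projected onto this direction is the eigenvalue
projected = np.dot(main_direction, centered)
print(np.allclose(projected.var(), eigvals[-1]))
```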


