The basic unit of code organization in Python is the function. Don't write classes until you really have to (see this for more details). As you've already seen, functions are defined with the def keyword:
In [ ]:
def a_func(x, y, z=1.5):
    if z > 1:
        return z * (x + y)
    else:
        return z / (x + y)
As you can see, multiple return statements are just fine. If the end of a function is reached, None is returned.
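A minimal sketch of the implicit None return. The function name no_return is made up for this illustration:

```python
def no_return(x):
    # No return statement: the end of the function body is simply reached
    x + 1

result = no_return(1)
print(result is None)
```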
Each function can have positional and keyword arguments. Keyword arguments are mostly used to specify default values and optional values. When calling a function, keyword arguments can be given in arbitrary order, and they can also be passed positionally.
In [ ]:
a_func(3, 4)
In [ ]:
a_func(3, 4, 2)
In [ ]:
a_func(3, 4, z=12)
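The calls above do not yet show the "arbitrary order" part. As a small sketch (a_func is repeated here so the cell runs on its own), all three calls below bind the same values to x, y and z:

```python
def a_func(x, y, z=1.5):
    if z > 1:
        return z * (x + y)
    else:
        return z / (x + y)

# All three calls bind x=3, y=4, z=2
first = a_func(3, 4, 2)          # z passed positionally
second = a_func(3, 4, z=2)       # z passed by keyword
third = a_func(z=2, y=4, x=3)    # everything by keyword, in arbitrary order
print(first, second, third)
```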
Keyword arguments must follow positional arguments. The following call is a SyntaxError:
In [ ]:
a_func(z=12, 3, 4)
From the point of view of a function, there are two different scopes which can contain variables: global and local.
Variables that are assigned within a function are local variables. Their namespace is created when the function is called and destroyed when the function returns.
In [ ]:
def func():
    inner = []
    for i in range(5):
        inner.append(i)
In [ ]:
func()
inner
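Here, inner is a local variable. That means inner is created when the function is entered, 5 elements are added, and then it is destroyed when the function returns. It can't be accessed from the outside. A variable defined in the global scope, on the other hand, can be used inside a function: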
In [ ]:
a = []
def func():
    for i in range(5):
        a.append(i)
In [ ]:
func()
a
But what happens when we have conflicting variable names? What do you think the value of a will be after func is called?
In [ ]:
a = []
def func():
    a = [1, 2]
    for i in range(5):
        a.append(i)
In [ ]:
func()
a
To make Python do what we intended, we need to use the global keyword.
In [ ]:
a = []
def func():
    global a
    a = [1, 2]
    for i in range(5):
        a.append(i)
In [ ]:
func()
a
In [ ]:
Image('images/mem3.jpg')
Functions can be defined anywhere, even inside other functions.
In [ ]:
def f(x):
    def g(x):
        return x + 1
    return g(x)**2
print(f(4))
g(3)
In [ ]:
def f():
    return 1, 5, 3
a, b, c = f()
print(a, b, c)
But are these really three separate values? Of which type is a comma-separated list of values?
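One way to check this yourself is to look at the returned object before unpacking it. A small sketch, with f repeated so the cell runs on its own:

```python
def f():
    return 1, 5, 3

result = f()
print(type(result))   # the three values travel as one object
a, b, c = result      # unpacking splits that object into three names
print(a, b, c)
```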
In [ ]:
def f(text, g):
    return g(text)
In [ ]:
f('foo', str.upper)
Don't be scared by the name lambda. It refers to the lambda calculus by Alonzo Church. It really just means "function without a name". The syntax looks like this:
In [ ]:
lambda x: x ** 2
A lambda function can be assigned to a variable which makes this variable callable.
In [ ]:
f = lambda x: x ** 2
f(2)
These functions often come in handy in data mining, since many data transformation functions take functions as arguments. A typical example is the built-in sorted function.
In [ ]:
sorted??
Say we want to sort the following data by the second value of each tuple.
In [ ]:
data = [(4, 0), (1, 4), (3, 9), (2, 1)]
In [ ]:
sorted(data)
In [ ]:
sorted(data, key=lambda x: x[1])
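The lambda is just one way to supply the key. As a sketch, the same sort can be written with a named function (second_value is a made-up helper for this illustration) or with itemgetter from the standard library's operator module:

```python
from operator import itemgetter

data = [(4, 0), (1, 4), (3, 9), (2, 1)]

def second_value(pair):
    # Named equivalent of: lambda x: x[1]
    return pair[1]

by_lambda = sorted(data, key=lambda x: x[1])
by_function = sorted(data, key=second_value)
by_itemgetter = sorted(data, key=itemgetter(1))
print(by_lambda)
print(by_lambda == by_function == by_itemgetter)
```

Which variant to pick is mostly a matter of taste; the lambda keeps short keys close to the call.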
In [ ]:
def make_closure(a):
    def closure():
        print('I know the secret: %d' % a)
    return closure
In [ ]:
closure = make_closure(5)
closure()
What has just happened?
The function make_closure got called. During its execution, the function closure got defined and was finally returned to the caller (note: the function itself is returned!).
The closure function accesses the scope of make_closure, which has already finished executing, and receives the value 5 from it.
Before talking about generators, we need to learn more about iterators.
One of the reasons for Python's popularity is its unified way of iterating over sequences. This works not just for tuples, dicts and other built-ins but also for custom types. Objects that implement the __iter__() method are automatically iterable and thereby compatible with the vast Python software stack. To give you an idea of the iterator protocol, see the following.
In [ ]:
a = [1, 2, 3]
In [ ]:
iter_a = iter(a)
iter_a
In [ ]:
next(iter_a)
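To illustrate the point about custom objects, here is a minimal sketch of a class that becomes iterable just by implementing __iter__(). The class Countdown is hypothetical and only serves this example:

```python
class Countdown:
    """Hypothetical example class that counts down from start to 1."""

    def __init__(self, start):
        self.start = start

    def __iter__(self):
        # Returning a fresh iterator lets the object be looped over repeatedly
        return iter(range(self.start, 0, -1))

countdown = Countdown(3)
print(list(countdown))

it = iter(countdown)
print(next(it), next(it), next(it))
try:
    next(it)
except StopIteration:
    # An exhausted iterator signals its end by raising StopIteration
    print('exhausted')
```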
A generator is a way to construct an iterable object. Whereas normal functions execute and return a single value, generators return a sequence of values lazily, pausing after each one until the next one is requested.
Generators are written like normal functions, except that they use the yield statement rather than the return statement.
In [ ]:
def squares(upper_bound=10):
    for i in range(upper_bound):
        yield i**2
In [ ]:
my_squares = squares()
In [ ]:
next(my_squares)
In [ ]:
list(my_squares)
In contrast to a normal function, the execution of a generator is paused at the yield statement and the corresponding value is returned. Execution continues as soon as next() is called on the generator. That means values are generated lazily, which defers computation and saves memory.
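The pausing can be made visible by recording when each value is actually computed. This is only a sketch; traced_squares and the log list are made up for this illustration:

```python
def traced_squares(upper_bound, log):
    for i in range(upper_bound):
        # Runs only when a value is requested
        log.append('computing %d' % i)
        yield i ** 2

log = []
gen = traced_squares(3, log)
log_before = list(log)        # snapshot: nothing has run yet
first = next(gen)
log_after_first = list(log)   # snapshot: only the first value was computed
rest = list(gen)              # drains the remaining values

print(log_before)
print(first, log_after_first)
print(rest, log)
```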
But it wouldn't be Python if there weren't a shorthand notation for generators.
The syntax for this shorthand is borrowed from list comprehensions.
In [ ]:
generator = (i**2 for i in range(10))
generator
In [ ]:
next(generator)
In [ ]:
list(generator)
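Two things are worth trying here, shown as a short sketch: a generator expression can be passed directly to a function such as sum, and a generator can only be consumed once.

```python
# When the generator expression is the only argument,
# the extra pair of parentheses can be dropped
total = sum(i**2 for i in range(10))
print(total)

generator = (i**2 for i in range(10))
first_pass = list(generator)
second_pass = list(generator)  # the generator is already exhausted
print(first_pass)
print(second_pass)
```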
In [ ]:
class Rectangle(object):
    def __init__(self, x, y):
        self.x = x
        self.y = y
        self._PI = 3.145
    def area(self):
        return self.x * self.y
    def perimeter(self):
        return 2 * self.x + 2 * self.y
    def _bogus_area(self):
        return self.x * self.x
my_rec = Rectangle(2, 4)
my_rec.area(), my_rec.perimeter()
Note that each method is passed a self object. This is similar to Java's this. The only difference is that it is mentioned explicitly.
The __init__ method behaves like a constructor in Java, but actually it isn't. Why?
Methods and members whose names start with a _ are hidden from other objects by convention (similar to the private modifier in Java). They don't show up when investigating the object.
In [ ]:
my_rec.<TAB>
But objects can be investigated more deeply via their default __dir__ method and __dict__ attribute:
In [ ]:
my_rec.__dict__
In [ ]:
my_rec.__dir__()
These hidden properties can be used!
In [ ]:
my_rec._PI
In [ ]:
my_rec._bogus_area()
The language itself does not enforce any access restrictions. When Guido van Rossum was asked whether this is a flaw in the language design he replied: "Python is a language for adults. Don't do stupid things, behave like a grownup".
Put differently: "Once you fiddle with those things, you are most likely doing it wrong."
In [ ]:
fh = open('material/foo.py')
print(fh.read())
fh.close()
But opening or reading a file can cause errors.
In [ ]:
fh = open('material/fo0.py')
print(fh.read())
fh.close()
This potentially leaves us with an unclosed file.
Let's catch that error instead.
In [ ]:
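fh = None
try:
    fh = open('fo0.py')
    print(fh.read())
except FileNotFoundError as e:
    # Handle the missing file instead of crashing
    print('Could not open file:', e)
finally:
    # Only close the file if it was actually opened
    if fh is not None:
        fh.close()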
But there is a more Pythonic way, a context manager:
In [ ]:
with open('fo0.py') as fh:
    print(fh.read())
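What the context manager buys us is that the file gets closed even when something goes wrong inside the block. A minimal sketch, using a throwaway temporary file (an assumption of this example, so that it does not depend on the files in material/):

```python
import os
import tempfile

# Create a throwaway file so the example runs anywhere
tmp = tempfile.NamedTemporaryFile('w', suffix='.py', delete=False)
tmp.write('print("hello")\n')
tmp.close()

try:
    with open(tmp.name) as fh:
        content = fh.read()
        1 / 0  # simulate an error inside the block
except ZeroDivisionError:
    pass

print(fh.closed)  # the file was closed although an error occurred
os.remove(tmp.name)
```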
The os.path module has a lot of useful tools to deal with files and directories:
In [ ]:
from os import path
path.<TAB>
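A few of the os.path functions you will probably reach for first, as a small sketch. The file name is just the one used earlier; whether it exists depends on your working directory:

```python
from os import path

file_name = 'material/foo.py'
directory = path.dirname(file_name)      # everything before the last separator
base = path.basename(file_name)          # the last path component
stem, extension = path.splitext(base)    # split off the file extension
print(directory, base, stem, extension)

print(path.exists(file_name))              # depends on your working directory
print(path.join('material', 'data.json'))  # separator depends on the OS
```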
os.walk uses the force to accomplish its goals!
In [ ]:
import os
foo = os.walk('/')
In [ ]:
next(foo)
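Each item produced by os.walk is a tuple of the current directory, its subdirectories and its files. The following sketch walks a small temporary directory tree (created only for this example) instead of '/' and collects all file paths:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as root:
    # Build a tiny tree: root/a.txt and root/sub/b.txt
    os.mkdir(os.path.join(root, 'sub'))
    for name in ('a.txt', os.path.join('sub', 'b.txt')):
        with open(os.path.join(root, name), 'w') as fh:
            fh.write('x')

    found = []
    for dirpath, dirnames, filenames in os.walk(root):
        for filename in filenames:
            full_path = os.path.join(dirpath, filename)
            found.append(os.path.relpath(full_path, root))

print(sorted(found))
```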
In [ ]:
from IPython.display import Image, display
with open('images/the_force.gif', 'rb') as f:
    display(Image(f.read()), format='png')
In [ ]:
Image('images/json.jpg')
There are only four functions you need to know about: dump, dumps, load and loads.
In [ ]:
import json
json.<TAB>
In [ ]:
data = {
    'numbers': [1, 2, 3, 4, 5],
    'tuple': (12, 14, 22),
    'letters': 'abcdef',
    'embedded dict': {
        'more': 'date'
    }
}
with open('material/data.json', 'w') as fh:
    json.dump(data, fh)
In [ ]:
!cat material/data.json
In [ ]:
old_data = None
with open('material/data.json') as fh:
    old_data = json.load(fh)
old_data
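The other two functions, dumps and loads, work on strings instead of files. The sketch below uses a shortened version of the data above and also shows that the round trip is not perfectly lossless, because JSON has no tuple type:

```python
import json

data = {'numbers': [1, 2, 3], 'tuple': (12, 14, 22)}

text = json.dumps(data)       # Python object -> JSON string
restored = json.loads(text)   # JSON string -> Python object

print(text)
print(restored['tuple'])      # the tuple comes back as a list
print(restored == data)
```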
In [ ]:
Image('images/mem2.jpg')