This is a sample chapter from Learning IPython for Interactive Computing and Data Visualization, second edition.
If you don't know Python, read this section to learn the fundamentals. Python is a very accessible language and is even taught to school children. If you have ever programmed, it will only take you a few minutes to learn the basics.
Open a new notebook and type the following in the first cell:
In [1]:
print("Hello world!")
Out[1]:
TIP (Prompt string): Note that the convention chosen in this book is to show Python code (also called the input) prefixed with In [x]: (which shouldn't be typed). This is the standard IPython prompt. Here, you should just type print("Hello world!") and then press Shift-Enter.
Congratulations! You are now a Python programmer.
Let's use Python as a calculator.
In [2]:
2 * 2
Out[2]:
Here, 2 * 2 is an expression statement. This operation is performed, the result is returned, and IPython displays it in the notebook cell's output.
TIP (Division): In Python 3, 3 / 2 returns 1.5 (floating-point division), whereas it returns 1 in Python 2 (integer division). This can be a source of errors when porting Python 2 code to Python 3. It is recommended to always use the explicit 3.0 / 2.0 for floating-point division (by using floating-point numbers) and 3 // 2 for integer division. Both syntaxes work in Python 2 and Python 3. See http://python3porting.com/differences.html#integer-division for more details.
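As a small illustration of the division tip, the following sketch contrasts the two operators. It assumes Python 3, and the variable names are only illustrative.

```python
# Behavior shown here is for Python 3.
true_division = 3 / 2          # floating-point division
floor_division = 3 // 2        # integer (floor) division
explicit_float = 3.0 / 2.0     # floating-point division in Python 2 and 3
negative_floor = -3 // 2       # floor division rounds towards minus infinity

print(true_division, floor_division, explicit_float, negative_floor)
```

Note that // rounds down rather than towards zero, which matters for negative numbers.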
Other built-in mathematical operators include +, -, ** for exponentiation, and others. You will find more details at https://docs.python.org/3/reference/expressions.html#the-power-operator.
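Here is a short, illustrative sketch of these operators (the variable names are made up for the example). It also shows one detail from the linked documentation: the power operator binds more tightly than a unary minus on its left.

```python
total = 2 + 3                # addition
difference = 7 - 3           # subtraction
power = 2 ** 10              # exponentiation
negated_square = -2 ** 2     # parsed as -(2 ** 2)
squared_negative = (-2) ** 2 # parentheses change the result

print(total, difference, power, negated_square, squared_negative)
```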
Variables form a fundamental concept of any programming language. A variable has a name and a value. Here is how to create a new variable in Python:
In [3]:
a = 2
And here is how to use an existing variable:
In [4]:
a * 3
Out[4]:
Several variables can be defined at once (this is called unpacking):
In [5]:
a, b = 2, 6
There are different types of variables. Here, we have used a number (more precisely, an integer). Other important types include floating-point numbers to represent real numbers, strings to represent text, and booleans to represent True/False values. Here are a few examples:
In [6]:
somefloat = 3.1415
sometext = 'pi is about' # You can also use double quotes.
print(sometext, somefloat) # Display several variables.
Out[6]:
Note how we used the # character to write comments. Whereas Python discards the comments completely, adding comments in the code is important when the code is to be read by other humans (including yourself in the future).
String escaping refers to the ability to insert special characters in a string. For example, how can you insert ' and ", given that these characters are used to delimit a string in Python code? The backslash \ is the go-to escape character in Python (and in many other languages too). Here are a few examples:
In [7]:
print("Hello \"world\"")
print("A list:\n* item 1\n* item 2")
print("C:\\path\\on\\windows")
print(r"C:\path\on\windows")
Out[7]:
The special character \n is the new line (or line feed) character. To insert a backslash, you need to escape it, which explains why it needs to be doubled as \\.
You can also disable escaping by using raw literals with an r prefix before the string, as in the last example above. In this case, backslashes are treated as normal characters.
This is convenient when writing Windows paths, since Windows uses backslash separators instead of forward slashes like on Unix systems. A very common error on Windows is forgetting to escape backslashes in paths: writing "C:\path" may lead to subtle errors.
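To see how such an error can arise, consider a hypothetical path that contains \t, which Python reads as a tab character in a normal string literal. This is only a sketch with made-up paths:

```python
# "\t" inside a normal string literal is read as a single tab character.
broken_path = "C:\temp"
escaped_path = "C:\\temp"   # backslash escaped explicitly
raw_path = r"C:\temp"       # raw literal: backslashes are kept as-is

print(len(broken_path), len(escaped_path), len(raw_path))
print(escaped_path == raw_path)
```

The broken string is one character shorter than the other two, because the backslash and the t were merged into a tab.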
You will find the list of special characters in Python at https://docs.python.org/3.4/reference/lexical_analysis.html#string-and-bytes-literals
A list contains a sequence of items. You can concisely instruct Python to perform repeated actions on the elements of a list. Let's first create a list of numbers:
In [8]:
items = [1, 3, 0, 4, 1]
Note the syntax we used to create the list: square brackets [], and commas , to separate the items.
The built-in function len() returns the number of elements in a list:
In [9]:
len(items)
Out[9]:
INFO (Built-in functions): Python comes with a set of built-in functions, including print(), len(), max(), functional routines like filter() and map(), and container-related routines like all(), any(), range() and sorted(). You will find the full list of built-in functions at https://docs.python.org/3.4/library/functions.html.
Now, let's compute the sum of all elements in the list. Python provides a built-in function for this:
In [10]:
sum(items)
Out[10]:
We can also access individual elements in the list, using the following syntax:
In [11]:
items[0]
Out[11]:
In [12]:
items[-1]
Out[12]:
Note that indexing starts at 0 in Python: the first element of the list is indexed by 0, the second by 1, and so on. Also, -1 refers to the last element, -2 to the penultimate element, and so on.
The same syntax can be used to alter elements in the list:
In [13]:
items[1] = 9
items
Out[13]:
We can access sublists with the following syntax:
In [14]:
items[1:3]
Out[14]:
Here, 1:3 represents a slice going from element 1 included (this is the second element of the list) to element 3 excluded. Thus, we get a sublist with the second and third elements of the original list. The first-included/last-excluded asymmetry leads to an intuitive treatment of overlaps between consecutive slices. Also, note that slicing a list creates a new list (a shallow copy of the selected elements), not a dynamic view of the original; changing elements in the sublist does not change them in the original list.
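The following sketch checks how slices of a built-in list behave. It reuses the values that items holds at this point in the section; head, tail, and sub are illustrative names.

```python
items = [1, 9, 0, 4, 1]

# Consecutive slices share their boundary index and never overlap.
head = items[:2]     # elements 0 and 1
tail = items[2:]     # elements 2, 3 and 4
print(head + tail == items)

# A slice of a list is a new list: changing it leaves the original untouched.
sub = items[1:3]
sub[0] = 100
print(sub)
print(items)
```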
Python provides several other types of containers:
Tuples are immutable and contain a fixed number of elements:
In [15]:
my_tuple = (1, 2, 3)
my_tuple[1]
Out[15]:
Dictionaries contain key-value pairs. They are extremely useful and common:
In [16]:
my_dict = {'a': 1, 'b': 2, 'c': 3}
print('a:', my_dict['a'])
Out[16]:
In [17]:
print(my_dict.keys())
Out[17]:
Before Python 3.7, there was no guaranteed notion of order in a dictionary; since Python 3.7, dictionaries preserve insertion order. The native collections module also provides an OrderedDict structure that keeps the insertion order (see https://docs.python.org/3.4/library/collections.html).
Sets, like mathematical sets, contain distinct elements:
In [18]:
my_set = set([1, 2, 3, 2, 1])
my_set
Out[18]:
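To make the remark about dictionary order concrete, here is a minimal sketch using OrderedDict from the collections module. The keys and values are arbitrary.

```python
from collections import OrderedDict

ordered = OrderedDict()
ordered['b'] = 2
ordered['a'] = 1
ordered['c'] = 3

# Keys come back in the order in which they were inserted.
keys_in_order = list(ordered.keys())
print(keys_in_order)
```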
INFO (Mutable and immutable objects): A Python object is mutable if its value can change after it has been created. Otherwise, it is immutable. For example, a string is immutable; to change it, a new string needs to be created. A list, a dictionary, or a set is mutable; elements can be added or removed. By contrast, a tuple is immutable, and it is not possible to change the elements it contains without recreating the tuple. See https://docs.python.org/3.4/reference/datamodel.html for more details.
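The distinction can be checked with a few lines of code. This sketch uses a try/except block, a construct not covered in this section, only to show the error raised when attempting to modify a tuple.

```python
text = 'hello'
upper_text = text.upper()   # returns a new string; text is unchanged

numbers = [1, 2]
numbers.append(3)           # the list is modified in place

point = (1, 2)
try:
    point[0] = 10           # tuples do not support item assignment
except TypeError as error:
    print("Tuples are immutable:", error)

print(text, upper_text, numbers, point)
```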
We can run through all elements of a list using a for loop:
In [19]:
for item in items:
    print(item)
Out[19]:
There are several things to note here:
The for item in items syntax means that a temporary variable named item is created at every iteration. This variable contains the value of every item in the list, one at a time.
Note the colon : at the end of the for statement. Forgetting it will lead to a syntax error!
The statement print(item) will be executed for all items in the list.
Note the spaces before print: this is called the indentation. You will find more details about indentation in the next subsection.
Python supports a concise syntax to perform a given operation on all elements of a list:
In [20]:
squares = [item * item for item in items]
squares
Out[20]:
This is called a list comprehension. A new list is created here; it contains the squares of all numbers in the list. This concise syntax leads to highly readable and Pythonic code.
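For comparison, here is a sketch of the same computation written as an explicit for loop. The name squares_loop is introduced only for this example, and items holds the values it has at this point in the section.

```python
items = [1, 9, 0, 4, 1]

# Explicit loop: build the result one element at a time.
squares_loop = []
for item in items:
    squares_loop.append(item * item)

# List comprehension: same result in a single expression.
squares = [item * item for item in items]

print(squares_loop == squares)
```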
Indentation refers to the spaces that may appear at the beginning of some lines of code. This is a particular aspect of Python's syntax.
In most programming languages, indentation is optional and is generally used to make the code visually clearer. But in Python, indentation also has a syntactic meaning. Particular indentation rules need to be followed for Python code to be correct.
In general, there are two ways to indent some text: by inserting a tab character (also referred to as \t), or by inserting a number of spaces (typically, four). It is recommended to use spaces instead of tab characters. Your text editor should be configured such that the Tab key on the keyboard inserts four spaces instead of a tab character.
In the Notebook, indentation is automatically configured properly; so you shouldn't worry about this issue. The question only arises if you use another text editor for your Python code.
Finally, what is the meaning of indentation? In Python, indentation delimits coherent blocks of code, for example, the contents of a loop, a conditional branch, a function, and other objects. Where other languages such as C or JavaScript use curly braces to delimit such blocks, Python uses indentation.
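A minimal sketch of how indentation decides which statements belong to a block (the variable names are illustrative):

```python
total = 0
for value in [1, 2, 3]:
    total += value                    # indented: runs at every iteration
    print("Running total:", total)    # indented: also part of the loop
print("Final total:", total)          # not indented: runs once, after the loop
```

Removing the indentation of the second print() call would move it out of the loop, so it would run only once.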
Sometimes, you need to perform different operations on your data depending on some condition. For example, let's display all even numbers in our list:
In [21]:
for item in items:
    if item % 2 == 0:
        print(item)
Out[21]:
Again, here are several things to note:
The if statement is followed by a boolean expression.
If a and b are two integers, the modulo operator a % b returns the remainder from the division of a by b. Here, item % 2 is 0 for even numbers, and 1 for odd numbers.
Equality is tested with a double equal sign == to avoid confusion with the assignment operator = that we use when we create variables.
Like the for loop, the if statement ends with a colon :.
The code that is executed when the condition is satisfied follows the if statement. It is indented. Indentation is cumulative: since this if is inside a for loop, there are eight spaces before the print(item) statement.
Python supports a concise syntax to select all elements in a list that satisfy certain properties. Here is how to create a sublist with only even numbers:
In [22]:
even = [item for item in items if item % 2 == 0]
even
Out[22]:
This is also a form of list comprehension.
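As a sketch, the filtering comprehension is equivalent to a loop containing an if statement, and it can be combined with a transformation of each selected element. The names even_loop and even_squares are introduced only for this example.

```python
items = [1, 9, 0, 4, 1]

# Loop-and-condition version of the filtering comprehension.
even_loop = []
for item in items:
    if item % 2 == 0:
        even_loop.append(item)

# Filtering only, then filtering combined with squaring.
even = [item for item in items if item % 2 == 0]
even_squares = [item * item for item in items if item % 2 == 0]

print(even_loop, even, even_squares)
```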
Code is typically organized into functions. A function encapsulates part of your code. Functions allow you to reuse bits of functionality without copy-pasting the code. Here is a function that tells whether an integer number is even or not:
In [23]:
def is_even(number):
    """Return whether an integer is even or not."""
    return number % 2 == 0
There are several things to note here:
A function is defined with the def keyword.
After def comes the function name. A general convention in Python is to only use lowercase characters, and separate words with an underscore _. A function name generally starts with a verb.
The function name is followed by parentheses containing the function arguments. Here, there is a single argument, named number.
The body of the function is indented (and note the colon : at the end of the def statement).
There is a docstring wrapped in triple quotes """. This is a particular form of comment that explains what the function does. It is not mandatory, but it is strongly recommended to write docstrings for the functions exposed to the user.
The return keyword in the body of the function specifies the output of the function. Here, the output is a Boolean, obtained from the expression number % 2 == 0. It is possible to return several values; just use a comma to separate them (in this case, a tuple is returned).
Once a function is defined, it can be called like this:
In [24]:
is_even(3)
Out[24]:
In [25]:
is_even(4)
Out[25]:
Here, 3 and 4 are successively passed as arguments to the function.
A Python function can accept an arbitrary number of arguments, called positional arguments. It can also accept optional named arguments, called keyword arguments. Here is an example:
In [26]:
def remainder(number, divisor=2):
    return number % divisor
The second argument of this function, divisor, is optional. If it is not provided by the caller, it will default to the number 2, as shown here:
In [27]:
remainder(5)
Out[27]:
There are two equivalent ways of specifying a keyword argument when calling a function:
In [28]:
remainder(5, 3)
Out[28]:
In [29]:
remainder(5, divisor=3)
Out[29]:
In the first case, 3 is understood as the second argument, divisor. In the second case, the name of the argument is given explicitly by the caller. This second syntax is clearer and less error-prone than the first one.
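The calling styles can be compared side by side in the sketch below. The last call, which names both arguments and gives them in a different order, is an extra illustration that does not appear in the cells above.

```python
def remainder(number, divisor=2):
    return number % divisor

default_case = remainder(5)                      # divisor defaults to 2
positional_case = remainder(5, 3)                # 3 is matched by position
keyword_case = remainder(5, divisor=3)           # 3 is matched by name
reordered_case = remainder(divisor=3, number=5)  # order is free when naming

print(default_case, positional_case, keyword_case, reordered_case)
```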
Functions can also accept arbitrary sets of positional and keyword arguments, using the following syntax:
In [30]:
def f(*args, **kwargs):
    print("Positional arguments:", args)
    print("Keyword arguments:", kwargs)
In [31]:
f(1, 2, c=3, d=4)
Out[31]:
Inside the function, args is a tuple containing positional arguments, and kwargs is a dictionary containing keyword arguments.
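The same star syntax can be used on the calling side to unpack an existing list and dictionary into arguments. In this sketch, collect() is a hypothetical helper that returns what it receives so that the result can be inspected.

```python
def collect(*args, **kwargs):
    # Hypothetical helper: return the received arguments unchanged.
    return args, kwargs

positional = [1, 2]
named = {'c': 3, 'd': 4}

# Equivalent to calling collect(1, 2, c=3, d=4).
received_args, received_kwargs = collect(*positional, **named)
print(received_args, received_kwargs)
```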
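When passing a parameter to a Python function, a reference to the object is actually passed (passage by assignment). If the passed object is mutable, the function can modify it in place; if it is immutable, the function cannot modify it.
Here is an example: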
In [32]:
my_list = [1, 2]
def add(some_list, value):
    some_list.append(value)
add(my_list, 3)
my_list
Out[32]:
The function add() modifies an object defined outside it (in this case, the object my_list); we say this function has side-effects. A function with no side-effects is called a pure function: it doesn't modify anything in the outer context, and it deterministically returns the same result for any given set of inputs. Pure functions are to be preferred over functions with side-effects.
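One possible pure counterpart of add() is sketched below. The name add_pure is made up for this example; it builds and returns a new list instead of modifying its argument.

```python
def add_pure(some_list, value):
    # Concatenation creates a new list; some_list is left untouched.
    return some_list + [value]

my_list = [1, 2]
new_list = add_pure(my_list, 3)
print(my_list, new_list)
```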
Knowing this can help you spot subtle bugs. There are further related concepts that are useful to know, including function scopes, naming, binding, and more.
Let's discuss errors in Python. As you learn, you will inevitably come across errors and exceptions. The Python interpreter will most of the time tell you what the problem is, and where it occurred. It is important to understand the vocabulary used by Python so that you can more quickly find and correct your errors.
Let's see an example:
In [33]:
def divide(a, b):
    return a / b
In [34]:
divide(1, 0)
Out[34]:
Here, we defined a divide() function, and called it to divide 1 by 0. Dividing a number by 0 is an error in Python. Here, a ZeroDivisionError exception was raised. An exception is a particular type of error that can be raised at any point in a program. It is propagated from the innards of the code up to the command that launched the code. It can be caught and processed at any point. You will find more details about exceptions at https://docs.python.org/3/tutorial/errors.html, and common exception types at https://docs.python.org/3/library/exceptions.html#bltin-exceptions.
The error message you see contains the stack trace and the exception's type and message. The stack trace shows all function calls between the raised exception and the script calling point.
The top frame, indicated by the first arrow ---->, shows the entry point of the code execution. Here, it is divide(1, 0) which was called directly in the Notebook. The error occurred while this function was called.
The next and last frame is indicated by the second arrow. It corresponds to line 2 in our function divide(a, b). It is the last frame in the stack trace: this means that the error occurred there.
We will see later in this chapter how to debug such errors interactively in IPython and in the Jupyter Notebook. Knowing how to navigate up and down in the stack trace is critical when debugging complex Python code.
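Since an exception can be caught and processed, here is a minimal sketch of how that might look with a try/except block, a construct described in the Python tutorial on errors. The wrapper safe_divide() is a hypothetical helper, and returning None on failure is just one possible choice.

```python
def divide(a, b):
    return a / b

def safe_divide(a, b):
    # Hypothetical wrapper: catch the exception raised inside divide().
    try:
        return divide(a, b)
    except ZeroDivisionError as error:
        print("Cannot divide:", error)
        return None

print(safe_divide(6, 3))
print(safe_divide(1, 0))
```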
Object-oriented programming (or OOP) is a relatively advanced topic. Although we won't use it much in this book, it is useful to know the basics. Also, mastering OOP is often essential when you start to have a large code base.
In Python, everything is an object. Numbers, strings, and functions are all objects. An object is an instance of a type (also known as a class). An object has attributes and methods, as specified by its type. An attribute is a variable bound to an object, giving some information about it. A method is a function that applies to the object.
For example, the object 'hello' is an instance of the built-in str type (string). The type() function returns the type of an object, as shown here:
In [35]:
type('hello')
Out[35]:
There are native types, like str or int (integer), and custom types, also called classes, that can be created by the user.
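As an illustration of a custom type, the following sketch defines a small hypothetical class named Tally with one attribute and one method. It is only meant to show the vocabulary introduced above, not a recommended design.

```python
class Tally:
    """Hypothetical example class that counts events."""

    def __init__(self, start=0):
        self.value = start      # attribute: data bound to the object

    def increment(self):        # method: function that applies to the object
        self.value += 1
        return self.value

tally = Tally()
tally.increment()
print(type(tally), tally.value)
```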
In IPython, you can discover the attributes and methods of any object with the dot syntax and tab completion. For example, typing 'hello'.u and pressing Tab automatically shows us the existence of the upper() method:
In [36]:
'hello'.upper()
Out[36]:
Here, upper() is a method available to all str objects; it returns an uppercase copy of a string.
A useful string method is format(). This simple and convenient templating system lets you generate strings dynamically:
In [37]:
'Hello {0:s}!'.format('Python')
Out[37]:
The {0:s} syntax means "replace this with the first argument of format(), which should be a string". The variable type after the colon is especially useful for numbers, where you can specify how to display the number (for example, .3f to display three decimals). The 0 makes it possible to replace a given value several times in a given string. You can also use a name instead of a position, for example 'Hello {name}!'.format(name='Python').
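The following sketch exercises the three features just described: a numeric format, a repeated positional field, and a named field. The strings and values are arbitrary.

```python
three_decimals = 'pi is about {0:.3f}'.format(3.14159265)
repeated = '{0}, {0}, and {0} again'.format('spam')
named = 'Hello {name}!'.format(name='Python')

print(three_decimals)
print(repeated)
print(named)
```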
Some methods are prefixed with an underscore _; they are private and are generally not meant to be used directly. IPython's tab completion won't show you these private attributes and methods unless you explicitly type _ before pressing Tab.
In practice, the most important thing to remember is that appending a dot . to any Python object and pressing Tab in IPython will show you a lot of functionality pertaining to that object.
Python is a multi-paradigm language; it notably supports imperative, object-oriented, and functional programming models. Python functions are objects and can be handled like other objects. In particular, they can be passed as arguments to other functions (also called higher-order functions). This is the essence of functional programming.
Decorators provide a convenient syntax construct to define higher-order functions. Here is an example using the is_even() function from the previous Functions section:
In [38]:
def show_output(func):
    def wrapped(*args, **kwargs):
        output = func(*args, **kwargs)
        print("The result is:", output)
    return wrapped
The show_output() function transforms an arbitrary function func() to a new function, named wrapped(), that displays the result of the function:
In [39]:
f = show_output(is_even)
f(3)
Out[39]:
Equivalently, this higher-order function can also be used with a decorator:
In [40]:
@show_output
def square(x):
    return x * x
In [41]:
square(3)
Out[41]:
You can find more information about Python decorators at https://en.wikipedia.org/wiki/Python_syntax_and_semantics#Decorators and at http://thecodeship.com/patterns/guide-to-python-function-decorators/.
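As a variation on the decorator above (not the version used in this chapter), the sketch below also returns the wrapped function's result and uses functools.wraps from the standard library so that the decorated function keeps its original name and docstring.

```python
import functools

def show_output(func):
    @functools.wraps(func)          # copy func's name and docstring to wrapped
    def wrapped(*args, **kwargs):
        output = func(*args, **kwargs)
        print("The result is:", output)
        return output               # variation: pass the result back to the caller
    return wrapped

@show_output
def square(x):
    """Return the square of x."""
    return x * x

result = square(3)
print(square.__name__)
```

Returning the output lets the decorated function still be used inside larger expressions, which the printing-only version does not allow.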
Let's finish this section with a few notes about Python 2 and Python 3 compatibility issues.
There is still some Python 2 code, and some libraries, that are not compatible with Python 3. Therefore, it is sometimes useful to be aware of the differences between the two versions. One of the most obvious differences is that print is a statement in Python 2, whereas it is a function in Python 3. Therefore, print "Hello" (without parentheses) works in Python 2 but not in Python 3, while print("Hello") works in both Python 2 and Python 3.
There are several non-mutually exclusive options to write portable code that works with both versions:
You now know the fundamentals of Python, the bare minimum that you will need in this book. As you can imagine, there is much more to say about Python.
There are a few further basic concepts that are often useful and that we cannot cover here, unfortunately. You are highly encouraged to have a look at them in the references given at the end of this section:
range and enumerate
pass, break, and continue, to be used in loops
Here are some slightly more advanced concepts that you might find useful if you want to strengthen your Python skills:
with statements for safely handling contexts
pickle module for persisting Python objects on disk and exchanging them across a network
Finally, here are a few references: