[Data, the Humanist's New Best Friend](index.ipynb)
*Class 02*

In this class you are expected to learn:

  1. Python syntax
  2. Variables and values
  3. Statements and expressions
  4. Creating and using functions

The Magic of the Tab (the character, not the drink!)


*Not this tab we are talkin'bout*

One of the first things you need to know is that Python is an indentation-based language. Unlike languages such as C or Java, Python uses visual indentation to group statements into a block. The most common indentation width is 4 spaces.

Let's see an example of an if/else statement in a couple of popular programming languages.

if(!value){console.log("No value");}else{console.log(value);}

The code above is JavaScript, which does not impose rules on indentation. Instead, blocks of statements are enclosed by curly braces, { and }. Therefore, the same code can be written in many different ways.

if(!value) {
console.log("No value");
} else {
console.log(value);
}

Or even

if (!value)
{
    console.log("No value");
}
else
{
    console.log(value);
}

However, the (almost) only way to write the same logic in Python is like this:

if value is None:
    print("No value")
else:
    print(value)

And that is one of the principles of Python:

There should be one-- and preferably only one --obvious way to do it.

At the beginning, the indented syntax can feel like a burden, but eventually you will learn to love it and start to wonder why other languages look so awful.
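As a small sketch of how indentation alone decides which statements belong to a block, consider the hypothetical function below (the name `describe` is made up for illustration). The two indented assignments belong to the `if` and `else` branches, while the `return` line, indented one level less, runs whichever branch was taken.

```python
def describe(value):
    """Return a printable label for value (hypothetical helper)."""
    if value is None:
        # Indented under "if": runs only when value is None
        label = "No value"
    else:
        # Indented under "else": runs for every other value
        label = str(value)
    # Back at the function's indentation level: runs after either branch
    return label


print(describe(None))
print(describe(42))
```

Moving the `return` line four spaces to the right would make it part of the `else` branch only, which changes what the function does without changing a single word of it.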


In [1]:
import this


The Zen of Python, by Tim Peters

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!

Source code

A program is basically a set of instructions to produce an output from an input. We write programs in programming languages, like Python. And the code of those programs is usually called source code.

Python code is usually stored in text files with the file ending .py:

myprogram.py

Every line in a Python program file is assumed to be a Python statement, or part thereof.

The only exception is comment lines, which start with the character # (optionally preceded by an arbitrary number of white-space characters, i.e., tabs or spaces). Comment lines are usually ignored by the Python interpreter.

To run our Python program from the command line we use:

$ python myprogram.py

Inside IPython, as we are right now, we use:

%run myprogram.py

Activity

Write a Python program that prints a message to the console.


In [3]:
print("Hello")


Hello

Values

Values are things that a program manipulates. There are different types of values:

  • Strings, str: "abcdef"
  • Integers, int: 7, 42, 97
  • Floating-point numbers, float: 3.792, 0.00005, 1.6e+9
  • Boolean, bool: True, False
  • Nothing, NoneType: None

To a computer, the integer 1 is not necessarily the same thing as the floating point number 1.0... because they have different types. Many of the errors you will make in programming result from mixing types inappropriately. Some languages (e.g., C, Fortran, Java) are very militant about types. You have to be totally explicit about them. Python is a little more relaxed. You can be explicit, but you don’t have to be. Python will guess if you don’t tell it. And you can ask Python to tell you its guess for the type of a value.

>>> type(12)
<class 'int'>
>>> type('Witty remark')
<class 'str'>
>>> type(3.75)
<class 'float'>
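A short sketch of what "different types" means in practice: the integer 1 and the float 1.0 compare as equal, yet their types differ, and combining a string with an integer using + raises a TypeError. The `try`/`except` construct used here to catch the error has not been covered yet; it is only there so the cell keeps running.

```python
# Equal in value, but not the same type
same_value = (1 == 1.0)
same_type = (type(1) is type(1.0))
print(same_value)   # True
print(same_type)    # False

# Mixing a string and an integer with + is an error
try:
    result = "3" + 4
except TypeError as error:
    result = None
    print("TypeError:", error)
```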

Activity

Try to guess the result of these *cells* before running them.


In [5]:
type(50.8)


Out[5]:
float

In [6]:
type('Cool story, bro!')


Out[6]:
str

In [7]:
type(10 / 1.5)


Out[7]:
float

It is also possible to convert between types, although sometimes this can produce unexpected outcomes. This is called type conversion or type casting.


In [8]:
str("45")


Out[8]:
'45'

In [9]:
int("45")


Out[9]:
45

In [15]:
bool("")


Out[15]:
False

In [17]:
int("1.98")


---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-17-f86c1e16b3e7> in <module>()
----> 1 int("1.98")

ValueError: invalid literal for int() with base 10: '1.98'
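The cells above hint at a few surprises. As an illustrative sketch, one way around the error is to convert the string to a float first, and it is worth noticing that `bool()` looks only at whether a string is empty, not at what it says.

```python
# int() rejects the string "1.98", but going through float() works
as_float = float("1.98")
as_int = int(as_float)
print(as_float)   # 1.98
print(as_int)     # 1  (the fractional part is dropped, not rounded)

# Any non-empty string is truthy, even the text "False"
print(bool("False"))   # True
print(bool(""))        # False
```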

Variables

Variables are one of the most important concepts in programming.

Variables let you store values in a labelled location.


In [18]:
a = 5
b = 1
c = a + b
c


Out[18]:
6

As stated by the Python language reference, assignment statements, which use the symbol =, bind and rebind names to values.

In short, it works as follows:

  • An expression on the right hand side is evaluated, the corresponding object is created/obtained.
  • A name on the left hand side is assigned, or bound, to the right hand side value (or object).

Something to notice here is that a single object can have several names bound to it.


In [5]:
a = 23
b = a
a


Out[5]:
23

In [6]:
b


Out[6]:
23

In [7]:
a is b


Out[7]:
True

The key concept here is mutable vs. immutable objects:

  • Mutable objects can be changed in place
  • Immutable objects cannot be modified once created
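A minimal sketch of the difference, assuming a list as the example of a mutable object (lists have not been introduced yet in this class) and an integer as the immutable one:

```python
# Lists are mutable: both names see a change made in place
first = [1, 2, 3]
second = first
second.append(4)
print(first)            # [1, 2, 3, 4]
print(first is second)  # True

# Integers are immutable: "+" builds a new object and rebinds only one name
a = 23
b = a
b = b + 1
print(a, b)             # 23 24
print(a is b)           # False
```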

Try this next:


In [ ]:
65 = m

Variable names and keywords

Generally, you want to choose names for your variables that are meaningful — they document what the variable is used for or what kind of value they store.

You can use whatever name you want, within a few restrictions set by the language. Python variable names can contain any combination of letters, digits, and the underscore character, "_", but they cannot begin with a digit.


In [21]:
my_2nd_variable = "lol"

In [19]:
$invalidvar = 2


  File "<ipython-input-19-d05096476053>", line 1
    $invalidvar = 2
    ^
SyntaxError: invalid syntax

In [22]:
class = "DH 2121G"


  File "<ipython-input-22-3b2d9d966707>", line 1
    class = "DH 2121G"
          ^
SyntaxError: invalid syntax

In the last example, however, the variable name class meets the requirements.

The problem is that Python has a small set of keywords that cannot be overwritten.

False None True and as assert
async await break class continue def
del elif else except finally for
from global if import in is
lambda nonlocal not or pass raise
return try while with yield

Keywords in Python
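Rather than memorizing the rules, you can ask Python. The sketch below uses the string method `isidentifier()` and the standard `keyword` module to check a few candidate names; the list `candidates` and the dictionary `results` are names chosen only for this illustration.

```python
import keyword

candidates = ["my_2nd_variable", "$invalidvar", "2nd_variable", "_private", "class"]

results = {}
for name in candidates:
    # isidentifier() checks the spelling rules; iskeyword() checks reserved words
    results[name] = name.isidentifier() and not keyword.iskeyword(name)
    print(name, "->", results[name])

# The full list of reserved words for the Python version you are running
print(keyword.kwlist)
```

Note that "class" passes the spelling check and fails only because it is a keyword.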

Usually, if you need to use more than one word in a variable name, you use underscores to separate them. This improves the readability of your code.


In [15]:
a_really_long_var_name = 9.99e+99

This way of naming variables is known as snake_case, for its resemblance to a snake (I know, you have to use your imagination), and it is used for the names of variables, functions, and methods.

Nevertheless, Python also uses another naming scheme called CamelCase when defining classes.
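A tiny sketch of both conventions side by side. The names `word_count`, `count_words`, and `WordCounter` are invented for illustration, and functions and classes are only previewed here.

```python
# snake_case for variables and functions
word_count = 3


def count_words(text):
    return len(text.split())


# CamelCase for classes
class WordCounter:
    pass


print(count_words("many smart doge") == word_count)  # True
```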

If you ever forget the difference between snake_case and CamelCase, use this mnemonic image.


*Remember the two humped CamelCase*

Statements and expressions

A statement is an instruction that tells Python to do something. Whatever the Python interpreter can execute is a statement. We have seen two kinds of statements so far: calls to print and assignments.

When you type a statement on the command line, or in an IPython cell, Python executes it and displays the result, if there is one. A call to print displays a value. Assignment statements don’t produce a result.

A script usually contains a sequence of statements. If there is more than one statement, the results appear one at a time as the statements execute.

For example, the script

print(1)
x = 2
print(x)

produces the output:

1
2

Again, the assignment statement produces no output.

Let's try it.


In [24]:
print("many smart")
doge = 10
print(doge)


many smart
10

An expression is, roughly, a combination of variables and values that can be crunched down to a value.


In [29]:
doge / 5 + 10


Out[29]:
12.0

In [28]:
7 // 2


Out[28]:
3
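The two cells above use two different division operators. As a quick sketch, `/` always produces a float, `//` rounds the quotient down to a whole number, and `%` gives the remainder:

```python
print(7 / 2)     # 3.5
print(7 // 2)    # 3
print(7 % 2)     # 1
print(10 / 5)    # 2.0  (a float, even though the division is exact)
print(-7 // 2)   # -4   (rounds down, not toward zero)
```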

And to make things more interesting, evaluating an expression is not quite the same thing as printing a value.


In [30]:
message = "What's up, Doc?"
message


Out[30]:
"What's up, Doc?"

In [31]:
print(message)


What's up, Doc?
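One way to see the difference outside a notebook is with the built-in functions `repr()` and `str()`. This is only a sketch of the idea: the notebook shows something close to `repr()` for a bare expression, while `print()` shows the text that `str()` produces.

```python
message = "What's up, Doc?"

# What the notebook shows for a bare expression: the quoted representation
print(repr(message))    # "What's up, Doc?"

# What print() shows: the text itself, without quotes
print(str(message))     # What's up, Doc?

# The difference is easier to see with a newline character
two_lines = "one\ntwo"
print(repr(two_lines))  # 'one\ntwo'
print(two_lines)        # prints "one" and "two" on separate lines
```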

Activity

Given the following code:

```python
a = 10
b = 50
b + a
```

How would you change it to display the final value?


In [33]:
a = 10
b = 50
b + a


Out[33]:
60

Operators

Operators are symbols (e.g., +, -, *, /, and, or, **) that tell Python to perform computations on expressions.

Those computations or operations can be performed on any kind of value, although not all values support all operators. Sometimes this can be confusing.

The following are all legal Python expressions whose meaning is more or less clear:

20+32
hour-1
hour*60+minute
minute/60
5**2
(5+9)*(15-7)

When more than one operator appears in an expression, the order of evaluation depends on the rules of precedence. Python follows the same precedence rules for its mathematical operators that mathematics does. The acronym PEMDAS, although not the nicest, is a useful way to remember the order of operations:

  • Parentheses have the highest precedence.
  • Exponentiation has the next highest precedence.
  • Multiplication and Division have the same precedence.
  • Addition and Subtraction, which also have the same precedence, have the lowest precedence.

Operators with the same precedence are evaluated from left to right, except exponentiation, which is evaluated from right to left.

So in the expression minute*100//60, which uses integer (floor) division and where minute has the value 59, the multiplication happens first, yielding 5900//60, which in turn yields 98. If the operations had been evaluated from right to left, the result would have been 59*1, which is 59, which is wrong.
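The precedence rules can be checked directly. The sketch below uses floor division, `//`, so that every result stays an integer:

```python
minute = 59

# Multiplication before addition
print(2 + 3 * 4)              # 14
# Parentheses override the default order
print((2 + 3) * 4)            # 20
# Same precedence: left to right, so (59 * 100) // 60
print(minute * 100 // 60)     # 98
# Forcing the right-hand operation first gives a different answer
print(minute * (100 // 60))   # 59
# Exponentiation groups right to left: 2 ** (3 ** 2)
print(2 ** 3 ** 2)            # 512
```

When in doubt, adding parentheses costs nothing and makes the intended order explicit.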

Activity

Experiment with the operators you know on strings (instead of just integers). Which ones work? What do they do? Try mixing strings and integers with various operators. What happens there?
*Remember to put `# -*- coding: utf-8 -*-` at the top of your file if you use non-ASCII characters and get an encoding error.*


In [36]:
# For example
"lol " * 40


Out[36]:
'lol lol lol lol lol lol lol lol lol lol lol lol lol lol lol lol lol lol lol lol lol lol lol lol lol lol lol lol lol lol lol lol lol lol lol lol lol lol lol lol '

In [37]:
nobody = "Nobody"
callsme = "calls me"
chicken = "chicken"
nobody + "... " + callsme + "... " + chicken + "!"


Out[37]:
'Nobody... calls me... chicken!'


Execute the next cell and try to understand what is going on.


In [39]:
print("I will now count my chickens:")

print("Hens", 25 + 30 / 6)
print("Roosters", 100 - 25 * 3 % 4)

print("Now I will count the eggs:")

print(3 + 2 + 1 - 5 + 4 % 2 - 1 / 4 + 6)


I will now count my chickens:
Hens 30.0
Roosters 97
Now I will count the eggs:
6.75

In [40]:
print("Is it true that 3 + 2 < 5 - 7?")

print(3 + 2 < 5 - 7)

print("What is 3 + 2?", 3 + 2)
print("What is 5 - 7?", 5 - 7)

print("Oh, that's why it's False.")

print("How about some more.")

print("Is it greater?", 5 > -2)
print("Is it greater or equal?", 5 >= -2)
print("Is it less or equal?", 5 <= -2)


Is it true that 3 + 2 < 5 - 7?
False
What is 3 + 2? 5
What is 5 - 7? -2
Oh, that's why it's False.
How about some more.
Is it greater? True
Is it greater or equal? True
Is it less or equal? False

What is wrong with the next expression?


In [42]:
hour = 5
minute = hour * 60
minute


Out[42]:
300

And what do you think the value of x is going to be?


In [43]:
x = 3
x = x + 7
x = x * x
x


Out[43]:
100

Multiple assignment is also allowed, and it can be used to swap variables.


In [45]:
a, b, c = 1, 2, 3
a, b = b, a
a, b


Out[45]:
(2, 1)

In [46]:
b, c = c, b
a, b, c


Out[46]:
(2, 3, 1)
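To see why the swap works, it helps to compare it with the longer version that uses a temporary variable. In this sketch, `temp` is just an illustrative name:

```python
a, b = 1, 2

# Swap with a temporary variable, the way many other languages do it
temp = a
a = b
b = temp
print(a, b)   # 2 1

# Swap back with multiple assignment: the right-hand side (b, a) is
# evaluated first, then its values are bound to the names on the left
a, b = b, a
print(a, b)   # 1 2
```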

For the next class