Model of Computation

Now that we know how computers store data in memory, we need a basic understanding of how computers process that data. Let's explore the simplest, fine-grained operations that a computer can perform. Ultimately, we will draw from these operations to design programs. Programmers think in terms of high-level operations, such as map, search, or filter, but our fingers type fine-grained code patterns associated with those high-level operations.

Canonical processor operations

As we saw in Representing data in memory, a computer's memory holds data temporarily while the processor works on that data. We typically load data into memory from the disk and organize it into a structure that is suitable for the computation we'd like to perform. In the end, though, memory just holds data. All of the action happens in the computer processor (CPU), which performs five principal operations:

  • load small chunks of data from memory into the CPU
  • perform arithmetic computations on data in the CPU
  • store small chunks of data back to memory
  • jump to a new location (this is how we loop)
  • jump to a new location if condition is true

Processors execute low-level machine instructions that perform one or more of those principal operations. Each instruction does a tiny amount of work (like adding two numbers) but the processor can do them extremely fast, on the order of billions a second. Writing a program in these low-level machine instructions would be extremely tedious, so we typically use programming languages such as Python to make our lives easier.

To give you an idea of just how low-level these machine operations are, consider the following simple assignment statement.

total = cost + tax

Even something this simple requires the processor to execute multiple low-level instructions. The processor must look at the values of cost and tax in memory, add the two values, then store the result back into memory at the address associated with total.

Order of operations

Unless given instructions to the contrary, the processor keeps executing instructions one after the other. For example, given the following pseudocode sequence, a processor would execute the assignment statements and then execute the print.


In [1]:
cost = 100.0
tax = 8.1
total = cost + tax
print(total)


108.1

The notion of doing a sequence of operations in order is familiar to us from cooking with recipes. For example, a recipe might direct us to:

put ingredients in bowl
mix ingredients together
pour into baking pan
bake at 375 degrees for 40 minutes

We naturally assume the steps are given in execution order.

Conditional execution

Some recipes give conditional instructions, such as

if not sweet enough, add some sugar

Similarly, processors can conditionally execute one or a group of operations.


In [2]:
x = -3
if x<0:
    x = 0
    print("was negative")
print(x)


was negative
0

Conditional operations execute only if the conditional expression is true. To be clear, the processor does not execute all of the operations present in the program.

When mapping a real-world problem to a conditional statement, your goal is to identify these key elements:

  1. the conditional expression
  2. the operation(s) to perform if the condition is true

A template for conditional execution looks like:

if condition:
    operation 1
    operation 2
    ...

The condition must be actionable by a computer. For example, the condition "the cat is hungry" is not actionable by computer because there's nothing in its memory a computer can test that is equivalent to the cat being hungry. Conditions almost always consist of equality or relational operators for arithmetic, such as cost > 10, cost + tax < 100, or quantity == 4.

It's time to introduce a new data type: boolean, which holds either a true or false value. The result (value) of any equality or relational operator is boolean. For example, "3>2" evaluates to true and "3>4" evaluates to false.

In some cases, we want to execute an operation in either case, one operation if the condition is true and a different operation if the condition is false. For example, we can express a conditional operation to find the larger of two values, x and y, as:


In [3]:
x = 99
y = 210
if x > y: max = x
else: max = y
print(max)


210

Conditional execution is kind of like executing an operation zero or one times, depending on the conditional expression. We also need to execute operations multiple times in some cases.

Repeated execution

Most of the programming you will do involves applying operations to data, which means operating on data elements one by one. This means we need to be able to repeat instructions, but we rarely use such a low level construct as: while condition, do this or that. For example, it's rare that we want to print hello five times:


In [4]:
for i in range(5):
    print("Hello")


Hello
Hello
Hello
Hello
Hello

(range(n) goes from 0 to n-1)

On the other hand, generic loops that go around until the condition is met can be useful, such as "read data from the network until the data stops." Or, we can do computations such as the following rough approximation to log base 10:


In [5]:
n = 1000
log = 0
i = n
while i > 1:
    log += 1
    print(i)
    i = i / 10
print(f"Log of {n} is {log}")


1000
100.0
10.0
Log of 1000 is 3

Most of the time, though, we are scanning through data, which means a "for each loop" or "indexed for loop."

For-each loops

The for-each loop iterates through a sequence of elements, such as a list but can be any iteratable object, such as a file on the disk. It is the most common kind of loop you will use. Here is the pattern:

for x in iterable_object:
    process x in some way

For example:


In [18]:
for name in ['parrt', 'mary', 'tombu']:
    print(name.upper())


PARRT
MARY
TOMBU

We can iterate through more complicated objects too, such as the following numpy matrix:


In [37]:
import numpy as np
A = np.array([[19,11],
              [21,15],
              [103,18],
              [99,13],
              [8,2]])
for row in A:
    print(row)


[19 11]
[21 15]
[103  18]
[99 13]
[8 2]

Or even a dataframe:


In [40]:
from pandas import DataFrame

df = DataFrame(data=[[99,'parrt'],[101,'sri'],[42,'kayla']],
               columns=['ID','user'])
df


Out[40]:
ID user
0 99 parrt
1 101 sri
2 42 kayla

In [45]:
for row in df.itertuples():
    print(f"Info {row.ID:04d}: {row.user}")


Info 0099: parrt
Info 0101: sri
Info 0042: kayla

A useful bit of Python magic gives us the iteration index as well as the iterated value:


In [9]:
for i,row in enumerate(A):
    print(f"{i+1}. {row}")


1. [19 11]
2. [21 15]
3. [103  18]
4. [99 13]
5. [8 2]

In [26]:
# same as the enumerate
i = 1
for row in A:
    print(f"{i}. {row}")
    i += 1


1. [19 11]
2. [21 15]
3. [103  18]
4. [99 13]
5. [8 2]

With this code pattern, our goal is to find a good iterated variable named, identify the conditional, and then identify the operation(s) to repeat.

Indexed loops

It's sometimes useful to execute an indexed loop. These are useful when we have to iterate through multiple lists at the same time. For example, if we have a list of names and their phone numbers, we might want to use an indexed loop:


In [67]:
names = ['parrt', 'mary', 'tombu']
phones = ['5707', '1001', '3412']
for i in range(len(names)):
    name = names[i]
    phone = phones[i]
    print(f"{name:>8}: {phone}")


   parrt: 5707
    mary: 1001
   tombu: 3412

zip'd loops

And here is how the cool kids do the same thing without an indexed loop (using a foreach):


In [70]:
for name, phone in zip(names,phones):
    print(f"{name:>8}: {phone}")


   parrt: 5707
    mary: 1001
   tombu: 3412

In [74]:
for i, (name, phone) in enumerate(zip(names,phones)):
    print(f"{i}. {name:>8}: {phone}")


0.    parrt: 5707
1.     mary: 1001
2.    tombu: 3412

List comprehensions


In [12]:
names = ['parrt', 'mary', 'tombu']
[name.upper() for name in names]


Out[12]:
['PARRT', 'MARY', 'TOMBU']

In [76]:
[name for name in names] # make (shallow) copy of list


Out[76]:
['parrt', 'mary', 'tombu']

In [30]:
[name for name in names if name.startswith('m')]


Out[30]:
['mary']

In [28]:
[name.upper() for name in names if name.startswith('m')]


Out[28]:
['MARY']

In [14]:
Quantity = [6, 49, 27, 30, 19, 21, 12, 22, 21]
[q*10 for q in Quantity]


Out[14]:
[60, 490, 270, 300, 190, 210, 120, 220, 210]

In [15]:
Quantity2 = [6, 49, 0, 30, -19, 21, 12, 22, 21]
[q*10 for q in Quantity2 if q>0]


Out[15]:
[60, 490, 300, 210, 120, 220, 210]

In [16]:
# Find indexes of all values <= 0 in Quantity2
[i for i,q in enumerate(Quantity2) if q<=0]


Out[16]:
[2, 4]

In [36]:
[f"{name}: {phone}" for name, phone in zip(names,phones)]


Out[36]:
['parrt: 5707', 'mary: 1001', 'tombu: 3412']

Translating formulas

Sigma notation from mathematics translates in a straightforward fashion to indexed loops. For example:

We pick elements from the summations and insert them into the template for an indexed loop.

Exercises

Exercise: Given a string containing the digits of a number, such as s = "501", convert that number to an integer and print it out. Work backwards from the desired result, n. Recall that 501 = 5*100 + 0*10 + 1*1. Start with the simplest possible bit of Python and then tidy it up using any cool constructs you know from Python. Hint: The cool kids will end up using "Horner's rule." What happens if the string is empty? You can't use int('501') but you need int('5') on single digit.

Exercise: Reuse that pattern to convert a string containing binary digits of a binary number, such as s = "1101", to an integer, n, and print it out. 1101 binary is 13 in decimal. 1101 is $1*2^3 + 1*2^2 + 0*2^1 + 1*2^0$.

Exercise: Given two lists, such as a = [9, 3] and b = [1, 4, 10], create and print a new list, c, containing alternating elements from the input lists. In this case, the output would be [9, 1, 3, 4, 10]. Start by assuming the same number of elements and then try for the more general case. What happens if one or both lists are empty?

Exercise: Python has a built-in function called zip(a,b) that is a handy way to get a list of tuples containing elements from lists a and b. For example, if a = [9, 3] and b = [1, 4, 10], zip(a,b) gives a sequence of tuples (9, 1), (3, 4). The built in zip stops when one of the lists of runs out of elements, but we want to fill in missing elements with None: you should get output list c = [(9, 1), (3, 4), (None, 10)]. In this exercise, we use the ideas or even the code itself from the previous exercise to implement your own zip functionality. The only difference is that you should fill in missing elements with None.

If you get stuck, or just to check your answers, you can check my solutions.

Summary

Other than transferring data to and from memory, processors primarily perform arithmetic operations, such as "cost + tax". Processors can also conditionally or repeatedly execute operations.

When mapping real-world problems to pseudocode, you'll follow the program or function work plan and eventually work backwards from the desired result to identify a suitable sequence of operations. These operations will either map to our high level programming operations or to the lower level pseudocode patterns described here.

If you can't identify a higher level operation for a piece of the problem, try to map it to a conditional operation or a loop around one or more operations.

For conditionals, you have to identify the conditional Boolean expression and the operation or operations that should be executed conditionally:

if condition:
    operation 1
    operation 2
    ...

If you need to execute code in case that condition fails, use this template:

if condition:
    operation 1
    operation 2
    ...
else:
    operation 1
    operation 2
    ...

For repeated execution, we have a generic loop that executes one or more operations while a condition is met:

loop setup, usually init counter or value to update in loop
while condition:
    operation 1
    operation 2
    ...

A very common version of a loop traverses a sequence, such as a list, with a variable that takes on each value of the sequence one at a time:

for each x in sequence:
    operate on x

When iterating through multiple lists at the same time, we use an indexed list of the form:

for i in some integer_set or range:
    operation 1
    operation 2
    ...