Now that we know how computers store data in memory, we need a basic understanding of how computers process that data. Let's explore the simplest, fine-grained operations that a computer can perform. Ultimately, we will draw from these operations to design programs. Programmers think in terms of high-level operations, such as map, search, or filter, but our fingers type fine-grained code patterns associated with those high-level operations.
As we saw in Representing data in memory, a computer's memory holds data temporarily while the processor works on that data. We typically load data into memory from the disk and organize it into a structure that is suitable for the computation we'd like to perform. In the end, though, memory just holds data. All of the action happens in the computer processor (CPU), which performs five principal operations:
Processors execute low-level machine instructions that perform one or more of those principal operations. Each instruction does a tiny amount of work (like adding two numbers) but the processor can do them extremely fast, on the order of billions a second. Writing a program in these low-level machine instructions would be extremely tedious, so we typically use programming languages such as Python to make our lives easier.
To give you an idea of just how low-level these machine operations are, consider the following simple assignment statement.
total = cost + tax
Even something this simple requires the processor to execute multiple low-level instructions. The processor must look at the values of cost
and tax
in memory, add the two values, then store the result back into memory at the address associated with total
.
Unless given instructions to the contrary, the processor keeps executing instructions one after the other. For example, given the following pseudocode sequence, a processor would execute the assignment statements and then execute the print.
In [1]:
cost = 100.0
tax = 8.1
total = cost + tax
print(total)
The notion of doing a sequence of operations in order is familiar to us from cooking with recipes. For example, a recipe might direct us to:
put ingredients in bowl
mix ingredients together
pour into baking pan
bake at 375 degrees for 40 minutes
We naturally assume the steps are given in execution order.
Some recipes give conditional instructions, such as
if not sweet enough, add some sugar
Similarly, processors can conditionally execute one or a group of operations.
In [2]:
x = -3
if x<0:
x = 0
print("was negative")
print(x)
Conditional operations execute only if the conditional expression is true. To be clear, the processor does not execute all of the operations present in the program.
When mapping a real-world problem to a conditional statement, your goal is to identify these key elements:
A template for conditional execution looks like:
if condition:
operation 1
operation 2
...
The condition must be actionable by a computer. For example, the condition "the cat is hungry" is not actionable by computer because there's nothing in its memory a computer can test that is equivalent to the cat being hungry. Conditions almost always consist of equality or relational operators for arithmetic, such as cost > 10
, cost + tax < 100
, or quantity == 4
.
It's time to introduce a new data type: boolean, which holds either a true or false value. The result (value) of any equality or relational operator is boolean. For example, "3>2" evaluates to true and "3>4" evaluates to false.
In some cases, we want to execute an operation in either case, one operation if the condition is true and a different operation if the condition is false. For example, we can express a conditional operation to find the larger of two values, x and y, as:
In [3]:
x = 99
y = 210
if x > y: max = x
else: max = y
print(max)
Conditional execution is kind of like executing an operation zero or one times, depending on the conditional expression. We also need to execute operations multiple times in some cases.
Most of the programming you will do involves applying operations to data, which means operating on data elements one by one. This means we need to be able to repeat instructions, but we rarely use such a low level construct as: while condition, do this or that. For example, it's rare that we want to print hello five times:
In [4]:
for i in range(5):
print("Hello")
(range(n)
goes from 0 to n
-1)
On the other hand, generic loops that go around until the condition is met can be useful, such as "read data from the network until the data stops." Or, we can do computations such as the following rough approximation to log base 10:
In [5]:
n = 1000
log = 0
i = n
while i > 1:
log += 1
print(i)
i = i / 10
print(f"Log of {n} is {log}")
Most of the time, though, we are scanning through data, which means a "for each loop" or "indexed for loop."
The for-each loop iterates through a sequence of elements, such as a list but can be any iteratable object, such as a file on the disk. It is the most common kind of loop you will use. Here is the pattern:
for x in iterable_object:
process x in some way
For example:
In [18]:
for name in ['parrt', 'mary', 'tombu']:
print(name.upper())
We can iterate through more complicated objects too, such as the following numpy matrix:
In [37]:
import numpy as np
A = np.array([[19,11],
[21,15],
[103,18],
[99,13],
[8,2]])
for row in A:
print(row)
Or even a dataframe:
In [40]:
from pandas import DataFrame
df = DataFrame(data=[[99,'parrt'],[101,'sri'],[42,'kayla']],
columns=['ID','user'])
df
Out[40]:
In [45]:
for row in df.itertuples():
print(f"Info {row.ID:04d}: {row.user}")
A useful bit of Python magic gives us the iteration index as well as the iterated value:
In [9]:
for i,row in enumerate(A):
print(f"{i+1}. {row}")
In [26]:
# same as the enumerate
i = 1
for row in A:
print(f"{i}. {row}")
i += 1
With this code pattern, our goal is to find a good iterated variable named, identify the conditional, and then identify the operation(s) to repeat.
It's sometimes useful to execute an indexed loop. These are useful when we have to iterate through multiple lists at the same time. For example, if we have a list of names and their phone numbers, we might want to use an indexed loop:
In [67]:
names = ['parrt', 'mary', 'tombu']
phones = ['5707', '1001', '3412']
for i in range(len(names)):
name = names[i]
phone = phones[i]
print(f"{name:>8}: {phone}")
In [70]:
for name, phone in zip(names,phones):
print(f"{name:>8}: {phone}")
In [74]:
for i, (name, phone) in enumerate(zip(names,phones)):
print(f"{i}. {name:>8}: {phone}")
In [12]:
names = ['parrt', 'mary', 'tombu']
[name.upper() for name in names]
Out[12]:
In [76]:
[name for name in names] # make (shallow) copy of list
Out[76]:
In [30]:
[name for name in names if name.startswith('m')]
Out[30]:
In [28]:
[name.upper() for name in names if name.startswith('m')]
Out[28]:
In [14]:
Quantity = [6, 49, 27, 30, 19, 21, 12, 22, 21]
[q*10 for q in Quantity]
Out[14]:
In [15]:
Quantity2 = [6, 49, 0, 30, -19, 21, 12, 22, 21]
[q*10 for q in Quantity2 if q>0]
Out[15]:
In [16]:
# Find indexes of all values <= 0 in Quantity2
[i for i,q in enumerate(Quantity2) if q<=0]
Out[16]:
In [36]:
[f"{name}: {phone}" for name, phone in zip(names,phones)]
Out[36]:
Sigma notation from mathematics translates in a straightforward fashion to indexed loops. For example:
We pick elements from the summations and insert them into the template for an indexed loop.
Exercise: Given a string containing the digits of a number, such as s = "501"
, convert that number to an integer and print it out. Work backwards from the desired result, n
. Recall that 501 = 5*100 + 0*10 + 1*1
. Start with the simplest possible bit of Python and then tidy it up using any cool constructs you know from Python. Hint: The cool kids will end up using "Horner's rule." What happens if the string is empty? You can't use int('501')
but you need int('5')
on single digit.
Exercise: Reuse that pattern to convert a string containing binary digits of a binary number, such as s = "1101"
, to an integer, n
, and print it out. 1101 binary is 13 in decimal. 1101 is $1*2^3 + 1*2^2 + 0*2^1 + 1*2^0$.
Exercise: Given two lists, such as a = [9, 3]
and b = [1, 4, 10]
, create and print a new list, c
, containing alternating elements from the input lists. In this case, the output would be [9, 1, 3, 4, 10]
. Start by assuming the same number of elements and then try for the more general case. What happens if one or both lists are empty?
Exercise: Python has a built-in function called zip(a,b)
that is a handy way to get a list of tuples containing elements from lists a
and b
. For example, if a = [9, 3]
and b = [1, 4, 10]
, zip(a,b)
gives a sequence of tuples (9, 1), (3, 4)
. The built in zip
stops when one of the lists of runs out of elements, but we want to fill in missing elements with None
: you should get output list c = [(9, 1), (3, 4), (None, 10)]
. In this exercise, we use the ideas or even the code itself from the previous exercise to implement your own zip
functionality. The only difference is that you should fill in missing elements with None
.
If you get stuck, or just to check your answers, you can check my solutions.
Other than transferring data to and from memory, processors primarily perform arithmetic operations, such as "cost + tax". Processors can also conditionally or repeatedly execute operations.
When mapping real-world problems to pseudocode, you'll follow the program or function work plan and eventually work backwards from the desired result to identify a suitable sequence of operations. These operations will either map to our high level programming operations or to the lower level pseudocode patterns described here.
If you can't identify a higher level operation for a piece of the problem, try to map it to a conditional operation or a loop around one or more operations.
For conditionals, you have to identify the conditional Boolean expression and the operation or operations that should be executed conditionally:
if condition:
operation 1
operation 2
...
If you need to execute code in case that condition fails, use this template:
if condition:
operation 1
operation 2
...
else:
operation 1
operation 2
...
For repeated execution, we have a generic loop that executes one or more operations while a condition is met:
loop setup, usually init counter or value to update in loop
while condition:
operation 1
operation 2
...
A very common version of a loop traverses a sequence, such as a list, with a variable that takes on each value of the sequence one at a time:
for each x in sequence:
operate on x
When iterating through multiple lists at the same time, we use an indexed list of the form:
for i in some integer_set or range:
operation 1
operation 2
...