Notebook-7: Lists

Notebook Content

In previous code camp notebooks we looked at numeric (integers and floats) and textual (strings) data, but it's probably been quite difficult to imagine how you'd assemble these simple data into something useful. The new data types (lists and dictionaries) will begin to show you how that can happen, and they will allow you to 'express' much more complex concepts with ease.

What's the Difference?

Up until now, our variables have only held one thing: a number, or a string.

In other words, we've had things like, myNumber = 5 or myString = "Hello world!". And that's it.

Now, with lists and dictionaries we can store multiple things: several numbers, several strings, or some mish-mash of both. That is, lists and disctionaries are data structures that can contain multuple data types.

Here's a brief summary of these data structures, highlighting their main difference:

  • A list is an ordered collection of 'items' (numbers, strings, etc.). You can ask for the first item in a list, the 3rd, or the 1,000th, it doesn't really matter because your list has an order and it keeps that order.
  • A dictionary is an unordered collection of 'items' (numbers, strings, etc.). You access items in a dictionary in a similar way to how you access a real dictionary: you have a 'key' (i.e. the word for which you want the definition) and you use this to look up the 'value' (i.e. the definition of the word).

There's obviously a lot more to lists and dictionaries than this, but it's a good starting point.

Let's start with lists in this notebook and we'll continue with dictionaries in the following.

In this Notebook

  • Indexing & Slicing
  • List operations
    • Addition and Mulitplication
    • You're (not) in the list!
  • List functions
    • insert
    • append
    • index
    • len

Lists

So a list is an ordered collection of items that we access by position (what Python calls the index) within that collection. So the first item in a list is always the first item in the list. End of (well, sort of... more on this later).

Because lists contain multiple data items, we create and use them differently from the simple (single item) variables we've seen so far. You can always spot a list because it is a series of items separated by commas and grouped together between a pair of square brackets ( [A, B, C, ..., n] ).

Here is a list of 4 items assigned to a variable called myList:


In [ ]:
myList = [1,2,4,5]
print(myList)

Lists are pretty versatile: they don't really care what kind of items you ask them to store. So you can use a list to hold items of all the other data types that we have seen so far!

Below we assign a new list to a variable called myList and then print it out so that you can see that it 'holds' all of these different types of data:


In [ ]:
myList = ['hi I am', 2312, 'mixing', 6.6, 90, 'strings, integers and floats']
print(myList)

We don't want to get too technical here, but it's important to note one thing: myList is still just one thing – a list – and not six things ('hi I am', 2312, 'mixing', 6.6, 90, 'strings, integers and floats'). So we can only assign one list to a variable called myList. It's like the difference between a person and a crowd: a crowd is one thing that holds many people inside it...

Indexing

To access an item in a list you use an index. This is a new and important concept and it's a term that we're going to use a lot in the future, so try to really get this term into your head before moving forward in the notebook.

Accessing Elements in a List using Indexes

The index is just the location of the item in the list that you want to access. So let's say that we want to fetch the second item, we access it via the index notation like so:


In [ ]:
myList = ['hi', 2312, 'mixing', 6.6, 90, 'strings, integers and floats' ]
print(myList[1])

See what has happened there? We have:

  1. Assigned a list with 6 elements to the the variable myList.
  2. Accessed the second element in the list using that element's index between a pair of square brackets (next to the list's name).

Wait a sec – didn't we say second element? Why then is the index 1???

Zero-Indexing

Good catch! That's because list indexes are zero-based: this is a fancy way to say that the count starts from 0 instead that from 1. So the first element has index 0, and the last element has index n-1 (i.e. the count of the number of items in the list [n] minus one). Zero indexing is a bit how like the ground floor in the UK is often shown as floor 0 in a lift.

To recap:


In [ ]:
myNewList = ['first', 'second', 'third']
print("The first element is: " + myNewList[0])
print("The third element is: " + myNewList[2])

Negative Indexing

Since programmers are lazy, they also have a short-cut for accessing the end of a list. Since positive numbers count from the start of a list, negative numbers count from the end:


In [ ]:
print(myNewList[-1])
print(myNewList[-2])

The last element has index -1. So you count forwards from 0, but backwards from -1.

You can remember it this way: the last item in the list is at n-1 (where n is the number of items in the list), so ...[-1] is a sensible way to access the last item.

A challenge for you!

Edit the code so that it prints the 'second' element in the list


In [ ]:
print("The second element is :" + myNewList[???])

Index Out of Range

What happens when you try to access an element that doesn't exist?

We know that myList has 3 elements, so what if we try to access the 200th element in the list? In that case Python, as usual, will inform us of the problem using an error message pointing to the problem:


In [ ]:
print(myNewList[200])

A challenge for you!

Do you remember the past lesson on syntax errors and exceptions? What is the error message displayed in the code above? Is it an exception or a syntax error? Can you find the explanation for what's going in the Official Documentation?

A String is a List?

Even if you didn't realise it, you have already been working with lists in a sense because strings are basically lists! Think about it this way: strings are an ordered sequence of characters because 'hello' and 'olhel' are very different words! It turns out that characters in a string can be accessed the same way we'd access a generic list.


In [ ]:
myString = "ABCDEF"
print(myString[0])
print(myString[-1])

Slicing

If you want to access more than one item at a time, then you can specify a range using two index values instead of just one.

If you provide two numbers, Python will assume you are indicating the start and end of a group of items. This operation is called list slicing, but keep in mind that indexes start from 0!

Note: remember too that the error above when we tried to get the 200th element was index out of range. So 'range' is how Python talks about more than one list element.


In [ ]:
shortSentence = "Now I'll just print THIS word, between the 20th and the 25th character: "
print(shortSentence[20:25])

A challenge for you!

Using the previous code as a guide, edit the code below so that it prints from the second to the fourth (inclusive) characters from the string:


In [ ]:
shortSentence2 = "A12B34c7.0"
print(shortSentence2[???:???])

To print the entirety of a list from any starting position onwards, just drop the second index value while leaving the : in your code:


In [ ]:
stringToPrint = "I will print from HERE onwards"
print("Starting from the 17th position: " + stringToPrint[17:])

Notice that there are two spaces between "position:" and "HERE" in the printout above? That's because the 17th character is a space. Let's make it a little more obvious:


In [ ]:
print("Starting from the 17th position: '" + stringToPrint[17:] + "'")

Got it?

A challenge for you!

Now, combining what we've seen above, how do you think you would print everything up to the eigth character from the end (which is the space between "HERE" and "onwards")?

You'll need to combine:

  1. Negative indexing
  2. List slicing

There are two ways to do it, one way uses only one number, the other uses two. Both are correct. Why don't you try to figure them both out? For 'Way 2' below the ??? is a placeholder for a full slicing operation since if we gave you more of a hint it would make it too obvious.


In [ ]:
print("Up to the 18th position (Way 1): '" + stringToPrint[???:???] + "'")
print("Up to the 18th position (Way 2): '" + ??? + "'")

Strings have also plenty of methods that might prove to be quite useful in the future; for a fuller overview check out this reference.

List operations

So far, we've only created a list, but just like a real to-do list most lists don't usually stay the same throughout the day (or the execution of your application). Their real value comes when we start to change them: adding and removing items, updating an existing item, concatenting several lists (i.e. sticking them together), etc.

Replacing an item

Here's how we replace an item in a list:


In [ ]:
myNewList = ['first', 'second', 'third']
print(myNewList)

# This replaces the item in the 2nd position
myNewList[1] = 'new element'

print(myNewList)

This shouldn't surprise you too much: it's just an assignment (via "=") after all!

So if you see list[1] on the right side of the assignement (the "=") then we are reading from a list, but if you see list[1] on the left side of the assignment then we are writing to a list.

Here's an example for a (small) party at a friend's place:


In [ ]:
theParty = ['Bob','Doug','Louise','Nancy','Sarah','Jane']
print(theParty)

theParty[1] = 'Phil' # Doug is replaced at the party by Phil
print(theParty)

theParty[0] = theParty[1]
print(theParty) # Phil is an evil genius and manages to also replace Doug with a Phil clone

Got it?

Addition and Multiplication

You can also operate on entire lists at one time, rather than just on their elements individually. For instance, given two lists you might want to add them together like so:


In [ ]:
britishProgrammers = ["Babbage", "Lovelace"]
nonBritishProgrammers = ["Torvald", "Knuth"]

famousProgrammers = britishProgrammers + nonBritishProgrammers
print(famousProgrammers)

You can even multiply them, although in this particular instance it is kind of pointless:


In [ ]:
print(britishProgrammers * 2)

A challenge for you!

Correct the syntax in the following code to properly define a new list:


In [ ]:
otherNonBritishProgrammers = ["Wozniak" ??? "Van Rossum"]

Edit the following code to print out all the non british programmers:


In [ ]:
print nonBritishProgrammers ??? otherNonBritishProgrammers

You're (not) in the list!

Ever stood outside a club or event and been told "You're not on/in the list"? Well, Python is like that too. In fact, Python tries as hard as possible to be like English – this isn't by accident, it's by design – and once you've done a bit of programming in Python you can start to guess how to do something by thinking about how you might say it in English.

So if you want to check if an item exists in a list you can use the in operator:

element in list

The in operator will return True if the item is present, and False otherwise. This is a new data type (called a Boolean) that we'll see a lot more of in two notebooks' time.


In [ ]:
print ('Lovelace' in britishProgrammers)
print ('Lovelace' in nonBritishProgrammers)

letters = ['a','b','c','d','e','f','g','h','i']
print ('e' in letters)
print ('z' in letters)

Note: You might also have spotted that this time there are parentheses ("(...)") after print. In general, as you become more experienced you'll always want to put parentheses after a print statement (because that's how Python3 works) but the intricacies of why this is the case are a bit out of the scope of an introductory set of lessons.

Anyway, if you want to check if an item does not exist in a list then you can use the not in operator. Let's go back to our party:


In [ ]:
print(theParty)
print('Bob' not in theParty)
print('Jane' not in theParty)

So here, that 'Boolean' gives us True on the first not in because it's "true that Bob isn't at the party" and False on the second one because it's "not true that Jane isn't at the party"! Double-negatives aren't supposed to exist in English, but they certainly do in programming!

A challenge for you!

Complete the missing bits of the following code so that we print out Ada Lovelace, her full name:


In [ ]:
firstProgrammerSurnames = ["Babbage", "Lovelace"]
firstProgrammerNames    = ["Charles", "Ada"]

firstProgrammerSurnames[1] = firstProgrammerNames[1] + " " + firstProgrammerSurnames[1]

print("Lady "+ ???[1] +" is considered to be the first programmer.")

Note : Actually, Lady Ada Lovelace is a fascinating person: she isn't just the first female programmer, she was the first programmer full-stop. For many years Charles Babbage got all the credit for inventing computers simply because he was a guy and designed a clever mechanical adding machine. However, lately we've realised that Ada was the one who actually saw that Babbage hadn't just invented a better abacus, he'd invented a general purpose computing device!

She was so far ahead of her time that the code she wrote down couldn't even run on Babbage's 'simple' (i.e. remarkably complex for the time) computer, but it is now recognised as the first computer algorithm. As a result, there is now a day in her honour every year that is celebrated around the world at places like Google and Facebook, as well as at King's and MIT, because we want to recognise the fundamental contribution to computing made by women programmers.

This contribution was long overlooked by the men who thought that the hard part was the machine, not the programming. Rather entertainingly (given the kudos attached to the people who created applications like Google and Facebook), most men thought that programming was just like typing and was therefore 'women's work'. So why not take a few minutes to recognise the important contributions of women to the history of computing.

Extending Lists

We've already seen that we can combine two lists using the + operator, but if you wanted to constantly add new items to your list then having to do something like this would be annoying:

myList = [] # An empty list

myList = myList + ['New item']
myList = myList + ['Another new item']
myList = myList + ['And another one!']

print(myList)

Not just annoying, but also hard to read! As with most things, because programmers are lazy there's an easier way to write this in Python:

myList = [] # An empty list

myList.append('New item')
myList.append('Another new item')
myList.append('And another one!')

print(myList)

Why don't you try typing it all in the coding area below? You'll get the same answer either way, but one is faster to write and easier to read!


In [ ]:

Appending to a list using append(...) is actually using something called a function. We've not really seen this concept before and we're not going to go into it in enormous detail here (there's a whole other notebook to introduce you to functions). The things that you need to notice at this point in time are:

  1. That square brackets ('[' and ']') are used for list indexing.
  2. That parentheses ('(' and ')') are (normally) used for function calls.

The best thing about functions is that they are like little packages of code that can do a whole bunch of things at once (e.g. add an item to a list by modifying the list directly), but you only need to remember to write append(...).

What did I mean just now by 'modifying the list directly'?

Notice that in the first example above we had to write:

myList = myList + ['New item']

because we had to write the result of concatening the two lists together back to the variable. The list isn't really growing, we're just overwriting what was already in myList with the results of the list addition operation. What do you think you would get if you wrote the following code:

myList = [] # An empty list

myList + ['New item']
myList + ['Another new item']
myList + ['And another one!']

print(myList)

If you aren't sure, why don't you try typing this into the coding area above and try to figure out why the answer is: ''.

Meanwhile, in the second example we could just write:

myList.append('New item')

and the change was made to myList directly! So this is easier to read and it is more like what we'd expect to happen: the list grows without us having to overwrite the old variable.

Other List Functions

append() is a function, and there are many other functions that can be applied to lists such as: len, insert and index. You tell Python to execute a function (sometimes also termed calling a function) by writing the function's name, followed by a set of parentheses. The parentheses will contain any optional inputs necessary for the function to do its job. Appending to a list wouldn't be very useful if you couldn't tell Python what to append in the first place!

The len function is also a good example of this:

len(theParty)

Here, the function len (lazy short-hand for length ) is passed theParty list as an input in order to do its magic.

The functions append, insert and index work a little bit differently. They have to be called using theParty list. We're at risk of joining Alice down the rabbit-hole here, so let's just leave it at: the second set of functions are of a particular type known as methods of the list class. We'll stop there.

In order to use a list method you always need to 'prepend' (i.e. lead with) the name of list you want to work with, like so:

theParty.append("New Guest")
theParty.insert(2, "Anastasia")
theParty.index("Sarah")

The idea here is that methods are associated with specific types of things such as lists, whereas generic functions are kind of free-floating. Think about it this way: you can't append something to a number, because that makes no sense. Appending is something that only makes sense in the context of a list. In contrast, something like len works on several different types of variables: lists and string both!

Append

Reminder: here's appending...


In [ ]:
britishProgrammers = ['Lovelace']
britishProgrammers.append("Turing")
print(britishProgrammers)

Insert

That's cool, but as you noticed append only ever inserts the new item as the last element in the list. What if you want it to go somewhere else?

With insert you can also specify a position


In [ ]:
print(nonBritishProgrammers)
nonBritishProgrammers.insert(1, "Swartz")
print(nonBritishProgrammers)

Index

Lastly, with the index method you can easily ask Python to find the position (index) of a given item:


In [ ]:
# Say you want to know in where "Knuth" is 
# in the list of non-British programmers...
print(nonBritishProgrammers.index("Knuth"))

A challenge for you!

Add the famous Grace Hopper (inventress of the first compiler!) to the list of British programmers, and then print her index position:


In [ ]:
nonBritishProgrammers.???("Hopper") 
print(nonBritishProgrammers.???("Hopper"))

Length

Cool, so those were some of the methods you can invoke on a list. Let's focus now on some functions that take lists as an input.

With the function len you can immediately know the len-gth of a given list:


In [ ]:
print( len(britishProgrammers) )

In [ ]:
length_of_list = len(nonBritishProgrammers)
print("There are " + str(length_of_list) + " elements in the list of non-British Programmers")

Did you see the str(length_of_list)? There's another function! We didn't draw attention to it before, we just told you to use str(5.0) to convert the float 5.0 to the string "5.0". We can tell it's a function because it uses the format functionName(...some input...). So the function name is str (as in convert to string) and the input is a number (in this case it's the length of the list nonBritishProgrammers). So now we can easily convert between different types of data: we took an integer (you can check this by adding the following line to the code above:

print length_of_list + 1

And then printing it out as a string. So length_of_list is a number, and by calling str(length_of_list) we changed it to a string that we could print out. Given that programmers are lazy, can you guess how you'd convert the string "3" to the integer 3?

A challenge for you!

Complete the missing bits of the following code:


In [ ]:
length_of_brits = ???(britishProgrammers)
print "There are " ??? " British programmers."

To check if the output of range() is a list we can use the type() function:

Code (Applied Geo-example)

Let's have a little play with some geographical coordinates. Based on what we've just done in this notebook, what do you think would be a good data type for storing lat/long coordinates?

Go on, I'll bet you can't guess!

Let's use what we know to jump to a particular point on the planet...


In [ ]:
# We'll see more of this package import
# later... for now, just know that it 
# provides access to a function called IFrame
from IPython.display import IFrame

# We want to view an OpenStreetMap map
siteName  = "http://www.openlinkmap.org/small.php"

# Specify the location and zoom level
latitude  = 63.6314
longitude = -19.6083
zoomLevel = 10
mapURL    = ['?lat=', str(latitude), '&lon=', str(longitude), '&zoom=', str(zoomLevel)]

# Show the ULR
print(siteName + ''.join(mapURL))

# And now show us an inline map!
IFrame(siteName + ''.join(mapURL), width='100%', height=400)

And now let's try somewhere closer to home. Try using the KCLMapCoordinates list below (which tells you the x/y and zoom-level) in combination with the code above. I'd suggest copying what you need from above, pasting it into the code below, and then using what you know about accessing list information so that the map shows the location of King's College London...


In [ ]:
# King's College coordinates
# What format are they in? Does it seem appropriate?
KCLMapCoordinates = [51.51130657591914, -0.11596798896789551, 15]

Credits!

Contributors:

The following individuals have contributed to these teaching materials:

License

The content and structure of this teaching project itself is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 license, and the contributing source code is licensed under The MIT License.

Acknowledgements:

Supported by the Royal Geographical Society (with the Institute of British Geographers) with a Ray Y Gildea Jr Award.

Potential Dependencies:

This notebook may depend on the following libraries: None