This notebook is based on one put together by [Jake Vanderplas](http://www.vanderplas.com) and has been modified to suit the purposes of this course, including expansion/modification of explanations and additional exercises. Source and license info for the original is on [GitHub](https://github.com/jakevdp/2014_fall_ASTR599/).

Names: [Insert Your Names Here]

Lab 6 - Advanced Data Structures

Lab 6 Contents

  1. Tuples
    • Defining Tuples
    • Indexing Tuples
    • Tuple Modification
  2. Lists
    • Defining Lists
    • Indexing Lists
    • Extending and Appending Lists
    • Searching, Sorting and Counting Lists
    • Exploring List Methods
    • Iterating over lists
    • The range function
    • Creating lists on the fly
  3. Sets
  4. Dictionaries
    • Defining Dictionaries
    • Dictionary Keys

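This lab will introduce you to four new types of Python objects that let you collect data of arbitrary (and often mixed) type. They are loosely known as "container" (or "collection") objects; strictly speaking, only tuples and lists are "sequence" objects, because only they store their elements by position: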

  • tuple : an immutable ordered array of data
  • list : a mutable ordered array of data
  • set : an unordered collection of unique elements
  • dict : a mapping from unique keys to values (it remembers insertion order in Python 3.7 and later, but it is looked up by key, not by position)

In [ ]:
from numpy import *

1. Tuples

1.1 Defining Tuples

Tuples are denoted with round parentheses


In [ ]:
t = (12, -1)
type(t)

If you'd like to test whether an object is a tuple (or any other type of object), you can use the python function isinstance


In [ ]:
isinstance(t,tuple)

In [ ]:
isinstance(t,list)
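
isinstance also accepts a tuple of types as its second argument, which is handy when an object may be any one of several types. A minimal sketch (the names is_sequence and is_number are chosen just for this illustration):

```python
t = (12, -1)

# A tuple of types means "is t an instance of ANY of these types?"
is_sequence = isinstance(t, (tuple, list))
is_number = isinstance(t, (int, float))

print(is_sequence, is_number)
```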

Tuples have lengths just like other types of Python objects


In [ ]:
print(len(t))

and you can mix types within a tuple


In [ ]:
t = (12, "monty", True, -1.23e6)
t

1.2 Indexing Tuples

Indexing works the same way as for arrays:


In [ ]:
t[0]

In [ ]:
t[-1]

In [ ]:
t[-2:]  # get the last two elements, return as a tuple

Single element tuples look like (element,) rather than (element)


In [ ]:
x = (True) ; print(type(x))  #by the way, did you know you can execute two commands on one line with a semicolon?
x = (True,) ; print(type(x))

In [ ]:
x = ()
type(x), len(x) #and you can also return multiple things to the output of a notebook cell with commas

1.3 Tuple Modification

Tuples cannot be modified. The following cell will spit out an error.


In [ ]:
t[2] = False

but you can create a new tuple by combining elements from other tuples


In [ ]:
newt = t[0:2], False, t[3:]
type(newt), newt

Note the above did something, but not exactly what you might think. It created a three element tuple, where the first (index 0) and third (index 2) elements are themselves tuples


In [ ]:
len(newt), type(newt[0]), type(newt[1]), type(newt[2])

This can have its uses, but more often you will want a new tuple that matches the original except for one or more changed elements. For that we use concatenation instead, just like we did with strings.

But concatenation is tricky. What's wrong with the following statement?


In [ ]:
t[0:2] + False + t[3:]

similarly:


In [ ]:
'I can not concatenate things like ' + 7 + ' and ' + 'monkeys'

You can only concatenate objects of the same type, so you have to use the trick for a single element tuple, as described above


In [ ]:
y = t[0:2] + (False,) + t[3:]
y

So tuples are immutable, but not indestructible. Once we've built a new tuple, we can assign it to any name we like, including one that is already in use (here x), which replaces whatever that name referred to before.


In [ ]:
x=y
x

Similarly, we could have done this without creating a new variable, but note that this discards the original tuple.


In [ ]:
t = t[0:2] + (False,) + t[3:]
t

Like strings, you can also "multiply" tuples to duplicate elements


In [ ]:
t * 2

Tuples are most commonly used in functions that return multiple values.
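
As a rough illustration of that use, the sketch below defines a made-up helper, min_and_max (not part of Python or numpy), that hands back two results at once. The comma in the return statement packs them into a tuple, and the caller can unpack them into separate names.

```python
def min_and_max(values):
    """Return the smallest and largest entries of values as a tuple."""
    return min(values), max(values)  # the comma packs both results into a tuple

result = min_and_max([3, -1, 7, 2])
print(type(result), result)

# Tuple unpacking assigns each element of the returned tuple to its own name
lowest, highest = min_and_max([3, -1, 7, 2])
print(lowest, highest)
```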

2. Lists

2.1 Defining Lists

Python lists are denoted with square brackets. We've dealt with them indirectly a bit already in this class, but it's worth discussing them explicitly here.


In [ ]:
v = [1,2,3]
print(len(v))
print(type(v))

2.2 Indexing Lists

Lists can be indexed


In [ ]:
v[0:2], v[-1]

In [ ]:
v = v[2:]
print(v)

Lists can contain multiple data types, including tuples and other lists


In [ ]:
v = ["eggs", "spam", -1, ("monty","python"), [-1.2,-3.5]]
len(v)

Unlike tuples, however, lists are mutable.


In [ ]:
v[0] ="green egg"
v[1] += ",love it." # this takes what's already in v[1] and adds what comes after +=
v

You can index multi-element objects within a list as well, but in this case, you index variable[list element index][index of thing you want], as below. Note this is slightly different from the way you were taught to index a numpy array with arrayname[row, column], but the same chained syntax also works with numpy arrays (arrayname[row][column])


In [ ]:
v[-1][1] = None
print(v)

In [ ]:
z = array([[1,2],[3,4]])
z[0][1], z[0,1]

Sidebar: A Note on lists vs. arrays

In fact, lists can be made to look a lot like numpy arrays (e.g. vv = [ [1,2], [3,4] ] makes a list that looks just like the numpy array above), but it's important to note that the properties of a list object are slightly different. Specifically:

  • Since a list contains pointers to a bunch of python objects, it takes more memory to store an array in list format than as an array (which points to a single object in memory). Operations on large arrays will be much faster than on equivalent lists, because list operations require a variety of type checks, etc.
  • Many mathematical operations, particularly matrix operations, will only work on numpy arrays
  • Lists support insertion, deletion, appending and concatenation in ways that arrays do not, as detailed in the next section

So each is useful for its own thing. Lists are useful for storing mixed type objects associated with one another and their mutability allows insertion, deletion, etc. Arrays are useful for storing and operating on large matrices of numbers.
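
One concrete consequence, sketched here with plain lists only: multiplying a list by an integer repeats the list rather than scaling each entry, so elementwise arithmetic on a list needs an explicit loop or comprehension. (Multiplying a numpy array by 2 would instead double every element.) The names nums, repeated and doubled are just for this example.

```python
nums = [1, 2, 3]

repeated = nums * 2                # repetition: the whole list is duplicated
doubled = [2 * x for x in nums]    # elementwise: each entry is scaled

print(repeated)
print(doubled)
```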

2.3 Extending and Appending Lists

Useful list methods:

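  • .append(): adds a single new element to the end of the list
  • .extend(): adds every element of another list (or other iterable) to the end of the list
  • .pop(): removes and returns an element (the last one by default)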

In [ ]:
v = [1,2,3]
v.append(4)
v.append([-5])
v

Note: lists can be considered objects. Objects are collections of data and associated methods. In the case of a list, append is a method: it is a function associated with the object.


In [ ]:
v = v[:4]
w = ['elderberries', 'eggs']
v + w

In [ ]:
v

In [ ]:
v.extend(w)
v

In [ ]:
z = v.pop()
z

In [ ]:
v

In [ ]:
v.pop(0) ## pop the first element

In [ ]:
v
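
The difference between .append() and .extend() is easy to miss, so here is a small side-by-side sketch. The list names are invented for the example, and .copy() is used only so that both methods start from the same contents.

```python
base = [1, 2, 3]

appended = base.copy()
appended.append([4, 5])    # the whole list [4, 5] becomes ONE new element

extended = base.copy()
extended.extend([4, 5])    # each item of [4, 5] is added individually

print(appended, len(appended))
print(extended, len(extended))
```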

2.4 Searching, Sorting, and Counting Lists


In [ ]:
v = [1, 3, 2, 3, 4]
v.sort()
v

reverse is a keyword argument of the .sort() method


In [ ]:
v.sort(reverse=True)
v

.sort() changes the list in place
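
If you would rather keep the original order around, the built-in sorted() function returns a new list instead. A short sketch of the contrast, using a throwaway list called scores:

```python
scores = [3, 1, 2]

ordered = sorted(scores)     # returns a NEW sorted list
print(ordered, scores)       # scores itself is unchanged at this point

outcome = scores.sort()      # sorts in place and returns None
print(outcome, scores)
```

Because .sort() returns None, writing something like scores = scores.sort() would lose the list, which is a common slip.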


In [ ]:
v.index(4)   ## lookup the index of the entry 4

In [ ]:
v.index(3)

In [ ]:
v.count(3)

In [ ]:
v.insert(0, "it's full of stars")
v

In [ ]:
v.remove(1)
v

2.5 Exploring List Methods

Jupyter is your new best friend: its tab-completion allows you to explore all methods available to an object. (This only works in Jupyter, not in the command line)

Type

v.

and then the tab key to see all the available methods:


In [ ]:
v.

Once you find a method, type (for example)

v.index?

and press shift-enter: you'll see the documentation of the method


In [ ]:
v.index?

This is probably the most important thing you'll learn today

2.6 Iterating Over Lists


In [ ]:
a = ['cat', 'window', 'defenestrate']
for x in a:
    print(x, len(x))

In [ ]:
#enumerate is a useful command that returns ordered pairs of the form 
#(index, array element) for all of the elements in a list
for i,x in enumerate(a):
    print(i, x, len(x))

In [ ]:
# print all the elements in the list with spaces between
for x in a:
    print(x, end=' ')

The syntax for iteration is...

for variable_name in iterable:
   # do something with variable_name
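
The "iterable" in that pattern does not have to be a list. As a quick sketch, the same loop works on a tuple and on a string (the variable names are arbitrary):

```python
# A tuple can be looped over exactly like a list
for item in (12, "monty", True):
    print(type(item).__name__, item)

# A string yields its characters one at a time
letters = []
for ch in "spam":
    letters.append(ch.upper())
print(letters)
```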

2.7 The range() function

The range() function generates a sequence of integers

(in Python 3 it is actually a lazy range object that produces its values on demand, but you can think of it as a list)


In [ ]:
x = range(4)
x

In [ ]:
total = 0
for val in range(4):
    total += val
    print("By adding " + str(val) + \
          " the total is now " + str(total))

range([start,] stop[, step]) → sequence of integers
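
To see what a particular combination of start, stop and step produces, one option is to wrap the range in list(), which forces all of the values to be generated. A brief sketch:

```python
print(list(range(4)))           # stop only: starts at 0 and stops before 4
print(list(range(1, 10, 2)))    # start, stop, step

# A negative step counts down; the stop value is still excluded
countdown = list(range(10, 0, -3))
print(countdown)
```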


In [ ]:
total = 0
for val in range(1, 10, 2):
    total += val
    print("By adding " + str(val) + \
          " the total is now " + str(total))

In practice, range behaves much like the numpy arange function that you've already seen. The difference is that arange creates a numpy array holding all of the elements between the start and stop point, which makes it the better choice when you want to do arithmetic on all of the values at once. Still, it's useful to be aware of range as well. arange can also be used with non-integer values, which is quite useful, whereas range accepts only integers. Note that the second cell below will result in an error.


In [ ]:
y = arange(4)
y

In [ ]:
z = range(0,10,0.1)
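
If you need evenly spaced non-integer values without numpy, one possible workaround (a sketch, not the only approach) is to loop over an integer range and scale the counter. The name tenths is made up for this example.

```python
# range() only accepts integers, so scale an integer counter instead
tenths = []
for i in range(100):
    tenths.append(i * 0.1)

print(len(tenths))
print(tenths[:3])
```

Multiplying the counter, rather than repeatedly adding 0.1, keeps floating-point rounding errors from accumulating along the list.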

2.8 Creating Lists on-the-fly

Example: imagine you want a list of all non-negative integers below 100 that are divisible by 7 or 11.


In [ ]:
L = []  #before populating the list, you must first define it!
for num in range(100):
    if (num % 7 == 0) or (num % 11 == 0):  #recall that % is the "mod" function
        L.append(num)
print(L)

We can also do this with a list comprehension:


In [ ]:
L = [num for num in range(100) if (num % 7 == 0) or (num % 11 == 0)]
print(L)

In [ ]:
# Can also operate on each element:
L = [2 * num for num in range(100) if (num % 7 == 0) or (num % 11 == 0)]
print(L)

Exercise 1


Write a loop over the words in this list and print the words longer than three characters in length:


In [ ]:
L = ["Oh", "Say", "does", "that", "star",
     "spangled", "banner", "yet", "wave"]

3. Sets

Sets can be thought of as unordered lists of unique items

Sets are denoted with curly braces


In [ ]:
{1,2,3,"bingo"}

The uniqueness aspect is the key here. Note that the output of the cell below is the same as the one above.


In [ ]:
{1,2,3,"bingo",3}

In [ ]:
type({1,2,3,"bingo"})

The set function will make a set out of whatever is provided.


In [ ]:
set("spamIam")

Sets have unique elements. They can be compared, differenced, unioned, etc.


In [ ]:
a = set("sp")
b = set("am")
print(a, b)

In [ ]:
c = set(["a","m"])
c == b

In [ ]:
"p" in a

In [ ]:
a | b
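
Beyond the union shown with |, sets support several other operators. The sketch below uses three small sets with invented names to show intersection, difference, symmetric difference and a subset test.

```python
first = set("sp")
second = set("am")
both = set("spam")

print(both & first)              # intersection: elements found in both sets
print(both - first)              # difference: in both, but not in first
print(first ^ both)              # symmetric difference: in exactly one of the two
print(first <= both)             # subset test: is every element of first in both?
print((first | second) == both)  # the union of first and second equals both
```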

4. Dictionaries

4.1 Defining Dictionaries

Dictionaries are mappings from keys to values. They are often useful when you want to assign multiple named properties to individuals. Each entry in a dictionary pairs a unique "key" with a value; the keys must all be different, but several keys may share the same value.

We'll show four ways to make a Dictionary


In [ ]:
# number 1... curly braces & colons
d = {"favorite cat": None,
     "favorite spam": "all"}
d

In [ ]:
# number 2
d = dict(one = 1, two=2, cat='dog')
d

In [ ]:
# number 3 ... just start filling in items/keys
d = {}  # empty dictionary
d['cat'] = 'dog'
d['one'] = 1
d['two'] = 2
d

In [ ]:
# number 4... start with a list of tuples and then use the dict function to create a dictionary with them
mylist = [("cat","dog"), ("one",1), ("two",2)]
dict(mylist)

In [ ]:
dict(mylist) == d

4.2 Dictionary Keys

Note that dictionary entries are looked up by key rather than by position, thus they cannot be indexed numerically the way lists and tuples can!


In [ ]:
d = {"favorite cat": None, "favorite spam": "all"}

In [ ]:
d[0]  # this breaks!  0 is not one of the keys of d, so Python raises a KeyError

They can, however, be indexed with an appropriate key.


In [ ]:
d["favorite spam"]

and the following syntax creates a new key, the integer 0 (not the string "0")


In [ ]:
d[0] = "this is a zero"
d
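
Two related tools are worth knowing when working with keys: the in operator checks whether a key exists, and the .get() method returns a fallback value instead of raising an error when it does not. The sketch below uses a separate dictionary named prefs so that d is left alone; the "favorite dog" key and the "n/a" fallback are invented for the example.

```python
prefs = {"favorite cat": None, "favorite spam": "all"}

print("favorite spam" in prefs)            # membership tests look at the keys
print(prefs.get("favorite dog", "n/a"))    # missing key: the default is returned

# .items() yields (key, value) tuples, which unpack neatly in a for loop
for key, value in prefs.items():
    print(key, "->", value)
```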

Dictionaries can contain dictionaries!


In [ ]:
d = {'favorites': {'cat': None, 'spam': 'all'},\
     'least favorite': {'cat': 'all', 'spam': None}}
d['least favorite']['cat']

Note: the backslash ('\') above allows you to break lines without interrupting the code. It is not technically needed when defining a dictionary or list, but it is useful in many instances when you have a long operation that is unwieldy in a single line of code.

Dictionaries are used everywhere within Python...


In [ ]:
# globals() and locals() store all global and local variables (in this case, since we've imported numpy, quite a few)
globals().keys()

Exercise 2


Below is a list of information on 50 of the largest near-earth asteroids.

(a) Given this list of asteroid information, find and list all asteroids with semi-major axis (a) within 0.2 AU of Earth's (1 AU), and with eccentricities (e) less than 0.5.

(b) Note that the object below is a list (denoted with square brackets) of tuples (denoted with round brackets), and that the orbit class object is a dictionary. Create a dictionary where the name of each asteroid is the key, and the object stored under that key is a three element tuple (semi-major axis (AU), eccentricity, orbit class).

(c) using the list (and not the dictionary), print the list of asteroids according to:
(i) alphabetical by asteroid name
(ii) in order of increasing semi-major axis
(iii) in order of increasing eccentricity
(iv) alphabetically by class (two-stage sorting)

hint: use the "sorted" function rather than object.sort, and check out the function "itemgetter" from the python module "operator" 
Bonus points if you can get it to print with the columns lined up nicely!
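
To get you started on the hint, here is a sketch of sorted and itemgetter on a small made-up list (deliberately not the asteroid data; the records and their fields are hypothetical). It also shows one way to do a two-stage sort and to pad columns with format specifiers.

```python
from operator import itemgetter

# Hypothetical records, NOT the asteroid data: (name, mass, category)
records = [("pear", 0.18, "B"), ("apple", 0.20, "A"),
           ("fig", 0.05, "B"), ("kiwi", 0.08, "A")]

by_name = sorted(records, key=itemgetter(0))   # sort on element 0 of each tuple
by_mass = sorted(records, key=itemgetter(1))   # sort on element 1 of each tuple

# Two-stage sort: sorted() is stable, so sort by the secondary key (name)
# first and then by the primary key (category)
by_cat_then_name = sorted(by_name, key=itemgetter(2))

# Format specifiers pad each field to a fixed width so the columns line up
for name, mass, cat in by_cat_then_name:
    print(f"{name:<8}{mass:>6.2f}  {cat}")
```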

In [ ]:
# Each element is (name, semi-major axis (AU), eccentricity, orbit class)
# source: http://ssd.jpl.nasa.gov/sbdb_query.cgi

Asteroids = [('Eros', 1.457916888347732, 0.2226769029627053, 'AMO'),
             ('Albert', 2.629584157344544, 0.551788195302116, 'AMO'),
             ('Alinda', 2.477642943521562, 0.5675993715753302, 'AMO'),
             ('Ganymed', 2.662242764279804, 0.5339300994578989, 'AMO'),
             ('Amor', 1.918987277620309, 0.4354863345648127, 'AMO'),
             ('Icarus', 1.077941311539208, 0.826950446001521, 'APO'),
             ('Betulia', 2.196489260519891, 0.4876246891992282, 'AMO'),
             ('Geographos', 1.245477192797457, 0.3355407124897842, 'APO'),
             ('Ivar', 1.862724540418448, 0.3968541470639658, 'AMO'),
             ('Toro', 1.367247622946547, 0.4358829575017499, 'APO'),
             ('Apollo', 1.470694262588244, 0.5598306817483757, 'APO'),
             ('Antinous', 2.258479598510079, 0.6070051516585434, 'APO'),
             ('Daedalus', 1.460912865705988, 0.6144629118218898, 'APO'),
             ('Cerberus', 1.079965807367047, 0.4668134997419173, 'APO'),
             ('Sisyphus', 1.893726635847921, 0.5383319204425762, 'APO'),
             ('Quetzalcoatl', 2.544270656955212, 0.5704591861565643, 'AMO'),
             ('Boreas', 2.271958775354725, 0.4499332278634067, 'AMO'),
             ('Cuyo', 2.150453953345012, 0.5041719257675564, 'AMO'),
             ('Anteros', 1.430262719980132, 0.2558054402785934, 'AMO'),
             ('Tezcatlipoca', 1.709753263222791, 0.3647772103513082, 'AMO'),
             ('Midas', 1.775954494579457, 0.6503697243919138, 'APO'),
             ('Baboquivari', 2.646202507670927, 0.5295611095751231, 'AMO'),
             ('Anza', 2.26415089613359, 0.5371603112900858, 'AMO'),
             ('Aten', 0.9668828078092987, 0.1827831025175614, 'ATE'),
             ('Bacchus', 1.078135348117527, 0.3495569270441645, 'APO'),
             ('Ra-Shalom', 0.8320425524852308, 0.4364726062545577, 'ATE'),
             ('Adonis', 1.874315684524321, 0.763949321566, 'APO'),
             ('Tantalus', 1.289997492877751, 0.2990853014998932, 'APO'),
             ('Aristaeus', 1.599511990737142, 0.5030618532252225, 'APO'),
             ('Oljato', 2.172056090036035, 0.7125729402616418, 'APO'),
             ('Pele', 2.291471988746353, 0.5115484924883255, 'AMO'),
             ('Hephaistos', 2.159619960333728, 0.8374146846143349, 'APO'),
             ('Orthos', 2.404988778495748, 0.6569133796135244, 'APO'),
             ('Hathor', 0.8442121506103012, 0.4498204013480316, 'ATE'),
             ('Beltrovata', 2.104690977122337, 0.413731105995413, 'AMO'),
             ('Seneca', 2.516402574514213, 0.5708728441169761, 'AMO'),
             ('Krok', 2.152545170235639, 0.4478259793515817, 'AMO'),
             ('Eger', 1.404478323548423, 0.3542971360331806, 'APO'),
             ('Florence', 1.768227407864309, 0.4227761019048867, 'AMO'),
             ('Nefertiti', 1.574493139339916, 0.283902719273878, 'AMO'),
             ('Phaethon', 1.271195939723604, 0.8898716672181355, 'APO'),
             ('Ul', 2.102493486378346, 0.3951143067760007, 'AMO'),
             ('Seleucus', 2.033331705805067, 0.4559159977082651, 'AMO'),
             ('McAuliffe', 1.878722427225527, 0.3691521497610656, 'AMO'),
             ('Syrinx', 2.469752836845105, 0.7441934504192601, 'APO'),
             ('Orpheus', 1.209727780883745, 0.3229034563257626, 'APO'),
             ('Khufu', 0.989473784873371, 0.468479627898914, 'ATE'),
             ('Verenia', 2.093231870619781, 0.4865133359612604, 'AMO'),
             ('Don Quixote', 4.221712367193639, 0.7130894892477316, 'AMO'),
             ('Mera', 1.644476057737928, 0.3201425983025733, 'AMO')]

orbit_class = {'AMO':'Amor', 'APO':'Apollo', 'ATE':'Aten'}

In [ ]:

