This notebook was put together by [Jake Vanderplas](http://www.vanderplas.com) for UW's [Astro 599](http://www.astro.washington.edu/users/vanderplas/Astr599_2014/) course. Source and license info is on [GitHub](https://github.com/jakevdp/2014_fall_ASTR599/).


In [1]:
%run talktools.py


Advanced Data Structures

There are four main types of built-in collections in Python

(lists and tuples are "sequence objects"; dicts and sets are not)

  • list : a mutable ordered array of data
  • tuple : an immutable ordered array of data
  • dict : a mapping from keys to values (items keep insertion order since Python 3.7, but cannot be indexed by position)
  • set : an unordered collection of unique elements

The values in any of these collections can be arbitrary Python objects, and mixing content types is OK.

Note that strings are also sequence objects.
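As a quick illustration of the four collection types listed above, here is a minimal sketch. The variable names and values are made up for this example and do not appear elsewhere in the notebook.

```python
# Illustrative values only (hypothetical names, not from the notebook).
sample_list = [1, "two", 3.0]          # mutable, ordered
sample_tuple = (1, "two", 3.0)         # immutable, ordered
sample_dict = {"one": 1, "two": 2.0}   # maps keys to values
sample_set = {1, "two", 3.0, 1}        # the duplicate 1 is dropped

for collection in (sample_list, sample_tuple, sample_dict, sample_set):
    print(type(collection).__name__, len(collection))

# Strings support the same indexing and slicing as lists and tuples
word = "monty"
print(word[0], word[-1], word[1:3])
```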

Tuples

Tuples are denoted with parentheses


In [2]:
t = (12, -1)
print(type(t))


<class 'tuple'>

In [3]:
print(isinstance(t,tuple))
print(len(t))


True
2

Types can be mixed within a tuple


In [4]:
t = (12, "monty", True, -1.23e6)
print(t[1])


monty

Indexing works the same way as for strings:


In [5]:
print(t[-1])


-1230000.0

In [6]:
t[-2:]  # get the last two elements, return as a tuple


Out[6]:
(True, -1230000.0)

In [7]:
x = (True) ; print(type(x))
x = (True,) ; print(type(x))


<class 'bool'>
<class 'tuple'>

In [8]:
x = ()
type(x), len(x)


Out[8]:
(tuple, 0)

In [9]:
x = (,)


  File "<ipython-input-9-bd05d59e2976>", line 1
    x = (,)
         ^
SyntaxError: invalid syntax

Single-element tuples look like (element,)

Tuples cannot be modified, but you can create a new one with concatenation


In [10]:
t[2] = False


---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-10-9365ccccf007> in <module>()
----> 1 t[2] = False

TypeError: 'tuple' object does not support item assignment

In [11]:
t[0:2], False, t[3:]


Out[11]:
((12, 'monty'), False, (-1230000.0,))

In [12]:
## the above is
## not what we wanted... need to concatenate
t[0:2] + False + t[3:]


---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-12-aaeb0198f3bf> in <module>()
      1 ## the above is
      2 ## not what we wanted... need to concatenate
----> 3 t[0:2] + False + t[3:]

TypeError: can only concatenate tuple (not "bool") to tuple

In [13]:
y = t[0:2] + (False,) + t[3:]
y


Out[13]:
(12, 'monty', False, -1230000.0)

In [14]:
t * 2


Out[14]:
(12, 'monty', True, -1230000.0, 12, 'monty', True, -1230000.0)

Tuples are most commonly used in functions that return multiple values.
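A minimal sketch of a function handing back several results as one tuple. The function name `min_and_max` is hypothetical and chosen only for this illustration.

```python
def min_and_max(values):
    """Return the smallest and largest entries of a non-empty iterable."""
    values = list(values)
    return min(values), max(values)   # the comma builds a tuple

result = min_and_max([3, -1, 12])
print(type(result), result)

# Tuple unpacking assigns each returned value to its own name
lo, hi = min_and_max([3, -1, 12])
print(lo, hi)
```

Unpacking only works when the number of names on the left matches the length of the tuple.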

Lists

Lists are denoted with square brackets


In [15]:
v = [1,2,3]
print(len(v))
print(type(v))


3
<class 'list'>

In [16]:
v[0:2]


Out[16]:
[1, 2]

In [17]:
v = ["eggs", "spam", -1, ("monty","python"), [-1.2,-3.5]]
len(v)


Out[17]:
5

In [18]:
v[0] ="green egg"
v[1] += ",love it."
v[-1]


Out[18]:
[-1.2, -3.5]

In [19]:
v[-1][1] = None
print(v)


['green egg', 'spam,love it.', -1, ('monty', 'python'), [-1.2, None]]

In [20]:
v = v[2:]
print(v)


[-1, ('monty', 'python'), [-1.2, None]]

In [21]:
# let's make a proto-array out of nested lists
vv = [ [1,2], [3,4] ]

In [22]:
len(vv)


Out[22]:
2

In [23]:
determinant = vv[0][0] * vv[1][1] - vv[0][1] * vv[1][0]
determinant


Out[23]:
-2

The main point here: lists are mutable
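One consequence of mutability that may be worth seeing once: assigning a list to a second name does not copy it. The sketch below uses made-up variable names to show the difference between an alias and a copy.

```python
original = [1, 2, 3]
alias = original          # a second name for the same list, not a copy
alias[0] = 99
print(original)           # the change is visible through both names

copied = original[:]      # slicing builds a new (shallow) copy
copied[0] = -1
print(original, copied)   # original is unaffected by changes to the copy
```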

Lists can be Extended and Appended


In [24]:
v = [1,2,3]
v.append(4)
v.append([-5])
v


Out[24]:
[1, 2, 3, 4, [-5]]

Note: lists can be considered objects. Objects are collections of data and associated methods. In the case of a list, append is a method: it is a function associated with the object.


In [25]:
v = v[:4]
w = ['elderberries', 'eggs']
v + w


Out[25]:
[1, 2, 3, 4, 'elderberries', 'eggs']

In [26]:
v


Out[26]:
[1, 2, 3, 4]

In [27]:
v.extend(w)
v


Out[27]:
[1, 2, 3, 4, 'elderberries', 'eggs']

In [28]:
v.pop()


Out[28]:
'eggs'

In [29]:
v


Out[29]:
[1, 2, 3, 4, 'elderberries']

In [30]:
v.pop(0) ## pop the first element


Out[30]:
1

In [31]:
v


Out[31]:
[2, 3, 4, 'elderberries']

Useful list methods:

  • .append(): adds a single new element to the end
  • .extend(): appends every element of another list (or any iterable)
  • .pop(): removes and returns an element (the last one by default)
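The difference between appending and extending is easy to mix up, so here is a small side-by-side sketch (the variable names are illustrative only).

```python
appended = [1, 2]
appended.append([3, 4])   # the list [3, 4] becomes ONE new element
print(appended, len(appended))

extended = [1, 2]
extended.extend([3, 4])   # each element of [3, 4] is added separately
print(extended, len(extended))

last = extended.pop()     # removes and returns the last element
first = extended.pop(0)   # removes and returns the element at index 0
print(last, first, extended)
```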

Lists can be searched, sorted, & counted


In [32]:
v = [1, 3, 2, 3, 4]
v.sort()
v


Out[32]:
[1, 2, 3, 3, 4]

reverse is a keyword argument of the .sort() method


In [33]:
v.sort(reverse=True)
v


Out[33]:
[4, 3, 3, 2, 1]

.sort() changes the list in place
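When the original order needs to be kept, the built-in sorted() function returns a new list instead. A short sketch contrasting the two (variable names are illustrative):

```python
values = [1, 3, 2, 3, 4]

ordered = sorted(values)                   # new list; values is untouched
descending = sorted(values, reverse=True)  # same keyword as .sort()
print(values, ordered, descending)

result = values.sort()                     # sorts in place and returns None
print(values, result)
```

A common slip is writing `values = values.sort()`, which replaces the list with None.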


In [34]:
v.index(4)   ## lookup the index of the entry 4


Out[34]:
0

In [35]:
v.index(3)


Out[35]:
1

In [36]:
v.count(3)


Out[36]:
2

In [37]:
v.insert(0, "it's full of stars")
v


Out[37]:
["it's full of stars", 4, 3, 3, 2, 1]

In [38]:
v.remove(1)
v


Out[38]:
["it's full of stars", 4, 3, 3, 2]

Using IPython to learn more

IPython is your new best friend: its tab-completion allows you to explore all methods available to an object. (This only works in IPython.)

Type

v.

and then the tab key to see all the available methods:


In [ ]:
v.

Once you find a method, type (for example)

v.index?

and press shift-enter: you'll see the documentation of the method


In [40]:
v.index?

This is probably the most important thing you'll learn today

Iterating over Lists


In [41]:
a = ['cat', 'window', 'defenestrate']
for x in a:
    print(x, len(x))


cat 3
window 6
defenestrate 12

In [42]:
for i,x in enumerate(a):
    print(i, x, len(x))


0 cat 3
1 window 6
2 defenestrate 12

In [43]:
for x in a:
    print(x, end=' ')


cat window defenestrate 

The syntax for iteration is...

for variable_name in iterable:
   # do something with variable_name
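The same syntax works for any iterable, not only lists. A brief sketch with a string, a tuple, and a list of tuples (all values here are made up for illustration):

```python
letters = []
for ch in "spam":                  # a string yields its characters
    letters.append(ch)
print(letters)

for item in (12, "monty", True):   # a tuple yields its elements
    print(item, type(item).__name__)

pairs = [("cat", 3), ("window", 6)]
for word, length in pairs:         # each tuple is unpacked into two names
    print(word, length)
```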

The range() function

The range() function generates a sequence of integers

(actually a lazy range object rather than a list, but think of it as a list)


In [44]:
x = range(4)
x


Out[44]:
range(0, 4)

In [45]:
total = 0
for val in range(4):
    total += val
    print("By adding " + str(val) + \
          " the total is now " + str(total))


By adding 0 the total is now 0
By adding 1 the total is now 1
By adding 2 the total is now 3
By adding 3 the total is now 6

range([start,] stop[, step]) → range object yielding integers
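To see which integers a given call produces, the range can be converted to a list. A small sketch of the three calling forms, assuming Python 3:

```python
print(list(range(4)))           # stop only: starts at 0, stop is excluded
print(list(range(1, 10, 2)))    # start, stop, step
print(list(range(10, 0, -3)))   # a negative step counts down

# A range also supports len(), indexing, and membership tests
r = range(1, 10, 2)
print(len(r), r[0], r[-1], 5 in r)
```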


In [46]:
total = 0
for val in range(1, 10, 2):
    total += val
    print("By adding " + str(val) + \
          " the total is now " + str(total))


By adding 1 the total is now 1
By adding 3 the total is now 4
By adding 5 the total is now 9
By adding 7 the total is now 16
By adding 9 the total is now 25

Quick Exercise:

Write a loop over the words in this list and print the words longer than three characters in length:


In [47]:
L = ["Oh", "Say", "does", "that", "star",
     "spangled", "banner", "yet", "wave"]

Sets

Sets can be thought of as unordered lists of unique items

Sets are denoted with curly braces


In [48]:
{1,2,3,"bingo"}


Out[48]:
{1, 2, 3, 'bingo'}

In [49]:
type({1,2,3,"bingo"})


Out[49]:
set

In [50]:
type({})


Out[50]:
dict

In [51]:
type(set())


Out[51]:
set

In [52]:
set("spamIam")


Out[52]:
{'I', 'a', 'm', 'p', 's'}

Sets have unique elements. They can be compared, differenced, unioned, etc.


In [53]:
a = set("sp")
b = set("am")
print(a, b)


{'p', 's'} {'a', 'm'}

In [54]:
c = set(["a","m"])
c == b


Out[54]:
True

In [55]:
"p" in a


Out[55]:
True

In [56]:
a | b


Out[56]:
{'a', 'm', 'p', 's'}
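Besides the union shown with `|`, sets support the other comparisons mentioned above. The sketch below uses two small example sets; results are passed through sorted() only so that the printed order is predictable.

```python
a = set("spam")   # the letters s, p, a, m
b = set("ham")    # the letters h, a, m

print(sorted(a & b))      # intersection: elements in both
print(sorted(a - b))      # difference: in a but not in b
print(sorted(a ^ b))      # symmetric difference: in exactly one of them
print({"a", "m"} <= a)    # subset test
print(a.isdisjoint(b))    # True only if the sets share no elements
```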

Dictionaries

Dictionaries map unique keys to values.

We'll show four ways to make a Dictionary


In [57]:
# number 1... curly braces & colons
d = {"favorite cat": None,
     "favorite spam": "all"}
d


Out[57]:
{'favorite spam': 'all', 'favorite cat': None}

In [58]:
# number 2
d = dict(one = 1, two=2, cat='dog')
d


Out[58]:
{'cat': 'dog', 'two': 2, 'one': 1}

In [59]:
# number 3 ... just start filling in items/keys
d = {}  # empty dictionary
d['cat'] = 'dog'
d['one'] = 1
d['two'] = 2
d


Out[59]:
{'cat': 'dog', 'two': 2, 'one': 1}

In [60]:
# number 4... start with a list of tuples
mylist = [("cat","dog"), ("one",1), ("two",2)]
dict(mylist)


Out[60]:
{'cat': 'dog', 'two': 2, 'one': 1}

In [61]:
dict(mylist) == d


Out[61]:
True

Dictionaries can be complicated (in a good way)

Note that a dictionary cannot be indexed by position! (Before Python 3.7, the order of its items was not guaranteed either.)


In [62]:
d = {"favorite cat": None, "favorite spam": "all"}

In [63]:
d[0]  # this breaks!  Dictionaries have no order


---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
<ipython-input-63-d1fc1eca4ebb> in <module>()
----> 1 d[0]  # this breaks!  Dictionaries have no order

KeyError: 0

In [64]:
d["favorite spam"]


Out[64]:
'all'

In [65]:
d[0] = "this is a zero"
d


Out[65]:
{0: 'this is a zero', 'favorite spam': 'all', 'favorite cat': None}
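Since looking up a missing key raises a KeyError, it can help to test for the key first or to supply a fallback. A minimal sketch using the standard `in` operator and the .get(), .items(), .keys(), and .values() methods; the key "favorite dog" is made up for the example.

```python
d = {"favorite cat": None, "favorite spam": "all"}

print("favorite cat" in d)                  # membership tests look at keys
print(d.get("favorite dog", "no entry"))    # fallback instead of KeyError

for key, value in d.items():                # iterate over key/value pairs
    print(key, "->", value)

print(list(d.keys()), list(d.values()))
```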

Dictionaries can contain dictionaries!


In [66]:
d = {'favorites': {'cat': None, 'spam': 'all'},\
     'least favorite': {'cat': 'all', 'spam': None}}
d['least favorite']['cat']


Out[66]:
'all'

Remember: the backslash ('\') allows you to break a statement across lines. It is not technically needed inside the brackets of a dictionary or list definition.
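A short sketch of both styles: inside brackets, braces, or parentheses a line break needs no backslash, while outside them it does.

```python
# Implicit continuation: no backslash needed inside { }, [ ] or ( )
d = {'favorites': {'cat': None, 'spam': 'all'},
     'least favorite': {'cat': 'all', 'spam': None}}

total = (1 + 2 +
         3 + 4)

# Explicit continuation: the backslash must be the last character on the line
message = "the total is " + \
          str(total)
print(d['least favorite']['cat'], message)
```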

Dictionaries are used everywhere within Python...


In [67]:
# globals() and locals() store all global and local variables
globals().keys()


Out[67]:
dict_keys(['_i62', 'b', 'mylist', 'd', '_i56', 'L', '_i10', '__builtins__', '_i47', '_i11', '_i25', '_i', '_ih', '_ii', '__doc__', '_25', 'v', '_i22', 'x', '_57', '_18', '_i13', '_i29', '_i28', '_50', '_51', 'print_function', '_29', '_28', '_i18', '_i19', '__', '_i14', '_i15', '_i16', '_i17', '_27', '_26', '_i12', '_24', 'vv', '___', '_58', 'i', '_i64', '_8', '_i58', '_i8', '_i9', 'display', 'determinant', '_i2', '_i3', '_i1', '_i6', '_i7', '_i4', '_i5', '_i50', '_i51', '_i52', '_i53', '_i54', '_i55', 't', '_i57', '_i26', '_i59', '_38', '_55', '_i24', '_30', '_31', '_32', '_33', '_13', '_i27', '_36', '_37', '_dh', '_11', '_i21', '_i66', '_16', '_i20', 'get_ipython', 'a', 'c', '_14', '__builtin__', 'HTML', '_34', 'Out', '_i44', '_35', '_i43', '_i42', '_i41', '_i40', '_oh', 'total', '_i45', 'w', 'y', '_i49', '_i48', '_54', '_i38', '_i39', '_i61', '_i32', '_i33', '_i30', '_i31', '_i36', '_i37', '__name__', '__nonzero__', '_i46', '_i65', '_i23', '_i67', '_44', '_52', 'val', '_60', '_', 'quit', '_i60', '_49', '_48', 'In', '_6', '_i34', '_59', '_64', '_iii', '_i63', '_17', '_66', '_65', 'exit', '_sh', '_23', '__warningregistry__', '_i35', '_56', '_22', '_61'])

List Comprehensions

A pythonic way of creating lists on-the-fly

Example: imagine you want a list of all non-negative numbers below 100 which are divisible by 7 or 11


In [68]:
L = []
for num in range(100):
    if (num % 7 == 0) or (num % 11 == 0):
        L.append(num)
print(L)


[0, 7, 11, 14, 21, 22, 28, 33, 35, 42, 44, 49, 55, 56, 63, 66, 70, 77, 84, 88, 91, 98, 99]

We can also do this with a list comprehension:


In [69]:
L = [num for num in range(100)\
     if (num % 7 == 0) or (num % 11 == 0)]
print(L)


[0, 7, 11, 14, 21, 22, 28, 33, 35, 42, 44, 49, 55, 56, 63, 66, 70, 77, 84, 88, 91, 98, 99]

In [70]:
# Can also operate on each element:
L = [2 * num for num in range(100)\
     if (num % 7 == 0) or (num % 11 == 0)]
print(L)


[0, 14, 22, 28, 42, 44, 56, 66, 70, 84, 88, 98, 110, 112, 126, 132, 140, 154, 168, 176, 182, 196, 198]
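The general pattern is `[expression for item in iterable if condition]`, where the `if` part is optional. As a sketch on a smaller, made-up example, the loop and the comprehension below build the same list:

```python
# Loop version
squares_loop = []
for n in range(10):
    if n % 2 == 0:
        squares_loop.append(n ** 2)

# Equivalent list comprehension
squares_comp = [n ** 2 for n in range(10) if n % 2 == 0]
print(squares_comp, squares_comp == squares_loop)

# Without a condition, every item is transformed
doubled = [2 * n for n in range(5)]
print(doubled)
```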

Example: Below is a list of information on 50 of the largest near-Earth asteroids. Given this list of asteroid information, let's find all asteroids whose semi-major axis is within 0.2 AU of Earth's (1 AU), and whose eccentricity is less than 0.5


In [71]:
# Each element is (name, semi-major axis (AU), eccentricity, orbit class)
# source: http://ssd.jpl.nasa.gov/sbdb_query.cgi

Asteroids = [('Eros', 1.457916888347732, 0.2226769029627053, 'AMO'),
             ('Albert', 2.629584157344544, 0.551788195302116, 'AMO'),
             ('Alinda', 2.477642943521562, 0.5675993715753302, 'AMO'),
             ('Ganymed', 2.662242764279804, 0.5339300994578989, 'AMO'),
             ('Amor', 1.918987277620309, 0.4354863345648127, 'AMO'),
             ('Icarus', 1.077941311539208, 0.826950446001521, 'APO'),
             ('Betulia', 2.196489260519891, 0.4876246891992282, 'AMO'),
             ('Geographos', 1.245477192797457, 0.3355407124897842, 'APO'),
             ('Ivar', 1.862724540418448, 0.3968541470639658, 'AMO'),
             ('Toro', 1.367247622946547, 0.4358829575017499, 'APO'),
             ('Apollo', 1.470694262588244, 0.5598306817483757, 'APO'),
             ('Antinous', 2.258479598510079, 0.6070051516585434, 'APO'),
             ('Daedalus', 1.460912865705988, 0.6144629118218898, 'APO'),
             ('Cerberus', 1.079965807367047, 0.4668134997419173, 'APO'),
             ('Sisyphus', 1.893726635847921, 0.5383319204425762, 'APO'),
             ('Quetzalcoatl', 2.544270656955212, 0.5704591861565643, 'AMO'),
             ('Boreas', 2.271958775354725, 0.4499332278634067, 'AMO'),
             ('Cuyo', 2.150453953345012, 0.5041719257675564, 'AMO'),
             ('Anteros', 1.430262719980132, 0.2558054402785934, 'AMO'),
             ('Tezcatlipoca', 1.709753263222791, 0.3647772103513082, 'AMO'),
             ('Midas', 1.775954494579457, 0.6503697243919138, 'APO'),
             ('Baboquivari', 2.646202507670927, 0.5295611095751231, 'AMO'),
             ('Anza', 2.26415089613359, 0.5371603112900858, 'AMO'),
             ('Aten', 0.9668828078092987, 0.1827831025175614, 'ATE'),
             ('Bacchus', 1.078135348117527, 0.3495569270441645, 'APO'),
             ('Ra-Shalom', 0.8320425524852308, 0.4364726062545577, 'ATE'),
             ('Adonis', 1.874315684524321, 0.763949321566, 'APO'),
             ('Tantalus', 1.289997492877751, 0.2990853014998932, 'APO'),
             ('Aristaeus', 1.599511990737142, 0.5030618532252225, 'APO'),
             ('Oljato', 2.172056090036035, 0.7125729402616418, 'APO'),
             ('Pele', 2.291471988746353, 0.5115484924883255, 'AMO'),
             ('Hephaistos', 2.159619960333728, 0.8374146846143349, 'APO'),
             ('Orthos', 2.404988778495748, 0.6569133796135244, 'APO'),
             ('Hathor', 0.8442121506103012, 0.4498204013480316, 'ATE'),
             ('Beltrovata', 2.104690977122337, 0.413731105995413, 'AMO'),
             ('Seneca', 2.516402574514213, 0.5708728441169761, 'AMO'),
             ('Krok', 2.152545170235639, 0.4478259793515817, 'AMO'),
             ('Eger', 1.404478323548423, 0.3542971360331806, 'APO'),
             ('Florence', 1.768227407864309, 0.4227761019048867, 'AMO'),
             ('Nefertiti', 1.574493139339916, 0.283902719273878, 'AMO'),
             ('Phaethon', 1.271195939723604, 0.8898716672181355, 'APO'),
             ('Ul', 2.102493486378346, 0.3951143067760007, 'AMO'),
             ('Seleucus', 2.033331705805067, 0.4559159977082651, 'AMO'),
             ('McAuliffe', 1.878722427225527, 0.3691521497610656, 'AMO'),
             ('Syrinx', 2.469752836845105, 0.7441934504192601, 'APO'),
             ('Orpheus', 1.209727780883745, 0.3229034563257626, 'APO'),
             ('Khufu', 0.989473784873371, 0.468479627898914, 'ATE'),
             ('Verenia', 2.093231870619781, 0.4865133359612604, 'AMO'),
             ('"Don Quixote"', 4.221712367193639, 0.7130894892477316, 'AMO'),
             ('Mera', 1.644476057737928, 0.3201425983025733, 'AMO')]

orbit_class = {'AMO':'Amor', 'APO':'Apollo', 'ATE':'Aten'}

In [72]:
# first we'll build the list using loops.
L = []
for data in Asteroids:
    name, a, e, t = data
    if abs(a - 1) < 0.2 and e < 0.5:
        L.append(name)
print(L)


['Cerberus', 'Aten', 'Bacchus', 'Ra-Shalom', 'Hathor', 'Khufu']

In [73]:
# now with a list comprehension...
L = [name for (name, a, e, t) in Asteroids
     if abs(a - 1) < 0.2 and e < 0.5]
print(L)


['Cerberus', 'Aten', 'Bacchus', 'Ra-Shalom', 'Hathor', 'Khufu']

Here is how we could create a dictionary from the list


In [74]:
D = dict([(name, (a, e, t)) for (name, a, e, t) in Asteroids])
print(D['Eros'])
print(D['Amor'])


(1.457916888347732, 0.2226769029627053, 'AMO')
(1.918987277620309, 0.4354863345648127, 'AMO')
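The same mapping can also be written as a dictionary comprehension, which avoids building the intermediate list of tuples. The sketch below copies three entries from the asteroid list so that it runs on its own; the names `asteroids_sample` and `class_names` are introduced only for this illustration.

```python
# Three entries copied from the Asteroids list above
asteroids_sample = [('Eros', 1.457916888347732, 0.2226769029627053, 'AMO'),
                    ('Icarus', 1.077941311539208, 0.826950446001521, 'APO'),
                    ('Aten', 0.9668828078092987, 0.1827831025175614, 'ATE')]
orbit_class = {'AMO': 'Amor', 'APO': 'Apollo', 'ATE': 'Aten'}

# {key: value for ... in ...} builds the dictionary directly
D = {name: (a, e, t) for (name, a, e, t) in asteroids_sample}
print(D['Eros'])

# The value can be any expression, e.g. a lookup in another dictionary
class_names = {name: orbit_class[t] for (name, a, e, t) in asteroids_sample}
print(class_names['Icarus'])
```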

Breakout #2: Sorting Asteroids

Using the above Asteroid list,

  • print the list sorted in alphabetical order by asteroid name (hint: how does sorting handle a list of tuples?)
  • print the list sorted by semi-major axis
  • print the list sorted by name, but replace the class code with the class name

The output should be formatted like this:

Asteroid name    a (AU)    e        class
-----------------------------------------
Eros             1.4579    0.2227   Amor
Albert           2.6296    0.5518   Amor
.
.
.

Bonus points if you can get the columns to line up nicely!