Collections of Items: Lists, Sets, Tuples, Dictionaries

Lists

Python has many ways to store a collection of similar or dissimilar items.

You have already encountered lists, even though you haven't been formally introduced.

Final Problem


In [ ]:
final = "It is with a heavy heart that I take up my pen to write these the last words in which I shall ever record the singular gifts by which my friend Mr. Sherlock Holmes was distinguished."

In [ ]:
final = final.replace(".", "")
final = final.split(" ")
final

In [ ]:
type(final)

A single list can hold items of different types, including other lists.


In [ ]:
all_info = ["Sherlock", True, 42]
print(all_info)
print(type(all_info))

In [ ]:
more_info = ["Watson", "Hooley", all_info]
print(more_info)
print(type(more_info))

Indexing and Slicing: How to Access Parts of a List


In [ ]:
all_info = ["Sherlock", True, 42]
all_info[0]

In [ ]:
all_info[1]

In [ ]:
more_info = ["Watson", "Hooley", all_info]
more_info[-1]

In [ ]:
more_info[-1][0]

In [ ]:
a_list = [1,2,3,4,5,6,7,8,9,10]
a_list[0]

In [ ]:
a_list[2:4]

In [ ]:
a_list[1:]

In [ ]:
a_list[-3:]

In [ ]:
a_list[3:6]

In [ ]:
a_list[0:3]
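
The slices above all follow one rule: a_list[start:stop] includes the item at index start and stops just before index stop. The short sketch below restates that rule with the same a_list. The step value in the last slice is an extra that the cells above do not use, and the variable names are only for illustration.

```python
a_list = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

# a_list[start:stop] includes index start and excludes index stop
middle = a_list[2:4]        # items at index 2 and 3
first_three = a_list[:3]    # leaving out start means "from the beginning"
last_three = a_list[-3:]    # negative indexes count from the end
every_other = a_list[::2]   # an optional third number sets the step

print(middle)       # [3, 4]
print(first_three)  # [1, 2, 3]
print(last_three)   # [8, 9, 10]
print(every_other)  # [1, 3, 5, 7, 9]
```

Slicing always returns a new list, so a_list itself is unchanged.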

Exercise

Print the second item and the last item of the list nested inside more_info.

  • all_info = ["Sherlock", True, 42]
  • more_info = ["Watson", "Hooley", all_info]

In [ ]:
# Your code here

List Functions

List methods come in handy when you work with larger data sets and need to add to the data you already have.


In [ ]:
incubees = ["Richard", "Gilfoyle", "Dinesh", "Nelson"]

In [ ]:
type(incubees)

In [ ]:
incubees.append("Jian Yang")
incubees

In [ ]:
brogrammers = ["Aly", "Jason"]
incubees.append(brogrammers)
incubees

In [ ]:
incubees = ["Richard", "Gilfoyle", "Dinesh", "Nelson"]

In [ ]:
incubees.extend(brogrammers)

In [ ]:
incubees

Let's look at this again, comparing the two approaches.


In [ ]:
incubees1 = ["Richard", "Gilfoyle", "Dinesh", "Nelson"]
brogrammers1 = ["Aly", "Jason"]
incubees1.append(brogrammers1)
print(incubees1)
print(len(incubees1))

In [ ]:
incubees2 = ["Richard", "Gilfoyle", "Dinesh", "Nelson"]
brogrammers2 = ["Aly", "Jason"]
incubees2.extend(brogrammers2)
print(incubees2)
print(len(incubees2))

In [ ]:
incubees = ["Richard", "Gilfoyle", "Dinesh", "Nelson"]
incubees.sort()
print(incubees)

In [ ]:
incubees.reverse()
print(incubees)
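
Notice that .sort() and .reverse() change the list itself. A common stumbling block is expecting .sort() to hand back the sorted list. The sketch below contrasts it with the built-in sorted() function, which the cells above do not use; the names alphabetical and result are only for illustration.

```python
incubees = ["Richard", "Gilfoyle", "Dinesh", "Nelson"]

# sorted() returns a new list and leaves the original untouched
alphabetical = sorted(incubees)
print(alphabetical)  # ['Dinesh', 'Gilfoyle', 'Nelson', 'Richard']
print(incubees)      # ['Richard', 'Gilfoyle', 'Dinesh', 'Nelson']

# .sort() changes the list in place and returns None
result = incubees.sort()
print(result)        # None
print(incubees)      # ['Dinesh', 'Gilfoyle', 'Nelson', 'Richard']
```

If you need to keep the original order around, sorted() is usually the safer choice.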

In [ ]:
incubees = ["Richard", "Gilfoyle", "Dinesh", "Nelson"]
brogrammers = ["Aly", "Jason"]

print(incubees + brogrammers)

In [ ]:
new_list = incubees + brogrammers
print(new_list * 2)

As a general rule, use append to add a single item and extend to add every item from another list. Try using the .extend method to add a single item, such as "Jian Yang", to our original list of incubees and see what happens.
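
If you try that experiment, here is roughly what to expect. The sketch assumes the original four incubees; the names split_up and whole are made up for this example.

```python
split_up = ["Richard", "Gilfoyle", "Dinesh", "Nelson"]

# extend() loops over its argument, and looping over a string
# yields one character at a time
split_up.extend("Jian Yang")
print(len(split_up))   # 13: the 4 names plus 9 single characters
print(split_up[4:7])   # ['J', 'i', 'a']

# Wrapping the string in a list adds it as one item, just like append()
whole = ["Richard", "Gilfoyle", "Dinesh", "Nelson"]
whole.extend(["Jian Yang"])
print(whole[-1])       # Jian Yang
print(len(whole))      # 5
```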

Finding Items


In [ ]:
incubees = ["Richard", "Gilfoyle", "Dinesh", "Nelson"]
incubees.index("Richard")

In [ ]:
incubees.index("Dinesh")

In [ ]:
# Raises a ValueError, because "Jian Yang" is not in the list
incubees.index("Jian Yang")
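
Looking up an item that is not in the list stops the program with a ValueError. Two common ways to guard against that are sketched below. Both use if and try/except, which this notebook has not formally introduced yet, so treat this as a preview rather than required material. The names missing_position and found_position are only for illustration.

```python
incubees = ["Richard", "Gilfoyle", "Dinesh", "Nelson"]

# Option 1: check membership with "in" before calling index()
name = "Jian Yang"
if name in incubees:
    missing_position = incubees.index(name)
else:
    missing_position = None
print(missing_position)  # None

# Option 2: attempt the lookup and handle the error if it occurs
try:
    found_position = incubees.index("Dinesh")
except ValueError:
    found_position = None
print(found_position)    # 2
```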

In [ ]:
incubees2 = incubees*2

In [ ]:
incubees2.count("Richard")

In [ ]:
incubees = ["Richard", "Gilfoyle", "Dinesh", "Nelson"]
incubees

In [ ]:
incubees.insert(0, "Jian Yang")

In [ ]:
incubees

In [ ]:
incubees.pop(0)
incubees
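
The cell above only displays the list after the pop, so it is easy to miss that .pop() also returns the item it removed. The sketch below shows that, along with .remove(), which deletes by value and is not used elsewhere in this notebook. The list name team is made up for this example.

```python
team = ["Jian Yang", "Richard", "Gilfoyle", "Dinesh", "Nelson"]

removed = team.pop(0)    # removes and returns the item at index 0
print(removed)           # Jian Yang

last = team.pop()        # with no index, pop() takes the last item
print(last)              # Nelson

team.remove("Dinesh")    # remove() deletes by value instead of position
print(team)              # ['Richard', 'Gilfoyle']
```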

Exercise

  • Convert the passage below into a list of words after removing all punctuation
  • Who is mentioned more often: Sherlock or Moriarty?
  • How many times does the author refer to himself? (Hint: Count the uses of the word 'my')

In [ ]:
passage = """It is with a heavy heart that I take up my pen to write these the last words in which I shall ever record the singular gifts by which my friend Mr. Sherlock Holmes was distinguished. In an incoherent and, as I deeply feel, an entirely inadequate fashion, I have endeavored to give some account of my strange experiences in his company from the chance which first brought us together at the period of the “Study in Scarlet,” up to the time of his interference in the matter of the “Naval Treaty”—an interference which had the unquestionable effect of preventing a serious international complication. It was my intention to have stopped there, and to have said nothing of that event which has created a void in my life which the lapse of two years has done little to fill. My hand has been forced, however, by the recent letters in which Colonel James Moriarty defends the memory of his brother, and I have no choice but to lay the facts before the public exactly as they occurred. I alone know the absolute truth of the matter, and I am satisfied that the time has come when no good purpose is to be served by its suppression. As far as I know, there have been only three accounts in the public press: that in the Journal de Geneve on May 6th, 1891, the Reuter’s despatch in the English papers on May 7th, and finally the recent letter to which I have alluded. Of these the first and second were extremely condensed, while the last is, as I shall now show, an absolute perversion of the facts. It lies with me to tell for the first time what really took place between Professor Moriarty and Mr. Sherlock Holmes.
It may be remembered that after my marriage, and my subsequent start in private practice, the very intimate relations which had existed between Holmes and myself became to some extent modified. He still came to me from time to time when he desired a companion in his investigation, but these occasions grew more and more seldom, until I find that in the year 1890 there were only three cases of which I retain any record. During the winter of that year and the early spring of 1891, I saw in the papers that he had been engaged by the French government upon a matter of supreme importance, and I received two notes from Holmes, dated from Narbonne and from Nimes, from which I gathered that his stay in France was likely to be a long one. It was with some surprise, therefore, that I saw him walk into my consulting-room upon the evening of April 24th. It struck me that he was looking even paler and thinner than usual."""

In [ ]:
# Your code below

Numerical Functions with Lists

Lists are useful for more than just processing text.


In [ ]:
list_a = [21,14,7,19,15,47,42,55,97,92]

In [ ]:
len(list_a)

In [ ]:
sum(list_a)

In [ ]:
max(list_a)

In [ ]:
min(list_a)

In [ ]:
# Avoid naming a variable "range": that would hide Python's built-in range()
value_range = max(list_a) - min(list_a)
print(value_range)

Exercise

list_a = [21,14,7,19,15,47,42,55,97,92]

  • Find the average of list_a
  • Find the median of list_a

We will cover the average (or mean) and the median in more detail later, in case you need a refresher.
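
As a quick reminder in the meantime, here is a small sketch of both definitions on a made-up list (scores is hypothetical sample data, not list_a). It also uses Python's standard statistics module, which is one way to check your own answer.

```python
import statistics

scores = [2, 9, 4, 10]  # hypothetical sample data, not list_a

# Mean: the sum of the values divided by how many values there are
mean_by_hand = sum(scores) / len(scores)

# Median: the middle value once the data is sorted; with an even
# number of values it is the average of the two middle ones
ordered = sorted(scores)                        # [2, 4, 9, 10]
median_by_hand = (ordered[1] + ordered[2]) / 2

print(mean_by_hand, statistics.mean(scores))      # 6.25 6.25
print(median_by_hand, statistics.median(scores))  # 6.5 6.5
```

The indexes 1 and 2 are hard-coded for a list of four items. For list_a you will need to work out the middle positions yourself.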


In [ ]:
# Your code below:

Exercise

Let's combine lists, string methods and a bit of logic.

  • Strip the passage of all punctuation
  • How many times does the word 'Titanic' appear?
  • How many times does 'Carpathia' appear?
  • Slightly trickier question: how many words does each paragraph have? (Hint: Split the passage at "\n", then count the words in each paragraph. The blank lines between paragraphs will show up as empty strings.)

In [ ]:
titanic = """CAPE RACE, N.F., April 15. -- The White Star liner Olympic reports by wireless this evening that the Cunarder Carpathia reached, at daybreak this morning, the position from which wireless calls for help were sent out last night by the Titanic after her collision with an iceberg. The Carpathia found only the lifeboats and the wreckage of what had been the biggest steamship afloat.

The Titanic had foundered at about 2:20 A.M., in latitude 41:46 north and longitude 50:14 west. This is about 30 minutes of latitude, or about 34 miles, due south of the position at which she struck the iceberg. All her boats are accounted for and about 655 souls have been saved of the crew and passengers, most of the latter presumably women and children. There were about 1,200 persons aboard the Titanic.

The Leyland liner California is remaining and searching the position of the disaster, while the Carpathia is returning to New York with the survivors.

It can be positively stated that up to 11 o'clock to-night nothing whatever had been received at or heard by the Marconi station here to the effect that the Parisian, Virginian or any other ships had picked up any survivors, other than those picked up by the Carpathia.

First News of the Disaster.

The first news of the disaster to the Titanic was received by the Marconi wireless station here at 10:25 o'clock last night (as told in yesterday's New York Times.) The Titanic was first heard giving the distress signal "C. Q. D.," which was answered by a number of ships, including the Carpathia, the Baltic and the Olympic. The Titanic said she had struck an iceberg and was in immediate need of assistance, giving her position as latitude 41:46 north and longitude 50:14 west.

At 10:55 o'clock the Titanic reported she was sinking by the head, and at 11:25 o'clock the station here established communication with the Allan liner Virginian, from Halifax to Liverpool, and notified her of the Titanic's urgent need of assistance and gave her the Titanic's position.

The Virginian advised the Marconi station almost immediately that she was proceeding toward the scene of the disaster.

At 11:36 o'clock the Titanic informed the Olympic that they were putting the women off in boats and instructed the Olympic to have her boats ready to transfer the passengers.

The Titanic, during all this time, continued to give distress signals and to announce her position.

The wireless operator seemed absolutely cool and clear-headed, his sending throughout being steady and perfectly formed, and the judgment used by him was of the best.

The last signals heard from the Titanic were received at 12:27 A.M., when the Virginian reported having heard a few blurred signals which ended abruptly."""

In [ ]:
# Your code here

Sets

Here's how you make a set.


In [ ]:
set_a = {1,2,3,4,5}
print(set_a)

That's it. As simple as that.

So why do we have sets, as opposed to just using lists? A set holds each value only once, and checking whether a value is in a set is really fast, even when the set is large. Here's how:


In [ ]:
set_a = {1,2,3,4,5}
5 in set_a

In [ ]:
6 in set_a
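
With five items you will not notice any difference in speed. The rough sketch below makes the gap visible by timing the same membership check on a large list and a large set, using the standard timeit module. The size of 100,000 items and the names numbers_list, numbers_set and target are arbitrary choices for this example, and the exact timings will differ from machine to machine.

```python
import timeit

numbers_list = list(range(100_000))
numbers_set = set(numbers_list)
target = 99_999  # the last item, which is the worst case for a list

# Both containers give the same answer...
print(target in numbers_list, target in numbers_set)  # True True

# ...but the list is searched item by item, while the set uses hashing
list_seconds = timeit.timeit(lambda: target in numbers_list, number=100)
set_seconds = timeit.timeit(lambda: target in numbers_set, number=100)
print(f"list: {list_seconds:.5f}s  set: {set_seconds:.5f}s")
```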

But wait, there's more!

  • set_a.add(x): add a value to a set
  • set_a.remove(x): remove a value from a set
  • set_a - set_b: return values in a but not in b.
  • set_a.difference(set_b): same as set_a - set_b
  • set_a | set_b: elements in a or b. Equivalent to set_a.union(set_b)
  • set_a & set_b: elements in both a and b. Equivalent to set_a.intersection(set_b)
  • set_a ^ set_b: elements in a or b but not both. Equivalent to set_a.symmetric_difference(set_b)
  • set_a <= set_b: tests whether every element in set_a is in set_b. Equivalent to set_a.issubset(set_b)

In [ ]:
set_b = {1,2,3}
print(set_a - set_b)
print(set_a.difference(set_b))
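
The cell above only tries out the difference. As a sketch, here are the other operations from the list, using a third set, set_c, that is made up for this example. Sets are unordered, so the order in which Python prints the items may differ from the comments.

```python
set_a = {1, 2, 3, 4, 5}
set_b = {1, 2, 3}
set_c = {4, 5, 6}   # hypothetical extra set for this example

print(set_a | set_c)    # union: 1, 2, 3, 4, 5, 6
print(set_a & set_c)    # intersection: 4, 5
print(set_a ^ set_c)    # symmetric difference: 1, 2, 3, 6
print(set_b <= set_a)   # subset test: True

set_b.add(10)           # set_b now holds 1, 2, 3, 10
set_b.remove(1)         # set_b now holds 2, 3, 10
print(set_b <= set_a)   # False, because 10 is not in set_a
```

Keep in mind that remove() raises a KeyError if the value is not in the set.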

Exercise

An analyst is comparing two portfolios and wants to find out which stocks they share and which are unique to each.

  • pf1 = {"AA", "AAC", "AAP", "ABB", "AC", "ACCO", "AAPL", "AZO", "ZEN", "PX", "GS"}
  • pf2 = {"AA", "GRUB", "AAC", "GWR", "AAP", "C", "AC", "CVS"}

Write code for the following:

  • Find the stocks in either pf1 or pf2, but not in both. (Hint: Symmetric Difference)
  • Find the stocks in both portfolios (Hint: Intersection)
  • Create a third portfolio named pf3, which has pf1 and pf2 combined (Hint: Union)
  • Market conditions have changed, let's drop GRUB and CVS from pf3 and add IBM (Hint: set_a.remove(x) and set_a.add(x))

In [ ]:
pf1 = {"AA", "AAC", "AAP", "ABB", "AC", "ACCO", "AAPL", "AZO", "ZEN", "PX", "GS"}
pf2 = {"AA", "GRUB", "AAC", "GWR", "AAP", "C", "AC", "CVS"}

In [ ]:
# Find the stocks in either pf1 or pf2, but not in both.

In [ ]:
# Find the stocks in both portfolios

In [ ]:
# Create a third portfolio named pf3, which has pf1 and pf2 combined

In [ ]:
# Market conditions have changed, let's drop GRUB and CVS from pf3 and add IBM

Tuples

Pronounced too-puhl

We will keep this section very short for now and revisit tuples later, once we have introduced some more advanced concepts.

For now, remember that a tuple is used when the values are fixed. In Python terms, a tuple is 'immutable': once it has been created, it cannot be changed.

Examples:


In [ ]:
children = ("Meadow", "Anthony")
capos = ("Paulie", "Silvio", "Christopher", "Furio", "Richie")

In [ ]:
len(children)

In [ ]:
len(capos)

In [ ]:
capos

In [ ]:
capos = list(capos)

In [ ]:
capos

In [ ]:
capos.append("Bobby")

In [ ]:
capos

In [ ]:
capos = tuple(capos)

In [ ]:
capos
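
The round trip through list() above is needed because a tuple cannot be changed in place. The sketch below shows what happens if you try, and one alternative that builds a new tuple instead. The name crew is made up for this example, and try/except is used only to keep the error from stopping the cell.

```python
crew = ("Paulie", "Silvio", "Christopher")

# Reading from a tuple works exactly like reading from a list
print(crew[0])    # Paulie
print(crew[-1])   # Christopher

# Changing an item in place is not allowed
try:
    crew[0] = "Bobby"
except TypeError as error:
    message = str(error)
    print("TypeError:", message)

# Tuples have no append(); "adding" means building a new tuple.
# The trailing comma makes ("Bobby",) a one-item tuple.
crew = crew + ("Bobby",)
print(crew)   # ('Paulie', 'Silvio', 'Christopher', 'Bobby')
```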

Tuples and Numbers


In [ ]:
monthly_high = (115.20, 113.60, 117.15, 120.90, 118.25)

In [ ]:
print("Monthly high is", max(monthly_high))
print("Monthly low is", min(monthly_high))
print("Range:", max(monthly_high)-min(monthly_high))

Dictionaries

A dictionary stores key-value pairs: each key maps to a value. Dictionaries are also referred to as dicts, maps, or hashes.


In [ ]:
dict_1 = {"a":1, "b":2, "c":3, "d":4}
print(dict_1)

In [ ]:
fav_book = {
    "title": "Crime and Punishment",
    "author": "Fyodor Dostoyevsky",
    "price": 10.95,
    "pages": 400,
    "source": "Amazon",
    "awesome": True
}

In [ ]:
fav_book["title"]

In [ ]:
fav_book["awesome"]

In [ ]:
# .get() is an alternative to square brackets; it returns None if the key is missing
fav_book.get("price")
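
For a key that exists, .get() and square brackets give the same result. The difference shows up when the key is missing, as sketched below. The dictionary book and the key "isbn" are made up for this example so that fav_book stays untouched.

```python
book = {"title": "Crime and Punishment", "price": 10.95}

# Square brackets raise a KeyError when the key is missing
try:
    book["isbn"]
except KeyError:
    print("No 'isbn' key")

# .get() returns None instead, or a fallback value you supply
missing = book.get("isbn")
fallback = book.get("isbn", "unknown")
price = book.get("price")

print(missing)    # None
print(fallback)   # unknown
print(price)      # 10.95
```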

In [ ]:
fav_book["weight"] = 42
print(fav_book)

In [ ]:
"awesome" in fav_book

In [ ]:
# Returns False: "in" checks the keys of a dictionary, not its values
True in fav_book
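
The reason is that the in operator only looks at a dictionary's keys. To search the values, ask for them explicitly with .values(). A minimal sketch, using a made-up dictionary named book:

```python
book = {"title": "Crime and Punishment", "awesome": True}

key_found = "awesome" in book          # "awesome" is a key
value_as_key = True in book            # True is a value, not a key
value_found = True in book.values()    # search the values explicitly

print(key_found)      # True
print(value_as_key)   # False
print(value_found)    # True
```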

Common Dictionary Operations


In [ ]:
dict_1 = {"a":1, "b":2, "c":3, "d":4}
print(dict_1)

In [ ]:
dict_1.keys()

In [ ]:
dict_1.values()

In [ ]:
dict_1.pop("d")
print(dict_1)

In [ ]:
fav_book.pop("awesome")
fav_book

Exercise

Find the keys that the two dictionaries have in common.


In [ ]:
# Note: "c" appears twice in a_dict; Python keeps only the last value (4)
a_dict = {"a":"e", "b":5, "c":3, "c": 4}
b_dict = {"c":5, "d":6}

In [ ]:
a_set = set(a_dict)
b_set = set(b_dict)

In [ ]:
a_set.intersection(b_set)
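
Converting to sets first works well. As an alternative sketch, the result of .keys() already supports set operators such as & and -, so the conversion step can be skipped. The names first, second, common and only_in_first are made up for this example.

```python
first = {"a": "e", "b": 5, "c": 4}
second = {"c": 5, "d": 6}

# Keys present in both dictionaries
common = first.keys() & second.keys()
print(common)                  # {'c'}

# Keys present in first but not in second
only_in_first = first.keys() - second.keys()
print(sorted(only_in_first))   # ['a', 'b']
```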

Over the next two videos, we will cover loops and functions - two really powerful concepts to supercharge your programming.