In [ ]:
l = [1, 2, 3, 2, 3] # list of 5 values
s = set(l) # set of 3 unique values
print(s)
e = set() # empty set
print(e)
Sets are similar to lists and tuples, and you can use many of the same operators and functions. However, they are inherently unordered, so they don't have an index, and they can only contain unique values, so adding a value that is already in the set has no effect.
In [ ]:
s = set([1, 2, 3, 2, 3])
print(s)
print("number in set:", len(s))
s.add(4)
print(s)
s.add(3)
print(s)
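Because a set has no index, asking for an element by position fails, but membership tests with in still work. The short sketch below illustrates this; it uses try/except only to catch the error so the cell keeps running, and sorted() to get an ordered list when you need one.

```python
s = set([1, 2, 3, 2, 3])
print("Is 3 in the set?", 3 in s)

try:
    s[0]  # sets are unordered, so indexing is not supported
except TypeError:
    print("sets do not support indexing")

ordered = sorted(s)  # sorted() returns a new list in ascending order
print(ordered)
```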
You can remove a specific element from a set with the remove method.
In [ ]:
s = set([1, 2, 3, 2, 3])
print(s)
s.remove(3)
print(s)
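Note that remove raises a KeyError if the value is not in the set. As a small sketch of the alternative, the discard method removes a value if it is present and silently does nothing otherwise:

```python
s = set([1, 2, 3, 2, 3])
s.discard(3)   # removes 3, just like remove
s.discard(10)  # 10 is not in the set: no error, no effect
print(s)

try:
    s.remove(10)  # remove raises KeyError for a missing value
except KeyError:
    print("10 is not in the set")
```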
You can do all the expected logical operations on sets, such as taking the union or intersection of two sets with the | and & operators.
In [ ]:
s1 = set([2, 4, 6, 8, 10])
s2 = set([4, 5, 6, 7])
print("Union:", s1 | s2)
print("Intersection:", s1 & s2)
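Two further set operations follow the same pattern and may be useful alongside union and intersection: the - operator gives the difference and the ^ operator gives the symmetric difference. A brief sketch using the same two sets:

```python
s1 = set([2, 4, 6, 8, 10])
s2 = set([4, 5, 6, 7])
difference = s1 - s2            # values in s1 that are not in s2
symmetric_difference = s1 ^ s2  # values in exactly one of the two sets
print("Difference:", difference)
print("Symmetric difference:", symmetric_difference)
```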
Lists are useful in many contexts, but often we have data that has no inherent order and that we want to access by a meaningful name rather than an index. For example, as a result of some experiment we may have a set of genes and corresponding expression values. We could put the expression values in a list, but then we'd have to remember which index in the list corresponds to which gene, and this would quickly get complicated.
For these situations a dictionary is a very useful data structure.
Dictionaries:
In [ ]:
dna = {"A": "Adenine", "C": "Cytosine", "G": "Guanine", "T": "Thymine"}
print(dna)
You can access a value in a dictionary by putting its key inside square brackets.
In [ ]:
dna = {"A": "Adenine", "C": "Cytosine", "G": "Guanine", "T": "Thymine"}
print("A represents", dna["A"])
print("G represents", dna["G"])
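To connect this back to the gene expression scenario, the sketch below compares looking up a value through two parallel lists with looking it up in a dictionary. The gene names and expression values are made up purely for illustration.

```python
# Hypothetical gene names and expression values, for illustration only
genes = ["geneA", "geneB", "geneC"]
values = [2.5, 0.3, 7.1]

# With two lists we must first find the position of the gene
position = genes.index("geneB")
print("geneB from lists:", values[position])

# With a dictionary we look the value up directly by name
expression = {"geneA": 2.5, "geneB": 0.3, "geneC": 7.1}
print("geneB from dictionary:", expression["geneB"])
```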
A KeyError is raised if a key is absent from the dictionary:
In [ ]:
dna = {"A": "Adenine", "C": "Cytosine", "G": "Guanine", "T": "Thymine"}
print("What about N?", dna["N"])
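If you want the program to carry on after a missing key, one option is to catch the KeyError. This is only a sketch of the idea; the variable names base and name are chosen here for illustration.

```python
dna = {"A": "Adenine", "C": "Cytosine", "G": "Guanine", "T": "Thymine"}
base = "N"
try:
    name = dna[base]
except KeyError:
    # Reached only when base is not a key in the dictionary
    name = None
    print(base, "is not a key in the dictionary")
print("Result:", name)
```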
You can access values safely with the get method, which gives back None if the key is absent. You can also supply a default value to be returned instead.
In [ ]:
dna = {"A": "Adenine", "C": "Cytosine", "G": "Guanine", "T": "Thymine"}
print("What about N?", dna.get("N"))
print("With a default value:", dna.get("N", "unknown"))
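Two details of get are worth checking for yourself: the default is ignored when the key exists, and get never adds the missing key to the dictionary. A small sketch:

```python
dna = {"A": "Adenine", "C": "Cytosine", "G": "Guanine", "T": "Thymine"}
present = dna.get("A", "unknown")  # key exists, so the default is ignored
absent = dna.get("N", "unknown")   # key is missing, so the default is returned
print(present)
print(absent)
print("Is N now in the dictionary?", "N" in dna)  # get did not insert it
```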
You can check whether a key is in a dictionary with the in operator, and you can negate the check with not.
In [ ]:
dna = {"A": "Adenine", "C": "Cytosine", "G": "Guanine", "T": "Thymine"}
"T" in dna
In [ ]:
dna = {"A": "Adenine", "C": "Cytosine", "G": "Guanine", "T": "Thymine"}
"Y" not in dna
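Keep in mind that in tests the keys of a dictionary, not its values. If you need to check for a value, you can test against .values() instead, as in this short sketch:

```python
dna = {"A": "Adenine", "C": "Cytosine", "G": "Guanine", "T": "Thymine"}
key_check = "A" in dna                     # "A" is a key
value_as_key = "Adenine" in dna            # "Adenine" is a value, not a key
value_check = "Adenine" in dna.values()    # search the values explicitly
print(key_check, value_as_key, value_check)
```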
The len() function gives back the number of (key, value) pairs in the dictionary:
In [ ]:
dna = {"A": "Adenine", "C": "Cytosine", "G": "Guanine", "T": "Thymine"}
print(len(dna))
You can introduce new entries in the dictionary by assigning a value with a new key:
In [ ]:
dna = {"A": "Adenine", "C": "Cytosine", "G": "Guanine", "T": "Thymine"}
dna['Y'] = 'Pyrimidine'
print(dna)
You can change the value for an existing key by reassigning it:
In [ ]:
dna = {'A': 'Adenine', 'C': 'Cytosine', 'T': 'Thymine', 'G': 'Guanine', 'Y': 'Pyrimidine'}
dna['Y'] = 'Cytosine or Thymine'
print(dna)
You can delete entries from the dictionary:
In [ ]:
dna = {'A': 'Adenine', 'C': 'Cytosine', 'T': 'Thymine', 'G': 'Guanine', 'Y': 'Pyrimidine'}
del dna['Y']
print(dna)
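Like a failed lookup, del raises a KeyError if the key is absent. As an alternative sketch, the pop method removes an entry and hands back its value, and it accepts an optional default to return when the key is missing:

```python
dna = {'A': 'Adenine', 'C': 'Cytosine', 'T': 'Thymine', 'G': 'Guanine', 'Y': 'Pyrimidine'}
removed = dna.pop('Y')        # removes the entry and returns its value
print("Removed:", removed)
missing = dna.pop('N', None)  # 'N' is absent, so the default is returned
print("Missing:", missing)
print(dna)
```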
You can get a list of all the keys using the built-in .keys() method. In Python 3.7 and later the keys come back in insertion order; in older versions the order was not guaranteed.
In [ ]:
dna = {'A': 'Adenine', 'C': 'Cytosine', 'T': 'Thymine', 'G': 'Guanine', 'Y': 'Pyrimidine'}
print(list(dna.keys()))
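In Python 3.7 and later, a dictionary returns its keys in the order they were inserted. If you want a different order, such as alphabetical, you can pass the keys to sorted(). The sketch below deliberately builds the dictionary out of alphabetical order to make the difference visible.

```python
dna = {'T': 'Thymine', 'A': 'Adenine', 'G': 'Guanine', 'C': 'Cytosine'}
inserted_order = list(dna.keys())  # order in which the keys were added
sorted_order = sorted(dna.keys())  # new list in alphabetical order
print(inserted_order)
print(sorted_order)
```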
Similarly, the .values() method gives the values, which you can convert to a list:
In [ ]:
dna = {'A': 'Adenine', 'C': 'Cytosine', 'T': 'Thymine', 'G': 'Guanine', 'Y': 'Pyrimidine'}
print(list(dna.values()))
The .items() method gives the (key, value) pairs, shown here as a list of tuples:
In [ ]:
dna = {'A': 'Adenine', 'C': 'Cytosine', 'T': 'Thymine', 'G': 'Guanine', 'Y': 'Pyrimidine'}
print(list(dna.items()))
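Each element of that list is an ordinary two-item tuple, so it can be unpacked into two variables, and the whole list can be turned back into a dictionary with dict(). A minimal sketch, assuming Python 3.7 or later so that the first pair is the first one inserted:

```python
dna = {'A': 'Adenine', 'C': 'Cytosine', 'T': 'Thymine', 'G': 'Guanine', 'Y': 'Pyrimidine'}
pairs = list(dna.items())
key, value = pairs[0]  # unpack the first (key, value) tuple
print(key, "represents", value)

rebuilt = dict(pairs)  # a list of (key, value) tuples converts back to a dictionary
print(rebuilt == dna)
```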
Go to our next notebook: python_basic_2_intro