Sets and Dictionaries

Table of Contents

  • Sets

  • Estimated Time Needed: 20 min

    Sets

    In this lab, we are going to take a look at sets in Python. A set is an unordered collection of unique objects. You can write a set with curly brackets {}. Python will remove duplicate items:

    
    
    In [1]:
    set1={"pop", "rock", "soul", "hard rock", "rock", "R&B", "rock", "disco"}
    set1
    
    
    
    
    Out[1]:
    {'R&B', 'disco', 'hard rock', 'pop', 'rock', 'soul'}
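
    One caveat about curly brackets: an empty pair creates a dictionary, not a set, so an empty set has to be built with set(). The names empty_dict, empty_set, and genres below are illustrative only. A minimal sketch:

```python
# Empty curly brackets create a dictionary, not a set
empty_dict = {}
empty_set = set()

print(type(empty_dict).__name__)  # dict
print(type(empty_set).__name__)   # set

# Duplicates collapse, so these 8 entries produce a set of 6 elements
genres = {"pop", "rock", "soul", "hard rock", "rock", "R&B", "rock", "disco"}
print(len(genres))  # 6
```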

    The process of mapping is illustrated in the figure:

    You can also create a set from a list as follows:

    
    
    In [2]:
    album_list =[ "Michael Jackson", "Thriller", 1982, "00:42:19", \
                  "Pop, Rock, R&B", 46.0, 65, "30-Nov-82", None, 10.0]
    
    album_set = set(album_list)             
    album_set
    
    
    
    
    Out[2]:
    {65,
     '00:42:19',
     None,
     10.0,
     46.0,
     '30-Nov-82',
     'Michael Jackson',
     'Thriller',
     'Pop, Rock, R&B',
     1982}
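
    The list above mixes strings, numbers, and None, and all of them can go into a set because they are hashable. As a small sketch (the name tracks is hypothetical), a tuple can be a set element while a mutable list cannot:

```python
# A tuple is hashable, so it can be stored in a set
tracks = {("Thriller", 1982), ("Back in Black", 1980)}
print(("Thriller", 1982) in tracks)  # True

# A list is mutable and unhashable, so adding one raises TypeError
try:
    tracks.add(["The Dark Side of the Moon", 1973])
    added = True
except TypeError as err:
    added = False
    print("cannot add a list:", err)
```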

    Now let us create a set of genres:

    
    
    In [3]:
    music_genres = set(["pop", "pop", "rock", "folk rock", "hard rock", "soul", \
                        "progressive rock", "soft rock", "R&B", "disco"])
    music_genres
    
    
    
    
    Out[3]:
    {'R&B',
     'disco',
     'folk rock',
     'hard rock',
     'pop',
     'progressive rock',
     'rock',
     'soft rock',
     'soul'}

    Convert the following list to a set ['rap','house','electronic music', 'rap']:

    
    
    In [4]:
    set(['rap','house','electronic music', 'rap'])
    
    
    
    
    Out[4]:
    {'electronic music', 'house', 'rap'}

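    Notice that the duplicates are removed. The notebook displays the elements in sorted order, but a set itself is unordered, so that ordering should not be relied on.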
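
    When a predictable order is needed, one option is to pass the set to sorted(), which returns a new list. A minimal sketch, reusing the list from the exercise above:

```python
genres = set(['rap', 'house', 'electronic music', 'rap'])

# sorted() returns a list; strings are ordered alphabetically
ordered = sorted(genres)
print(ordered)  # ['electronic music', 'house', 'rap']

# Sets have no defined order, so they cannot be indexed
try:
    genres[0]
    indexable = True
except TypeError as err:
    indexable = False
    print("sets cannot be indexed:", err)
```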

    Consider the list A = [1, 2, 2, 1] and the set B = set([1, 2, 2, 1]). Does sum(A) equal sum(B)?

    
    
    In [7]:
    A=[1,2,2,1]
    B=set([1,2,2,1])
    print(sum(A)==sum(B))
    print(sum(A))
    print(sum(B))
    
    
    
    
    False
    6
    3
    
    No. When a list is cast to a set, the new set has no repeated elements, so sum(B) adds each distinct value only once: sum(A) is 6, while sum(B) is 3.
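
    The same property gives a quick way to check whether a list contains duplicates: compare its length with the length of the set built from it. A short sketch (has_duplicates is an illustrative name):

```python
A = [1, 2, 2, 1]
B = set(A)

# A list has duplicates exactly when the set built from it is shorter
has_duplicates = len(B) < len(A)
print(has_duplicates)  # True

# The gap between the sums is the total of the dropped repeats (2 + 1)
print(sum(A) - sum(B))  # 3
```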


    Set Operations

    Let us go over set operations, which can be used to change a set. Consider the set A:

    
    
    In [8]:
    A = set(["Thriller","Back in Black", "AC/DC"] )
    A
    
    
    
    
    Out[8]:
    {'AC/DC', 'Back in Black', 'Thriller'}

    We can add an element to a set using the add() method:

    
    
    In [9]:
    A.add("NSYNC")
    A
    
    
    
    
    Out[9]:
    {'AC/DC', 'Back in Black', 'NSYNC', 'Thriller'}

    If we add the same element twice, nothing will happen as there can be no duplicates in a set:

    
    
    In [10]:
    A.add("NSYNC")
    A
    
    
    
    
    Out[10]:
    {'AC/DC', 'Back in Black', 'NSYNC', 'Thriller'}
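
    One way to confirm that the second add() changed nothing is to compare the size of the set before and after. A small sketch (the size_* names are illustrative):

```python
A = set(["Thriller", "Back in Black", "AC/DC"])

A.add("NSYNC")
size_after_first_add = len(A)

A.add("NSYNC")  # adding the same element again leaves the set unchanged
size_after_second_add = len(A)

print(size_after_first_add, size_after_second_add)  # 4 4
```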

    We can remove an item from a set using the remove() method:

    
    
    In [11]:
    A.remove("NSYNC")
    A
    
    
    
    
    Out[11]:
    {'AC/DC', 'Back in Black', 'Thriller'}
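
    Note that remove() raises a KeyError if the element is not in the set. When the element may be absent, discard() is a gentler alternative that does nothing in that case. A minimal sketch:

```python
A = set(["Thriller", "Back in Black", "AC/DC"])

# remove() raises KeyError for a missing element
try:
    A.remove("NSYNC")
    raised = False
except KeyError:
    raised = True
print(raised)  # True

# discard() silently ignores a missing element
A.discard("NSYNC")
print(len(A))  # 3
```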

    We can verify whether an element is in the set using the in operator:

    
    
    In [12]:
    "AC/DC"  in A
    
    
    
    
    Out[12]:
    True
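
    The negated form, not in, works the same way, and both return a Boolean that can be used directly in a condition. A brief sketch:

```python
A = set(["Thriller", "Back in Black", "AC/DC"])

print("AC/DC" in A)      # True
print("NSYNC" in A)      # False
print("NSYNC" not in A)  # True

# A membership test can guard an operation
if "NSYNC" not in A:
    A.add("NSYNC")
print(len(A))  # 4
```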

    Working with sets

    Remember that with sets you can check the difference between sets, as well as the symmetric difference, intersection, and union:

    Consider the following two sets:

    
    
    In [13]:
    album_set1 = set(["Thriller",'AC/DC', 'Back in Black'] )
    album_set2 = set([ "AC/DC","Back in Black", "The Dark Side of the Moon"] )
    

    Visualizing the sets as two circles

    
    
    In [14]:
    album_set1, album_set2
    
    
    
    
    Out[14]:
    ({'AC/DC', 'Back in Black', 'Thriller'},
     {'AC/DC', 'Back in Black', 'The Dark Side of the Moon'})

    As both sets contain 'AC/DC' and 'Back in Black' we represent these common elements with the intersection of two circles.

    Visualizing common elements with the intersection of two circles

    We can find the common elements of the sets as follows:

    
    
    In [15]:
    album_set_3=album_set1 & album_set2
    album_set_3
    
    
    
    
    Out[15]:
    {'AC/DC', 'Back in Black'}

    We can find all the elements that are only contained in album_set1 using the difference method:

    
    
    In [16]:
    album_set1.difference(album_set2)
    
    
    
    
    Out[16]:
    {'Thriller'}

    We only consider elements in album_set1; all the elements in album_set2, including the intersection, are not included.

    The difference of album_set1 and album_set2

    The difference between album_set2 and album_set1 is given by:

    
    
    In [17]:
    album_set2.difference(album_set1)
    
    
    
    
    Out[17]:
    {'The Dark Side of the Moon'}

    The difference of album_set2 and album_set1

    We can also find the intersection, i.e. the elements in both album_set1 and album_set2, using the intersection() method:

    
    
    In [18]:
    album_set1.intersection(album_set2)
    
    
    
    
    Out[18]:
    {'AC/DC', 'Back in Black'}

    This corresponds to the intersection of the two circles:

    Intersection of sets

    The union corresponds to all the elements in both sets, which is represented by colouring both circles:

    Figure 7: Union of sets

    The union is given by:

    
    
    In [19]:
    album_set1.union(album_set2)
    
    
    
    
    Out[19]:
    {'AC/DC', 'Back in Black', 'The Dark Side of the Moon', 'Thriller'}
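
    The symmetric difference mentioned earlier has not been demonstrated yet: it contains the elements that are in exactly one of the two sets. Each of these methods also has an operator form. A short sketch of both (sym_diff is an illustrative name):

```python
album_set1 = set(["Thriller", "AC/DC", "Back in Black"])
album_set2 = set(["AC/DC", "Back in Black", "The Dark Side of the Moon"])

# Elements that appear in exactly one of the two sets
sym_diff = album_set1.symmetric_difference(album_set2)
print(sorted(sym_diff))  # ['The Dark Side of the Moon', 'Thriller']

# Operator forms give the same results as the methods
print((album_set1 & album_set2) == album_set1.intersection(album_set2))  # True
print((album_set1 | album_set2) == album_set1.union(album_set2))         # True
print((album_set1 - album_set2) == album_set1.difference(album_set2))    # True
print((album_set1 ^ album_set2) == sym_diff)                             # True
```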

    And you can check if a set is a superset or subset of another set, respectively, like this:

    
    
    In [20]:
    album_set1.issuperset(album_set2)
    
    
    
    
    Out[20]:
    False
    
    
    In [21]:
    album_set2.issubset(album_set1)
    
    
    
    
    Out[21]:
    False

    Here is an example where issubset() and issuperset() both return True:

    
    
    In [22]:
    {"Back in Black", "AC/DC"}.issubset(album_set1)
    
    
    
    
    Out[22]:
    True
    
    
    In [23]:
    album_set1.issuperset({"Back in Black", "AC/DC"})
    
    
    
    
    Out[23]:
    True
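
    The comparison operators offer an equivalent spelling: <= tests for a subset and >= for a superset, while < and > test for a proper subset or superset, meaning the two sets must also differ. A minimal sketch (the name pair is illustrative):

```python
album_set1 = set(["Thriller", "AC/DC", "Back in Black"])
pair = {"Back in Black", "AC/DC"}

print(pair <= album_set1)        # True, same as pair.issubset(album_set1)
print(album_set1 >= pair)        # True, same as album_set1.issuperset(pair)

# Every set is a subset of itself, but never a proper subset of itself
print(album_set1 <= album_set1)  # True
print(album_set1 < album_set1)   # False
```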

    Create a new set “album_set3” that is the union of “album_set1” and “album_set2”:

    
    
    In [25]:
    album_set3 = album_set1.union(album_set2)
    album_set3
    
    
    
    
    Out[25]:
    {'AC/DC', 'Back in Black', 'The Dark Side of the Moon', 'Thriller'}

    Find out if "album_set1" is a subset of "album_set3":

    
    
    In [26]:
    album_set1.issubset(album_set3)
    
    
    
    
    Out[26]:
    True

    About the Authors:

    Joseph Santarcangelo has a PhD in Electrical Engineering. His research focused on using machine learning, signal processing, and computer vision to determine how videos impact human cognition. Joseph has been working for IBM since he completed his PhD.


    Copyright © 2017 cognitiveclass.ai. This notebook and its source code are released under the terms of the MIT License.