Sets in Python

This is a companion notebook to the Data Science Solutions book.

According to the official Python documentation:

A set is an unordered collection with no duplicate elements. Basic uses include membership testing and eliminating duplicate entries. Set objects also support mathematical operations like union, intersection, difference, and symmetric difference.

Learning goals in this notebook:

  • Defining a set.
  • Membership testing in a set.
  • Set operations - union, difference, intersection, symmetric difference.

In [3]:
# Define a simple set using { } curly braces and comma-separated values (elements must be hashable; duplicates are dropped)
simple_set = {'Red', 'Green', 'Blue'}
simple_set


Out[3]:
{'Blue', 'Green', 'Red'}
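
The properties of a set literal can be checked directly. The following is a minimal sketch; the names colors, mixed, and bad are made up for illustration and do not appear elsewhere in this notebook.

```python
# Duplicates in a set literal are silently dropped
colors = {'Red', 'Green', 'Blue', 'Red'}
print(len(colors))  # 3

# The order of elements is not part of a set's identity
print(colors == {'Blue', 'Red', 'Green'})  # True

# Elements may be of different types as long as each one is hashable
mixed = {1, 'one', (1, 2)}
print(len(mixed))  # 3

# A list is mutable and therefore unhashable, so it cannot be a set element
try:
    bad = {[1, 2]}
except TypeError as err:
    print('TypeError:', err)
```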

In [4]:
# Create an empty set (use set(), because {} creates an empty dict)
empty_set = set()
empty_set


Out[4]:
set()
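
One pitfall is worth a quick check here: empty curly braces create a dictionary, not a set, which is why the cell above calls set(). A small sketch, with empty_braces as an illustrative name:

```python
# Empty curly braces produce a dict, so set() is the way to get an empty set
empty_braces = {}
empty_set = set()

print(type(empty_braces).__name__)  # dict
print(type(empty_set).__name__)     # set
print(len(empty_set))               # 0
```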

In [7]:
# Create a set from a string; a string is a sequence of characters, so each distinct character becomes an element
alphabet_set = set('a quick brown fox jumped over a lazy dog')
alphabet_set


Out[7]:
{' ',
 'a',
 'b',
 'c',
 'd',
 'e',
 'f',
 'g',
 'i',
 'j',
 'k',
 'l',
 'm',
 'n',
 'o',
 'p',
 'q',
 'r',
 'u',
 'v',
 'w',
 'x',
 'y',
 'z'}

In [1]:
# Define a set using a list, eliminating duplicates in the process
popular_games_list = ['Call of Duty', 'Final Fantasy', 'Battlefield', 'Witcher', 'Final Fantasy', 'Witcher']
popular_games = set(popular_games_list)
popular_games


Out[1]:
{'Battlefield', 'Call of Duty', 'Final Fantasy', 'Witcher'}
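
Converting a list to a set removes duplicates but does not keep the original order. If the order matters, one possible alternative (not a set operation, shown only for contrast) is dict.fromkeys, which keeps the first occurrence of each item in insertion order on Python 3.7 and later. The name unique_in_order is illustrative.

```python
popular_games_list = ['Call of Duty', 'Final Fantasy', 'Battlefield', 'Witcher', 'Final Fantasy', 'Witcher']

# A set drops duplicates, but its iteration order is not guaranteed
popular_games = set(popular_games_list)
print(len(popular_games))  # 4

# dict keys are unique and keep insertion order, so this deduplicates in order
unique_in_order = list(dict.fromkeys(popular_games_list))
print(unique_in_order)
# ['Call of Duty', 'Final Fantasy', 'Battlefield', 'Witcher']
```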

In [2]:
# Define a set by passing a list literal directly to set()
owned_games = set(['Destiny', 'Battlefield', 'Fallout'])
owned_games


Out[2]:
{'Battlefield', 'Destiny', 'Fallout'}

In [288]:
# Membership test
print('Fallout' in owned_games)
print('Fallout' in popular_games)


True
False
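
Membership tests also work with not in, and they are commonly used inside loops or conditions. The sketch below redefines the two sets so that it runs on its own; wishlist and its 'Halo' entry are hypothetical values added for illustration.

```python
popular_games = {'Call of Duty', 'Final Fantasy', 'Battlefield', 'Witcher'}
owned_games = {'Destiny', 'Battlefield', 'Fallout'}

# The negated form of the membership test
print('Fallout' not in popular_games)  # True

# Check each item of a hypothetical wishlist against the owned set
wishlist = ['Witcher', 'Fallout', 'Halo']
for game in wishlist:
    status = 'owned' if game in owned_games else 'not owned'
    print(game, '->', status)
```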

In [289]:
# Set difference
popular_not_owned = popular_games - owned_games
popular_not_owned


Out[289]:
{'Call of Duty', 'Final Fantasy', 'Witcher'}
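
Set difference depends on the order of the operands. As a quick sketch, swapping them gives the games that are owned but not popular; owned_not_popular is an illustrative name.

```python
popular_games = {'Call of Duty', 'Final Fantasy', 'Battlefield', 'Witcher'}
owned_games = {'Destiny', 'Battlefield', 'Fallout'}

# Elements of owned_games that are not in popular_games
owned_not_popular = owned_games - popular_games
print(owned_not_popular == {'Destiny', 'Fallout'})  # True

# The two differences are not the same set
print(owned_not_popular == popular_games - owned_games)  # False
```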

In [290]:
# Set intersection
popular_owned = popular_games & owned_games
popular_owned


Out[290]:
{'Battlefield'}

In [291]:
# Set symmetric difference
popular_owned_unique = popular_games ^ owned_games
popular_owned_unique


Out[291]:
{'Call of Duty', 'Destiny', 'Fallout', 'Final Fantasy', 'Witcher'}
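
The symmetric difference can also be expressed through the other operations, which may help when reading the result above. A minimal sketch that verifies two equivalent forms, with sym as an illustrative name:

```python
popular_games = {'Call of Duty', 'Final Fantasy', 'Battlefield', 'Witcher'}
owned_games = {'Destiny', 'Battlefield', 'Fallout'}

sym = popular_games ^ owned_games

# Everything in either set, minus what the two sets share
print(sym == (popular_games | owned_games) - (popular_games & owned_games))  # True

# The union of the two one-sided differences
print(sym == (popular_games - owned_games) | (owned_games - popular_games))  # True
```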

In [292]:
# Set union
all_games = popular_games | owned_games
all_games


Out[292]:
{'Battlefield',
 'Call of Duty',
 'Destiny',
 'Fallout',
 'Final Fantasy',
 'Witcher'}
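
Each operator used above has a named method counterpart on the set type. The sketch below compares the two forms; it also shows one practical difference, namely that the methods accept any iterable while the operators require both operands to be sets. The 'Halo' entry is a made-up value for illustration.

```python
popular_games = {'Call of Duty', 'Final Fantasy', 'Battlefield', 'Witcher'}
owned_games = {'Destiny', 'Battlefield', 'Fallout'}

# Each operator has a named method that returns the same result
print(popular_games.union(owned_games) == popular_games | owned_games)                 # True
print(popular_games.intersection(owned_games) == popular_games & owned_games)          # True
print(popular_games.difference(owned_games) == popular_games - owned_games)            # True
print(popular_games.symmetric_difference(owned_games) == popular_games ^ owned_games)  # True

# Methods accept any iterable, such as a list
with_extra = owned_games.union(['Halo'])
print(with_extra == {'Destiny', 'Battlefield', 'Fallout', 'Halo'})  # True

# Operators require both operands to be sets
try:
    owned_games | ['Halo']
except TypeError:
    print('The | operator does not accept a list')
```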