Today we will be learning about
Recap from Week 3
day = input('Enter the day of the month you were born')
month = input('Enter the month of the year you were born')
year = input('Enter the year you were born')
birthday = '{} / {} / {}'.format(day,month,year)
print('Your birthday is \n{}'.format(birthday))
file = open('test_data.txt','r')
for line in file:
    print(line)
list1 = [1,2,3,4,5,6]
for item in reversed(list1):
    print(item)
squares = [x**2 for x in range(1,11)]
We met lists in lesson 3 and are briefly going to go over them again to get in the mindset of looking at data structures in Python. A data structure is a type of object in Python which stores information in some organised format. The format of this organisation dictates what the data structure is. There are buckets of different kinds of data structures, but when writing Python you will primarily be using lists, dictionaries and tuples.
In [ ]:
# Remember we declare an empty list like so
my_list = []
# Add new elements to the end of an existing list
my_list.append(1)
my_list.append('Hello')
my_list.append(0.05)
# Delete specific elements
del my_list[-1] # Remove by index
my_list.remove('Hello') # Remove by value
# Replace elements
my_list[0] = 2
In [ ]:
# Or we can fill in all, or part of, a list when we declare it
shopping_list = ['bread','toothpaste','blueberries','milk']
# Printing each element of a list
for item in shopping_list:
    print(item)
# Find the length of a list
print('The shopping list has {} items'.format(len(shopping_list)))
In [ ]:
# We can test for membership in a list in the same fashion as a string
if 'milk' in shopping_list:
    print('You can\'t drink milk!')
    shopping_list.remove('milk')
if 'chocolate' not in shopping_list:
    print('You forgot chocolate!')
    shopping_list.append('chocolate')
print(shopping_list)
Lists always have their order preserved in Python, so you can guarantee that shopping_list[0] will have the value "bread".
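As a small sketch of what preserved order gives you, the cell below indexes and slices a list. The name `fresh_list` is made up for this illustration so that the earlier cells are left untouched.

```python
# A fresh list (hypothetical name) so earlier cells are not affected
fresh_list = ['bread', 'toothpaste', 'blueberries', 'milk']

first_item = fresh_list[0]    # index 0 is the first element
last_item = fresh_list[-1]    # negative indices count from the end
first_two = fresh_list[0:2]   # a slice keeps the original order

print(first_item)
print(last_item)
print(first_two)

# Appending adds to the end, so existing elements keep their positions
fresh_list.append('chocolate')
print(fresh_list.index('bread'))
```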
A tuple is another of the standard Python data structures. Tuples behave in a similar way to lists but have one key difference: they are immutable. Let's look at what this means.
A more detailed intro to Tuples can be found here
In [ ]:
# A tuple is declared with the curved brackets () instead of the [] for a list
my_tuple = (1,2,'cat','dog')
# But since a tuple is immutable the next line will not run (it raises a TypeError)
my_tuple[0] = 4
So what can we learn from this? Once you declare a tuple, the object cannot be changed.
For this reason, Python can store a tuple more compactly than a list of the same items, so tuples can be slightly more memory-efficient and quicker to create in your code.
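To make this concrete, here is a minimal sketch that catches the error instead of letting the cell fail, builds a new tuple as the workaround, and compares memory use with `sys.getsizeof`. The names `point` and `new_point` are made up for the example, and the size comparison assumes the standard CPython interpreter.

```python
import sys

point = (1, 2, 'cat', 'dog')

# Assigning to an element of a tuple raises a TypeError
try:
    point[0] = 4
    assignment_worked = True
except TypeError as error:
    assignment_worked = False
    print('Could not change the tuple: {}'.format(error))

# To "change" a tuple we build a new one instead
new_point = (4,) + point[1:]
print(new_point)

# In CPython a tuple usually takes up less memory than the equivalent list
tuple_size = sys.getsizeof((1, 2, 3))
list_size = sys.getsizeof([1, 2, 3])
print('tuple: {} bytes, list: {} bytes'.format(tuple_size, list_size))
```

The exact byte counts depend on your Python version, so treat them as a rough guide rather than fixed numbers.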
In [ ]:
# A tuple might be immutable but can contain mutable objects
my_list_tuple = ([1,2,3],[4,5,6])
# This won't work
# my_list_tuple[0] = [3,2,1]
# But this will!
my_list_tuple[0][0:3] = [3,2,1]
print(my_list_tuple)
In [ ]:
# You can add tuples together (this builds a new tuple each time)
t1 = (1,2,3)
t1 += (4,5,6)
print(t1)
t2 = (10,20,30)
t3 = (40,50,60)
print(t2+t3)
In [ ]:
# Use index() and count() to look at a tuple
t1 = (1,2,3,1,1,2)
print(t1.index(2)) # Returns the first index of 2
print(t1.count(1)) # Returns how many 1's are in the tuple
In [ ]:
# You can use tuples for multiple assignments and for multiple return from functions
(x,y,z) = (1,2,3)
print(x)
# This is a basic function doing multiple return in Python
def norm_and_square(a):
    return a,a**2
(a,b) = norm_and_square(4)
print(a)
print(b)
In [ ]:
# Swap items using tuples
x = 10
y = 20
print('x is {} and y is {}'.format(x,y))
(x,y) = (y,x)
print('x is {} and y is {}'.format(x,y))
In [ ]:
# TO DO
def my_swap_function(a,b):
    # write here!
    return b,a
# END TO DO
In [ ]:
a = 1
b = 2
x = my_swap_function(a,b)
print(x)
Dictionaries are perhaps the most useful, and the hardest to grasp, of the basic data structures in Python. Dictionaries are not iterated over or indexed in the same way as lists and tuples, so using them requires a different approach.
Dictionaries are sometimes called hash maps, hash tables or maps in other programming languages. You can think of a dictionary as being like a physical dictionary: it is a collection of pairs, each made of a key (the word) and a value (the definition).
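Each key is unique and has an associated value. The key functions as the index for the value, and it can be any hashable object, such as a string, a number or a tuple. In contrast to alphabetical dictionaries, a Python dictionary is not sorted: from Python 3.7 onwards it keeps its keys in the order they were inserted, while earlier versions gave no guarantee about order.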
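A short sketch of these ideas, using a made-up dictionary called `stock`: assigning to an existing key replaces its value, keys do not have to be strings, and looping over a dictionary hands you its keys.

```python
# Hypothetical example data, not part of the lesson's own cells
stock = {}
stock['apples'] = 3
stock['pears'] = 5

# Keys are unique: assigning to an existing key replaces its value
stock['apples'] = 10
print(len(stock))
print(stock['apples'])

# Keys do not have to be strings
stock[42] = 'a number as a key'
stock[('row', 1)] = 'a tuple as a key'

# Looping over a dictionary gives you its keys, not its values
for key in stock:
    print('{} -> {}'.format(key, stock[key]))
```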
In [ ]:
# Declare a dictionary using the {} brackets or the dict() constructor
my_dict = {}
# Add new items to the dictionary by stating the key as the index and the value
my_dict['bananas'] = 'this is a fruit and a berry'
my_dict['apples'] = 'this is a fruit'
my_dict['avocados'] = 'this is a berry'
print(my_dict)
In [ ]:
# So now we can use the key to get a value in the dictionary
print(my_dict['bananas'])
# But this won't work if we haven't added an item to the dict
#print(my_dict['cherries'])
# We can fix this line using the get(key, default) method. This is safer as you won't get a KeyError!
print(my_dict.get('cherries','Not found :('))
In [ ]:
# If you are given a dictionary data file you know nothing about you can inspect it like so
# Get all the keys of a dictionary
print(my_dict.keys())
# Get all the values from a dictionary
print(my_dict.values())
# Of course you could print the whole dictionary, but it might be huge! These methods break
# the dict down, but the downside is that the keys and values are no longer shown side by side.
# The items() method returns them as (key, value) pairs instead.
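One way to keep each key with its value is the `items()` method. The sketch below uses a made-up dictionary, `fruit_facts`, as a stand-in for `my_dict` so that it runs on its own.

```python
# Hypothetical stand-in for my_dict
fruit_facts = {
    'bananas': 'this is a fruit and a berry',
    'apples': 'this is a fruit',
    'avocados': 'this is a berry',
}

# items() yields (key, value) tuples, so each key stays with its value
for name, fact in fruit_facts.items():
    print('{}: {}'.format(name, fact))

# The pairs can also be collected into a list of tuples
pairs = list(fruit_facts.items())
print(pairs)
```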
In [ ]:
# Test for membership in the keys using the in operator
if 'avocados' in my_dict:
    print(my_dict['avocados'])
In [ ]:
# Dictionary values can also be lists or other data structures
my_lists = {}
my_lists['shopping list'] = shopping_list
my_lists['holidays'] = ['Munich','Naples','New York','Tokyo','San Francisco','Los Angeles']
# Now I have a dictionary with the list names as keys and the lists as values
print(my_lists)
Wrapping everything up, we can create a list of dictionaries with multiple fields and iterate over each dictionary.
In [ ]:
# Declare a list
europe = []
# Create dicts and add to lists
germany = {"name": "Germany", "population": 81000000,"speak_german":True}
europe.append(germany)
luxembourg = {"name": "Luxembourg", "population": 512000,"speak_german":True}
europe.append(luxembourg)
uk = {"name":"United Kingdom","population":64100000,"speak_german":False}
europe.append(uk)
print(europe)
print()
for country in europe:
    for key, value in country.items():
        print('{}\t{}'.format(key,value))
    print()
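The same pattern extends to picking out individual fields. The sketch below rebuilds the list under a made-up name, `countries`, so it runs on its own, then collects the names of the German-speaking countries and totals the population. It is one possible approach, not the only one.

```python
# Hypothetical copy of the europe list from the cell above
countries = [
    {"name": "Germany", "population": 81000000, "speak_german": True},
    {"name": "Luxembourg", "population": 512000, "speak_german": True},
    {"name": "United Kingdom", "population": 64100000, "speak_german": False},
]

german_speaking = []
total_population = 0
for country in countries:
    # Each country is a dict, so we look fields up by key
    total_population += country["population"]
    if country["speak_german"]:
        german_speaking.append(country["name"])

print(german_speaking)
print('Total population: {}'.format(total_population))
```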
In [ ]:
# TO DO - You might need more than just a for loop!
# END TO DO
We've seen some of the standard data structures in Python. We will now briefly look at Pandas, a powerful data manipulation library and a sensible next step for organising your data when you need something more complex than the standard Python data structures.
The core of Pandas is the DataFrame, which will look familiar if you have worked with R before. This organises data in a table format and gives you spreadsheet-like handling of your information. Using Pandas can make handling data easier, and many libraries for plotting data (such as Seaborn) can handle a Pandas DataFrame much more easily than a list as input.
Note: Pandas uses NumPy under the hood, another package for simplifying numerical operations and working with arrays. We will look at NumPy and Pandas together in two lessons' time.
In [ ]:
# We import the Pandas packages using the import statement we've seen before
import pandas as pd
In [ ]:
# To create a Pandas DataFrame from a list of dictionaries we pass it to the DataFrame constructor
europe_df = pd.DataFrame(europe)
print(type(europe_df))
In [ ]:
# Running this cell as is provides the fancy formatting of Pandas which can prove useful.
europe_df
Run the previous block now. Here we can see how our list of dictionaries was converted to a DataFrame. Each dictionary became a row, each key became a column and the values became the data inside the object.
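As a rough sketch of what the table format makes possible, the cell below builds a DataFrame from a made-up copy of the list (`countries`, so the example is self-contained) and then inspects its shape, picks out one column and filters the rows.

```python
import pandas as pd

# Hypothetical copy of the europe list used earlier
countries = [
    {"name": "Germany", "population": 81000000, "speak_german": True},
    {"name": "Luxembourg", "population": 512000, "speak_german": True},
    {"name": "United Kingdom", "population": 64100000, "speak_german": False},
]
countries_df = pd.DataFrame(countries)

# Each dictionary is a row and each key is a column
print(countries_df.shape)
print(list(countries_df.columns))

# Select a single column by name and add up its values
total_population = countries_df['population'].sum()
print(total_population)

# Keep only the rows where the speak_german column is True
german_df = countries_df[countries_df['speak_german']]
german_names = german_df['name'].tolist()
print(german_names)
```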
That's all on Pandas for now! For a quick tutorial on using Pandas you can check this link out. We'll come back to this in the future; we just have to look at Object Oriented Programming and Classes first!