This is the first notebook in the Learn Pandas track.
These exercises assume some prior experience with Pandas.
Each page has a list of relevant resources that you can use for reference, and the top item in each list has been chosen specifically to help you with the exercises on that page.
The first step in most data science projects is reading in the data.
In this section, you will be using pandas to create Series and DataFrame objects, both by hand and by reading data files.
The Relevant Resources, as promised:
In [1]:
import pandas as pd
pd.set_option('display.max_rows', 5)
from learntools.advanced_pandas.creating_reading_writing import *
You can check your answers in each of the exercises that follow using the check_qN function provided in the code cell above, replacing N with the number of the exercise.
For example, here's how you would check an incorrect answer to exercise 1:
In [2]:
check_q1(pd.DataFrame())
Out[2]:
A correct answer would return True.
If you capitulate, run print(answer_qN()).
Exercise 1
Create a DataFrame:
In [3]:
data = {'Apples': [30], 'Bananas': [21]}
pd.DataFrame(data=data)
Out[3]:
In [4]:
df2 = pd.DataFrame([(30, 21)], columns=['Apples', 'Bananas'])
df2
Out[4]:
In [5]:
answer_q1()
Exercise 2
Create a 2x2 DataFrame:
In [6]:
df2x2 = pd.DataFrame([[35, 21], [41, 34]], index=['2017 Sales', '2018 Sales'], columns=['Apples', 'Bananas'])
df2x2
Out[6]:
In [7]:
answer_q2()
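As a side note on construction, the same 2x2 frame can be built either row by row (as in the cell above) or column by column from a dictionary. The sketch below is only an illustration of that equivalence; the variable names `from_rows`, `from_columns`, and `same` are made up for this example.

```python
import pandas as pd

# Row-oriented construction, as in the cell above:
# each inner list is one row, and the labels are passed separately.
from_rows = pd.DataFrame(
    [[35, 21], [41, 34]],
    index=['2017 Sales', '2018 Sales'],
    columns=['Apples', 'Bananas'],
)

# Column-oriented construction: each dictionary key becomes a column,
# and each list holds that column's values from top to bottom.
from_columns = pd.DataFrame(
    {'Apples': [35, 41], 'Bananas': [21, 34]},
    index=['2017 Sales', '2018 Sales'],
)

# DataFrame.equals compares values, labels, and dtypes.
same = from_rows.equals(from_columns)
```

Which form reads better usually depends on whether the data arrives as records (rows) or as named lists (columns).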
Exercise 3
Create a Series:
In [8]:
pd.Series({'Flour': '4 cups', 'Milk': '1 cup', 'Eggs': '2 large', 'Spam': '1 can'}, name='Dinner')
Out[8]:
In [9]:
answer_q3()
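A Series can also be assembled from a list of values plus a separate list of index labels, which may be convenient when the two arrive separately. This is a minimal sketch of that alternative; the names `quantities`, `items`, and `dinner` are illustrative only.

```python
import pandas as pd

# Values and labels kept in two parallel lists.
quantities = ['4 cups', '1 cup', '2 large', '1 can']
items = ['Flour', 'Milk', 'Eggs', 'Spam']

# index= supplies the labels; name= sets the Series name.
dinner = pd.Series(quantities, index=items, name='Dinner')

# Label-based lookup works the same way as with the dict-built Series.
eggs = dinner['Eggs']
```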
Exercise 4
Read data from a .csv file into a DataFrame.
In [10]:
wine_reviews = pd.read_csv('inputs/wine-reviews/winemag-data_first150k.csv', index_col=0)
wine_reviews.head()
Out[10]:
In [11]:
wine_reviews.tail()
Out[11]:
In [12]:
wine_reviews.shape
Out[12]:
In [13]:
wine_reviews.info()
In [14]:
dir(wine_reviews)
Out[14]:
In [15]:
print(wine_reviews)
In [16]:
wine_reviews.items
Out[16]:
In [17]:
answer_q4()
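To see what `index_col=0` does in the `read_csv` call above, it can help to try it on a tiny in-memory file. The sketch below uses made-up sample text (not the real wine reviews file) and reads it through `io.StringIO`, so no file on disk is assumed.

```python
import io
import pandas as pd

# Made-up sample data: the first column has no header and holds row numbers,
# which is what a CSV looks like when it was saved with its index.
csv_text = ",country,points\n0,US,96\n1,Spain,96\n2,US,95\n"

# Without index_col, the unnamed first column is read as ordinary data.
default_index = pd.read_csv(io.StringIO(csv_text))

# With index_col=0, the first column becomes the row index instead.
labelled_index = pd.read_csv(io.StringIO(csv_text), index_col=0)

print(default_index.columns.tolist())
print(labelled_index.columns.tolist())
```

Comparing the two column lists shows that the first call keeps an extra `Unnamed: 0` column, while the second does not.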
Exercise 5
Read data from a .xls sheet into a pandas DataFrame.
In [18]:
wic = pd.read_excel('inputs/publicassistance/xls_files_all/WICAgencies2014ytd.xls',
                    sheet_name='Pregnant Women Participating')
wic.info()
In [19]:
wic.head()
Out[19]:
In [20]:
answer_q5()
Exercise 6
Save a DataFrame as a .csv file.
In [22]:
q6_df = pd.DataFrame({'Cows': [12, 20], 'Goats': [22, 19]}, index=['Year 1', 'Year 2'])
q6_df.to_csv('cows_and_goats.csv')
In [23]:
answer_q6()
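One way to sanity-check a saved file is to read it back and compare it with the original. The sketch below does that round trip in a temporary directory (an assumption made here so the example leaves no file behind); `restored` and `round_trip_ok` are illustrative names.

```python
import os
import tempfile
import pandas as pd

q6_df = pd.DataFrame({'Cows': [12, 20], 'Goats': [22, 19]}, index=['Year 1', 'Year 2'])

# Write to a throwaway directory, then read the file back.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'cows_and_goats.csv')
    q6_df.to_csv(path)                          # the index is written as the first column
    restored = pd.read_csv(path, index_col=0)   # so read it back as the index

# True when values, labels, and dtypes all survived the round trip.
round_trip_ok = restored.equals(q6_df)
```

Note that `to_csv` writes the index by default, which is why `index_col=0` is needed when reading the file back.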
Exercise 7
Read SQL data into a DataFrame:
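As a general illustration of the pattern (not the answer to this exercise), the sketch below builds a small in-memory SQLite database and loads a query result with `pd.read_sql_query`. The table name `livestock` and its columns are hypothetical and exist only for this example.

```python
import sqlite3
import pandas as pd

# Hypothetical in-memory database, created just for this sketch.
conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE livestock (year TEXT, cows INTEGER, goats INTEGER)')
conn.executemany(
    'INSERT INTO livestock VALUES (?, ?, ?)',
    [('Year 1', 12, 22), ('Year 2', 20, 19)],
)
conn.commit()

# Run a query and load the result into a DataFrame,
# using the 'year' column as the row index.
livestock = pd.read_sql_query('SELECT * FROM livestock', conn, index_col='year')
conn.close()
```

For a database stored on disk, the same call should work after passing the file path to `sqlite3.connect` instead of `':memory:'`.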
In [ ]: