Pandas is a Python data analysis library. It lets you load, explore, and manipulate tabular data with only a few lines of code.

In this example I will show you how to read data from CSV and Excel files in Pandas. Each reader returns a Pandas DataFrame, which you can assign to a variable and keep working with. The sample data used in the exercise below was generated by https://mockaroo.com/.


In [14]:
import pandas as pd

In [15]:
csv_data_df = pd.read_csv('data/MOCK_DATA.csv')

Preview the first 5 rows of the data with .head() to confirm that it loaded.


In [16]:
csv_data_df.head()


Out[16]:
   id  first_name  last_name                    email  gender       ip_address
0   1        Ross     Ricart    rricart0@berkeley.edu    Male  217.151.154.186
1   2        Jenn      Pizer       jpizer1@usnews.com  Female   104.123.13.234
2   3    Delainey     Sulley        dsulley2@xing.com    Male      6.101.0.150
3   4      Nessie      Feirn      nfeirn3@samsung.com  Female    97.93.173.170
4   5       Noami    Flanner  nflanner4@woothemes.com  Female  174.228.138.242
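
If you want to try read_csv without the sample file, here is a minimal sketch that reads the same kind of data from an in-memory string. The first three rows are copied from the preview above; the variable names (csv_text, csv_sample_df, names_df) and the choice of columns are my own assumptions for illustration, not part of the original exercise.

```python
import io

import pandas as pd

# Three rows copied from the preview above; the real file is data/MOCK_DATA.csv.
csv_text = """id,first_name,last_name,email,gender,ip_address
1,Ross,Ricart,rricart0@berkeley.edu,Male,217.151.154.186
2,Jenn,Pizer,jpizer1@usnews.com,Female,104.123.13.234
3,Delainey,Sulley,dsulley2@xing.com,Male,6.101.0.150
"""

# read_csv accepts a path or a file-like object, so StringIO stands in for the file.
csv_sample_df = pd.read_csv(io.StringIO(csv_text))
print(csv_sample_df.shape)   # (number of rows, number of columns)
print(csv_sample_df.dtypes)  # the type pandas inferred for each column

# usecols loads only the named columns; index_col uses one of them as the row index.
names_df = pd.read_csv(
    io.StringIO(csv_text),
    usecols=["id", "first_name", "last_name"],
    index_col="id",
)
print(names_df)
```

Loading only the columns you need with usecols can save memory on large files, and index_col lets you look rows up by id, for example names_df.loc[2].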

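To import data from Excel, Pandas needs a separate Excel reader package, so pip install one if you haven't already. This example was written with xlrd; note that xlrd 2.0 and later reads only the older .xls format, so for .xlsx files such as the one below you should install openpyxl instead.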


In [17]:
# Pandas loads the Excel reader package itself, so it does not need to be imported here.
excel_data_df = pd.read_excel('data/MOCK_DATA.xlsx')

In [18]:
excel_data_df.head()


Out[18]:
   id  first_name  last_name                         email  gender     ip_address
0   1     Chloris    Antliff      cantliff0@shareasale.com  Female   131.17.2.171
1   2       Brion     Gierok        bgierok1@posterous.com    Male   245.41.126.3
2   3       Fleur     Skells  fskells2@creativecommons.org  Female    75.0.34.132
3   4        Dora    Privost        dprivost3@newsvine.com  Female    51.202.4.39
4   5   Annabella     Hucker          ahucker4@typepad.com  Female  124.80.181.41
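
To check that Excel reading works on your machine without the sample file, here is a rough sketch that writes a tiny DataFrame to a temporary .xlsx file and reads it back. It assumes the openpyxl package is the Excel engine you have installed; the helper name excel_round_trip, the sheet name "people", and the two sample rows (copied from the preview above) are my own choices for illustration.

```python
import importlib.util
import os
import tempfile

import pandas as pd

# Two rows copied from the preview above, standing in for MOCK_DATA.xlsx.
excel_sample_df = pd.DataFrame({
    "id": [1, 2],
    "first_name": ["Chloris", "Brion"],
    "gender": ["Female", "Male"],
})


def excel_round_trip(df):
    """Write df to a temporary .xlsx file and read it back.

    Returns None if the openpyxl engine (assumed here) is not installed.
    """
    if importlib.util.find_spec("openpyxl") is None:
        return None
    with tempfile.TemporaryDirectory() as tmp_dir:
        path = os.path.join(tmp_dir, "sample.xlsx")
        df.to_excel(path, index=False, sheet_name="people")
        # sheet_name picks the worksheet to load; engine names the reader package.
        return pd.read_excel(path, sheet_name="people", engine="openpyxl")


round_tripped = excel_round_trip(excel_sample_df)
if round_tripped is None:
    print("openpyxl is not installed; run `pip install openpyxl` first")
else:
    print(round_tripped)
```

If your workbook has several sheets, passing sheet_name lets you choose which one to load; by default read_excel reads the first sheet.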

Image Courtesy of jballeis (Own work) CC BY-SA 3.0, via Wikimedia Commons