Lab 1. An Introduction to Pandas and Python



In [2]:

    
# The %... is an IPython "magic" command, and is not part of the Python language.
# In this case we're just telling the plotting library to draw things on
# the notebook, instead of on a separate window.
%matplotlib inline 
#this line above prepares IPython notebook for working with matplotlib

# See all the "as ..." constructs? They're just aliasing the package names.
# That way we can call methods like plt.plot() instead of matplotlib.pyplot.plot().

import numpy as np # imports a fast numerical programming library
import scipy as sp #imports stats functions, amongst other things
import matplotlib as mpl # this actually imports matplotlib
import matplotlib.cm as cm #allows us easy access to colormaps
import matplotlib.pyplot as plt #sets up plotting under plt
import pandas as pd #lets us handle data as dataframes
#sets up pandas table display
pd.set_option('display.width', 500)
pd.set_option('display.max_columns', 100)
pd.set_option('display.notebook_repr_html', True)
import seaborn as sns #sets up styles and gives us more plotting options
print "done!"









    



done!

Experimenting with the markdown feature

$$\alpha + \frac{\beta}{\gamma} = \delta$$

hello
list items
sadfadsf

hello
dfsdfsd

hello, i am bold or I am italic and I am a bold black tick!

and miles to go before i sleep.... this is quoted

def func():
    print "hello"

Python depends on packages for most of its functionality; these can be either built-in (such as sys), or third-party (like all the packages below). Either way you need to import the packages you need before using them.
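As a small illustration of the import mechanism, the sketch below imports one built-in module under its own name and another under an alias. The choice of math and the alias m are just for illustration and are not used elsewhere in this lab.

```python
import sys        # built-in module, imported under its own name
import math as m  # built-in module, imported under a short alias

# Names inside a module are reached through the module name (or its alias).
major_version = sys.version_info[0]  # 2 or 3, depending on the interpreter
root_two = m.sqrt(2.0)

print(root_two)
```

Using a name from a module that has not been imported raises a NameError, which is why the import cell comes first in a notebook.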

The Notebook

Look up http://www.google.com. Let's eat a burrito. $\alpha = \frac{\beta}{\gamma}$

Longer:

$$\alpha = \frac{\beta}{\gamma}$$

an item
another item
i like items

Pandas

Get Cheatsheet:

from https://drive.google.com/folderview?id=0ByIrJAE4KMTtaGhRcXkxNHhmY2M&usp=sharing

We read in some data from a CSV file. CSV files can be output by any spreadsheet software and are plain text, so they make a great way to share data. This dataset is from Goodreads: I scraped the highest regarded (according to Goodreads' proprietary algorithm) books on that site. You'll see how to do such a scraping in the next lab.
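To see what pd.read_csv does with a file that has no header row, here is a minimal sketch that parses two lines of CSV text held in memory instead of a file on disk. The sample rows and the three column names are made up for illustration; only pd.read_csv and its names argument are taken from this lab.

```python
import io
import pandas as pd

# Hypothetical stand-in for a header-less CSV file (invented values).
csv_text = u"4.4,100,Book A\n3.9,25,Book B\n"

# The text has no header row, so the column names are supplied via `names`.
toy = pd.read_csv(io.StringIO(csv_text),
                  names=["rating", "review_count", "name"])

print(toy.shape)
print(toy.columns.tolist())
```

Without names, pandas would treat the first line of the text as the header and only one row of data would remain.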



In [4]:

    
df=pd.read_csv("all.csv", 
               names=["rating", 'review_count', 'isbn', 'booktype','author_url', 'year', 'genre_urls', 'dir','rating_count', 'name'],
)
df.head() #this just gives the first five rows of data
df.rating = df.rating/max(df.rating)*100
df.head()









    Out[4]:






  
    
      
   rating  review_count  isbn        booktype         author_url                                         year  genre_urls                                         dir                                                rating_count  name
0  88.0    136455        0439023483  good_reads:book  https://www.goodreads.com/author/show/153394.S...  2008  /genres/young-adult|/genres/science-fiction|/g...  dir01/2767052-the-hunger-games.html                2958974       The Hunger Games (The Hunger Games, #1)
1  88.2    16648         0439358078  good_reads:book  https://www.goodreads.com/author/show/1077326....  2003  /genres/fantasy|/genres/young-adult|/genres/fi...  dir01/2.Harry_Potter_and_the_Order_of_the_Phoe...  1284478       Harry Potter and the Order of the Phoenix (Har...
2  71.2    85746         0316015849  good_reads:book  https://www.goodreads.com/author/show/941441.S...  2005  /genres/young-adult|/genres/fantasy|/genres/ro...  dir01/41865.Twilight.html                          2579564       Twilight (Twilight, #1)
3  84.6    47906         0061120081  good_reads:book  https://www.goodreads.com/author/show/1825.Har...  1960  /genres/classics|/genres/fiction|/genres/histo...  dir01/2657.To_Kill_a_Mockingbird.html              2078123       To Kill a Mockingbird
4  84.6    34772         0679783261  good_reads:book  https://www.goodreads.com/author/show/1265.Jan...  1813  /genres/classics|/genres/fiction|/genres/roman...  dir01/1885.Pride_and_Prejudice.html                1388992       Pride and Prejudice

Notice we have a table! A spreadsheet! And it indexed the rows. Pandas (borrowing from R) calls it a DataFrame. Let's see the types of the columns...

df, in Python parlance, is an instance of the pd.DataFrame class, created by calling the pd.read_csv function, which calls the DataFrame constructor inside of it. If you don't understand this sentence, don't worry, it will become clearer later. What you need to take away is that df is a dataframe object, and it has methods, or functions belonging to it, which allow it to do things. For example, df.head() is a method that shows the first 5 rows of the dataframe.
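The relationship between the class, the object, and its methods can be checked directly. The sketch below builds a tiny dataframe by hand (the values are invented for illustration) rather than reading all.csv.

```python
import pandas as pd

# A hand-built stand-in for df; the values are illustrative only.
toy = pd.DataFrame({"rating": [4.4, 4.1, 3.6, 4.2, 4.3, 3.9],
                    "year": [2008, 2003, 2005, 1960, 1813, 1925]})

# toy is an instance of the pd.DataFrame class ...
is_frame = isinstance(toy, pd.DataFrame)

# ... and head() is a method of that object; by default it returns 5 rows.
first_rows = toy.head()
first_two = toy.head(2)  # an argument changes how many rows come back

print(len(first_rows))
print(len(first_two))
```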

The basics



In [9]:

    
df.dtypes  # the data type of each column









    Out[9]:





rating          float64
review_count     object
isbn             object
booktype         object
author_url       object
year            float64
genre_urls       object
dir              object
rating_count     object
name             object
dtype: object

The shape of the object is:



In [4]:

    
df.shape









    Out[4]:





(6000, 10)

6000 rows times 10 columns. A spreadsheet is a table is a matrix. How can we access the members of this tuple (tuples are written with round brackets, like so: ())?



In [5]:

    
df.shape[0], df.shape[1]









    Out[5]:





(6000, 10)
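Besides indexing, a tuple can also be unpacked into separate names in one statement. The sketch below uses a plain tuple with the same values as df.shape, so it runs without the dataset; the variable names are chosen for illustration.

```python
shape = (6000, 10)  # same values as the df.shape output above

# Indexing picks out one member at a time (positions start at 0).
n_rows = shape[0]
n_cols = shape[1]

# Unpacking assigns every member in a single statement.
rows, cols = shape

print(rows * cols)  # total number of cells in the table
```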

These are the column names.



In [6]:

    
df.columns









    Out[6]:





Index([u'rating', u'review_count', u'isbn', u'booktype', u'author_url', u'year', u'genre_urls', u'dir', u'rating_count', u'name'], dtype='object')

As the diagram above shows, pandas considers a table (dataframe) as a pasting of many "series" together, horizontally.



In [7]:

    
type(df.rating), type(df)









    Out[7]:





(pandas.core.series.Series, pandas.core.frame.DataFrame)
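One way to see the "series pasted together" picture is to build a dataframe from individual series and then pull a column back out. The data here is invented, and pd.concat with axis=1 is used as one way of doing the pasting; it is not a claim about how pandas stores the table internally.

```python
import pandas as pd

# Two series that share the same row index (0, 1, 2).
rating = pd.Series([4.4, 4.1, 3.6], name="rating")
year = pd.Series([2008, 2003, 2005], name="year")

# Paste them side by side: each series becomes one column.
toy = pd.concat([rating, year], axis=1)

# Selecting a column gives back a Series; both spellings are equivalent.
col_attr = toy.rating
col_item = toy["rating"]

print(type(toy).__name__)
print(type(col_attr).__name__)
```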

Querying

A spreadsheet is useless if you can't dice it, sort it, and otherwise query it. Here we look for all books with a rating less than 3.



In [8]:

    
df.rating < 3









    Out[8]:





0       False
1       False
2       False
3       False
4       False
5       False
6       False
7       False
8       False
9       False
10      False
11      False
12      False
13      False
14      False
15      False
16      False
17      False
18      False
19      False
20      False
21      False
22      False
23      False
24      False
25      False
26      False
27      False
28      False
29      False
        ...  
5970    False
5971    False
5972    False
5973    False
5974    False
5975    False
5976    False
5977    False
5978    False
5979     True
5980    False
5981    False
5982    False
5983    False
5984    False
5985    False
5986    False
5987    False
5988    False
5989    False
5990    False
5991    False
5992    False
5993    False
5994    False
5995    False
5996    False
5997    False
5998    False
5999    False
Name: rating, dtype: bool

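This gives us Trues and Falses. Such a series is called a mask. If we count the number of Trues, and divide by the total, we'll get the fraction of ratings $\lt$ 3. A mask can also be used to select values. The cell below does that, averaging only the ratings below 60 (this threshold refers to ratings rescaled to a 0-100 range, as in In [4]):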



In [30]:

    
np.mean(df.rating[df.rating < 60])









    Out[30]:





53.200000000000003

Why does counting Trues work? Because Python treats True as 1 and False as 0 in arithmetic:



In [10]:

    
print 1*True, 1*False

1 0
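The same arithmetic carries over to a whole mask: summing it counts the Trues. A minimal sketch on invented ratings:

```python
import numpy as np
import pandas as pd

ratings = pd.Series([4.4, 2.5, 3.6, 2.9, 4.0])  # illustrative values

mask = ratings < 3         # boolean Series: False, True, False, True, False
n_low = int(np.sum(mask))  # each True counts as 1, each False as 0

print(n_low)  # 2
```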

So we ought to be able to compute the fraction like this:



In [11]:

    
np.sum(df.rating < 3)/df.shape[0]









    Out[11]:





0

But we get a 0? Why? In Python 2.x, dividing one integer by another performs integer division, which throws away the fractional part. One fix is to convert df.shape[0] to a float:



In [12]:

    
np.sum(df.rating < 3)/float(df.shape[0])









    Out[12]:





0.00066666666666666664
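The difference between the two kinds of division can be reproduced with plain integers. In the sketch below, // requests integer (floor) division explicitly, so the code behaves the same in Python 2 and Python 3. The count of 4 is inferred from the output above (0.000666... times 6000 rows), not read from the data.

```python
n_low = 4       # inferred from the output above: 0.000666... * 6000
n_total = 6000  # number of rows, df.shape[0]

floor_result = n_low // n_total       # integer division drops the fraction
true_result = n_low / float(n_total)  # a float operand gives true division

print(floor_result)
print(true_result)
```

In Python 3, and in Python 2 after from __future__ import division, a plain / between two integers already performs true division.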

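Notice that you could just find the average, since the Trues map to 1s. (The cell below uses a threshold of 50 on ratings rescaled to a 0-100 range, so its result differs from the one above.)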



In [32]:

    
np.mean(df.rating < 50.0)









    Out[32]:





0.00016666666666666666

Or directly, in Pandas, which works since df.rating < 3 is a pandas Series.



In [16]:

    
(df.rating < 3).mean()









    Out[16]:





0.00066666666666666664

Filtering

Here are two ways to get a filtered dataframe. The first passes the condition to the query method as a string:



In [22]:

    
df.query("rating > 4.5")









    Out[22]:






  
    
      
      rating
      review_count
      isbn
      booktype
      author_url
      year
      genre_urls
      dir
      rating_count
      name
    
  
  
    
      17
      4.58
      1314
      0345538374
      good_reads:book
      https://www.goodreads.com/author/show/656983.J...
      1973
      /genres/fantasy|/genres/classics|/genres/scien...
      dir01/30.J_R_R_Tolkien_4_Book_Boxed_Set.html
      68495
      J.R.R. Tolkien 4-Book Boxed Set
    
    
      162
      4.55
      15777
      075640407X
      good_reads:book
      https://www.goodreads.com/author/show/108424.P...
      2007
      /genres/fantasy|/genres/fiction
      dir02/186074.The_Name_of_the_Wind.html
      210018
      The Name of the Wind (The Kingkiller Chronicle...
    
    
      222
      4.53
      15256
      055357342X
      good_reads:book
      https://www.goodreads.com/author/show/346732.G...
      2000
      /genres/fantasy|/genres/fiction|/genres/fantas...
      dir03/62291.A_Storm_of_Swords.html
      327992
      A Storm of Swords (A Song of Ice and Fire, #3)
    
    
      242
      4.53
      5404
      0545265355
      good_reads:book
      https://www.goodreads.com/author/show/153394.S...
      2010
      /genres/young-adult|/genres/fiction|/genres/fa...
      dir03/7938275-the-hunger-games-trilogy-boxset....
      102330
      The Hunger Games Trilogy Boxset (The Hunger Ga...
    
    
      249
      4.80
      644
      0740748475
      good_reads:book
      https://www.goodreads.com/author/show/13778.Bi...
      2005
      /genres/sequential-art|/genres/comics|/genres/...
      dir03/24812.The_Complete_Calvin_and_Hobbes.html
      22674
      The Complete Calvin and Hobbes
    
    
      284
      4.58
      15195
      1406321346
      good_reads:book
      https://www.goodreads.com/author/show/150038.C...
      2013
      /genres/fantasy|/genres/young-adult|/genres/fa...
      dir03/18335634-clockwork-princess.html
      130161
      Clockwork Princess (The Infernal Devices, #3)
    
    
      304
      4.54
      572
      0140259449
      good_reads:book
      https://www.goodreads.com/author/show/1265.Jan...
      1933
      /genres/classics|/genres/fiction|/genres/roman...
      dir04/14905.The_Complete_Novels.html
      17539
      The Complete Novels
    
    
      386
      4.55
      8820
      0756404738
      good_reads:book
      https://www.goodreads.com/author/show/108424.P...
      2011
      /genres/fantasy|/genres/fantasy|/genres/epic-f...
      dir04/1215032.The_Wise_Man_s_Fear.html
      142499
      The Wise Man's Fear (The Kingkiller Chronicle,...
    
    
      400
      4.53
      9292
      1423140605
      good_reads:book
      https://www.goodreads.com/author/show/15872.Ri...
      2012
      /genres/fantasy|/genres/young-adult|/genres/fa...
      dir05/12127750-the-mark-of-athena.html
      128412
      The Mark of Athena (The Heroes of Olympus, #3)
    
    
      475
      4.57
      824
      1416997857
      good_reads:book
      https://www.goodreads.com/author/show/150038.C...
      2009
      /genres/fantasy|/genres/young-adult|/genres/fa...
      dir05/6485421-the-mortal-instruments-boxed-set...
      39720
      The Mortal Instruments Boxed Set (The Mortal I...
    
    
      483
      4.59
      2622
      0312362153
      good_reads:book
      https://www.goodreads.com/author/show/4430.She...
      2008
      /genres/romance|/genres/paranormal-romance|/ge...
      dir05/2299110.Acheron.html
      35028
      Acheron (Dark-Hunter, #8)
    
    
      554
      4.54
      4809
      0385341679
      good_reads:book
      https://www.goodreads.com/author/show/48206.Ka...
      2011
      /genres/fantasy|/genres/urban-fantasy|/genres/...
      dir06/7304203-shadowfever.html
      52812
      Shadowfever (Fever, #5)
    
    
      577
      4.60
      5732
      0765326353
      good_reads:book
      https://www.goodreads.com/author/show/38550.Br...
      2010
      /genres/science-fiction-fantasy|/genres/fantas...
      dir06/7235533-the-way-of-kings.html
      76551
      The Way of Kings (The Stormlight Archive, #1)
    
    
      620
      4.54
      7767
      1423146727
      good_reads:book
      https://www.goodreads.com/author/show/15872.Ri...
      2013
      /genres/fantasy|/genres/young-adult|/genres/fa...
      dir07/12127810-the-house-of-hades.html
      72082
      The House of Hades (The Heroes of Olympus, #4)
    
    
      840
      4.57
      431
      1423113497
      good_reads:book
      https://www.goodreads.com/author/show/15872.Ri...
      2008
      /genres/fantasy|/genres/young-adult|/genres/fa...
      dir09/3165162-percy-jackson-and-the-olympians-...
      22937
      Percy Jackson and the Olympians Boxed Set (Per...
    
    
      883
      4.58
      558
      0140286802
      good_reads:book
      https://www.goodreads.com/author/show/500.Jorg...
      1998
      /genres/short-stories|/genres/literature|/genr...
      dir09/17961.Collected_Fictions.html
      12596
      Collected Fictions
    
    
      911
      4.85
      26
      1491732954
      good_reads:book
      https://www.goodreads.com/author/show/8189303....
      2014
      /genres/fiction
      dir10/22242097-honor-and-polygamy.html
      97
      Honor and Polygamy
    
    
      935
      4.64
      148
      1595142711
      good_reads:book
      https://www.goodreads.com/author/show/137902.R...
      2009
      /genres/paranormal|/genres/vampires|/genres/yo...
      dir10/6339989-vampire-academy-collection.html
      21743
      Vampire Academy Collection (Vampire Academy, #...
    
    
      938
      4.51
      11011
      1481426303
      good_reads:book
      https://www.goodreads.com/author/show/150038.C...
      2014
      /genres/fantasy|/genres/young-adult|/genres/fa...
      dir10/8755785-city-of-heavenly-fire.html
      69924
      City of Heavenly Fire (The Mortal Instruments,...
    
    
      953
      4.56
      27
      1477276068
      good_reads:book
      https://www.goodreads.com/author/show/6621980....
      2012
      NaN
      dir10/16243767-crossing-the-seas.html
      90
      Crossing the Seas
    
    
      958
      4.57
      38199
      0545010225
      good_reads:book
      https://www.goodreads.com/author/show/1077326....
      2007
      /genres/fantasy|/genres/young-adult|/genres/fa...
      dir10/136251.Harry_Potter_and_the_Deathly_Hall...
      1245866
      Harry Potter and the Deathly Hallows (Harry Po...
    
    
      1033
      4.56
      1304
      0007119550
      good_reads:book
      https://www.goodreads.com/author/show/346732.G...
      2000
      /genres/fiction|/genres/fantasy|/genres/epic-f...
      dir11/147915.A_Storm_of_Swords.html
      41161
      A Storm of Swords (A Song of Ice and Fire, #3-2)
    
    
      1109
      4.70
      23
      NaN
      good_reads:book
      https://www.goodreads.com/author/show/7488658....
      2013
      /genres/romance
      dir12/19181419-a-bird-without-wings.html
      56
      A Bird Without Wings
    
    
      1127
      4.52
      644
      0141183047
      good_reads:book
      https://www.goodreads.com/author/show/7816.Fer...
      1982
      /genres/poetry|/genres/fiction|/genres/philoso...
      dir12/45974.The_Book_of_Disquiet.html
      7463
      The Book of Disquiet
    
    
      1151
      4.64
      84
      1491877928
      good_reads:book
      https://www.goodreads.com/author/show/7271860....
      2013
      /genres/war|/genres/historical-fiction|/genres...
      dir12/18501652-the-guardian-of-secrets-and-her...
      167
      The Guardian of Secrets and Her Deathly Pact
    
    
      1186
      4.51
      4853
      1619630621
      good_reads:book
      https://www.goodreads.com/author/show/3433047....
      2013
      /genres/fantasy|/genres/young-adult|/genres/ro...
      dir12/17167166-crown-of-midnight.html
      34142
      Crown of Midnight (Throne of Glass, #2)
    
    
      1202
      4.59
      1260
      0310902711
      good_reads:book
      https://www.goodreads.com/author/show/5158478....
      1972
      /genres/religion|/genres/christian|/genres/non...
      dir13/280111.Holy_Bible.html
      25584
      Holy Bible
    
    
      1260
      4.60
      1943
      0842377506
      good_reads:book
      https://www.goodreads.com/author/show/6492.Fra...
      1993
      /genres/christian-fiction|/genres/historical-f...
      dir13/95617.A_Voice_in_the_Wind.html
      37923
      A Voice in the Wind (Mark of the Lion, #1)
    
    
      1268
      4.52
      215
      1557091528
      good_reads:book
      https://www.goodreads.com/author/show/63859.Ja...
      1787
      /genres/history|/genres/non-fiction|/genres/po...
      dir13/89959.The_Constitution_of_the_United_Sta...
      12894
      The Constitution of the United States of America
    
    
      1300
      4.61
      24
      1499227299
      good_reads:book
      https://www.goodreads.com/author/show/7414345....
      2014
      /genres/paranormal|/genres/vampires|/genres/pa...
      dir14/22090082-vampire-princess-rising.html
      128
      Vampire Princess Rising (The Winters Family Sa...
    
    
      ...
      ...
      ...
      ...
      ...
      ...
      ...
      ...
      ...
      ...
      ...
    
    
      5532
      4.86
      4
      1477504540
      good_reads:book
      https://www.goodreads.com/author/show/5989528....
      2013
      NaN
      dir56/17695243-call-of-the-lost-ages.html
      7
      Call Of The Lost Ages
    
    
      5549
      4.62
      13
      0882408704
      good_reads:book
      https://www.goodreads.com/author/show/947.Will...
      1899
      /genres/classics|/genres/fiction|/genres/poetr...
      dir56/17134346-the-complete-works-of-william-s...
      217
      The Complete Works of William Shakespeare
    
    
      5557
      4.61
      14
      NaN
      good_reads:book
      https://www.goodreads.com/author/show/32401.Al...
      2006
      /genres/fantasy|/genres/young-adult
      dir56/13488552-the-books-of-pellinor.html
      394
      The Books of Pellinor
    
    
      5563
      4.70
      30
      NaN
      good_reads:book
      https://www.goodreads.com/author/show/7153266....
      2014
      /genres/childrens
      dir56/20445451-children-s-book.html
      57
      Children's book
    
    
      5564
      5.00
      9
      NaN
      good_reads:book
      https://www.goodreads.com/author/show/7738947....
      2014
      /genres/romance|/genres/new-adult
      dir56/21902777-untainted.html
      14
      Untainted (Photographer Trilogy, #3)
    
    
      5584
      4.75
      3
      1481959824
      good_reads:book
      https://www.goodreads.com/author/show/5100743....
      2013
      NaN
      dir56/17606460-why-not-world.html
      8
      Why Not-World
    
    
      5588
      4.66
      190
      NaN
      good_reads:book
      https://www.goodreads.com/author/show/4942228....
      2011
      /genres/romance|/genres/m-m-romance|/genres/sc...
      dir56/11737700-fade.html
      996
      Fade (In the company of shadows, #4)
    
    
      5591
      4.58
      31
      1500118680
      good_reads:book
      https://www.goodreads.com/author/show/7738947....
      2014
      /genres/romance|/genres/new-adult
      dir56/22023804-logan-s-story.html
      45
      Logan's Story (Sand & Clay, #0.5)
    
    
      5601
      4.66
      312
      0842384898
      good_reads:book
      https://www.goodreads.com/author/show/5158478....
      1902
      /genres/christian|/genres/religion|/genres/non...
      dir57/930470.Holy_Bible.html
      2666
      Holy Bible
    
    
      5607
      4.66
      513
      0007444397
      good_reads:book
      https://www.goodreads.com/author/show/4659154....
      2011
      /genres/non-fiction|/genres/biography
      dir57/11792612-dare-to-dream.html
      5572
      Dare to Dream (100% Official)
    
    
      5619
      4.52
      462
      0991190920
      good_reads:book
      https://www.goodreads.com/author/show/7092218....
      2014
      /genres/fantasy|/genres/paranormal|/genres/fai...
      dir57/18188649-escaping-destiny.html
      3795
      Escaping Destiny (The Fae Chronicles, #3)
    
    
      5635
      4.54
      958
      0778315703
      good_reads:book
      https://www.goodreads.com/author/show/4480131....
      2013
      /genres/erotica|/genres/bdsm|/genres/adult-fic...
      dir57/17251444-the-mistress.html
      4869
      The Mistress (The Original Sinners, #4)
    
    
      5642
      4.70
      158
      1417642165
      good_reads:book
      https://www.goodreads.com/author/show/13778.Bi...
      1992
      /genres/sequential-art|/genres/comics|/genres/...
      dir57/70487.Calvin_and_Hobbes.html
      9224
      Calvin and Hobbes
    
    
      5657
      4.80
      8
      1469908530
      good_reads:book
      https://www.goodreads.com/author/show/4695431....
      2012
      /genres/fantasy
      dir57/15734769-myrtle-mae-and-the-mirror-in-th...
      10
      Myrtle Mae and the Mirror in the Attic (The Ma...
    
    
      5665
      4.53
      61
      NaN
      good_reads:book
      https://www.goodreads.com/author/show/7738947....
      2014
      /genres/romance|/genres/new-adult|/genres/myst...
      dir57/20975446-tainted-pictures.html
      103
      Tainted Pictures (Photographer Trilogy, #2)
    
    
      5683
      4.56
      204
      NaN
      good_reads:book
      https://www.goodreads.com/author/show/3097905....
      NaN
      /genres/fantasy|/genres/young-adult|/genres/ro...
      dir57/12474623-tiger-s-dream.html
      895
      Tiger's Dream (The Tiger Saga, #5)
    
    
      5692
      5.00
      0
      NaN
      good_reads:book
      https://www.goodreads.com/author/show/5989528....
      2012
      NaN
      dir57/14288412-abstraction-in-theory---laws-of...
      6
      Abstraction In Theory - Laws Of Physical Trans...
    
    
      5716
      4.67
      34
      0810117134
      good_reads:book
      https://www.goodreads.com/author/show/205563.M...
      1970
      /genres/classics|/genres/fiction|/genres/histo...
      dir58/1679497.The_Fortress.html
      1335
      The Fortress
    
    
      5717
      4.71
      4
      NaN
      good_reads:book
      https://www.goodreads.com/author/show/5838022....
      2012
      NaN
      dir58/13741511-american-amaranth.html
      14
      American Amaranth
    
    
      5718
      4.60
      656
      1613725132
      good_reads:book
      https://www.goodreads.com/author/show/1122775....
      2012
      /genres/romance|/genres/m-m-romance|/genres/ro...
      dir58/13246997-armed-dangerous.html
      5268
      Armed & Dangerous (Cut & Run, #5)
    
    
      5726
      4.55
      106
      1594170347
      good_reads:book
      https://www.goodreads.com/author/show/5158478....
      1952
      /genres/religion|/genres/reference|/genres/rel...
      dir58/147635.Holy_Bible.html
      1750
      Holy Bible
    
    
      5729
      4.83
      16
      NaN
      good_reads:book
      https://www.goodreads.com/author/show/7058502....
      2014
      NaN
      dir58/22312293-the-keeper.html
      29
      The Keeper (The Keeper, #5)
    
    
      5753
      4.61
      811
      1937551865
      good_reads:book
      https://www.goodreads.com/author/show/1122775....
      2013
      /genres/romance|/genres/m-m-romance|/genres/ro...
      dir58/16159276-touch-geaux.html
      4212
      Touch & Geaux (Cut & Run, #7)
    
    
      5764
      4.54
      228
      NaN
      good_reads:book
      https://www.goodreads.com/author/show/2112402....
      2013
      /genres/non-fiction|/genres/self-help|/genres/...
      dir58/18479831-staying-strong.html
      2343
      Staying Strong
    
    
      5778
      4.63
      0
      NaN
      good_reads:book
      https://www.goodreads.com/author/show/4808225....
      2010
      NaN
      dir58/11187937-un-spoken.html
      19
      (Un) Spoken
    
    
      5806
      4.57
      121
      0679777458
      good_reads:book
      https://www.goodreads.com/author/show/8361.Dor...
      1966
      /genres/historical-fiction|/genres/fiction|/ge...
      dir59/351211.The_Disorderly_Knights.html
      2177
      The Disorderly Knights (The Lymond Chronicles,...
    
    
      5873
      4.55
      103
      144247372X
      good_reads:book
      https://www.goodreads.com/author/show/2876763....
      2012
      /genres/fantasy|/genres/paranormal|/genres/ang...
      dir59/14367071-the-complete-hush-hush-saga.html
      2869
      The Complete Hush, Hush Saga
    
    
      5874
      4.78
      18
      2851944371
      good_reads:book
      https://www.goodreads.com/author/show/318835.O...
      1972
      /genres/poetry|/genres/fiction|/genres/nobel-p...
      dir59/2014000.Le_Monogramme.html
      565
      Le Monogramme
    
    
      5880
      4.61
      123
      NaN
      good_reads:book
      https://www.goodreads.com/author/show/4942228....
      2010
      /genres/romance|/genres/m-m-romance|/genres/sc...
      dir59/10506860-the-interludes.html
      1031
      The Interludes (In the company of shadows, #3)
    
    
      5957
      4.72
      104
      178048044X
      good_reads:book
      https://www.goodreads.com/author/show/20248.J_...
      2010
      /genres/romance|/genres/paranormal|/genres/vam...
      dir60/10780042-j-r-ward-collection.html
      1788
      J. R. Ward Collection
    
  

224 rows × 10 columns

Here we create a mask and use it to "index" into the dataframe to get the rows we want.
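The two approaches select the same rows. A minimal sketch on an invented three-row table, comparing query with mask indexing:

```python
import pandas as pd

toy = pd.DataFrame({"name": ["A", "B", "C"],  # illustrative data
                    "rating": [4.6, 3.9, 4.8]})

# Way 1: pass the condition to query() as a string.
by_query = toy.query("rating > 4.5")

# Way 2: build a boolean mask and index the dataframe with it.
mask = toy.rating > 4.5
by_mask = toy[mask]

print(by_query.equals(by_mask))  # True
```

Both results keep the original row labels (0 and 2 here), which is why the index column in a filtered table has gaps.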



In [37]:

    
df[df.year < 0]









    Out[37]:






  
    
      
      rating  review_count  isbn        booktype  author_url                                         year   genre_urls                                         dir                                                rating_count  name                                  author
47    3.68    5785          0143039954  book      https://www.goodreads.com/author/show/903.Homer    -800   /genres/classics|/genres/fiction|/genres/poetr...  dir01/1381.The_Odyssey.html                        560248        The Odyssey                           Homer
246   4.01    365           0147712556  book      https://www.goodreads.com/author/show/903.Homer    -800   /genres/classics|/genres/fantasy|/genres/mytho...  dir03/1375.The_Iliad_The_Odyssey.html              35123         The Iliad/The Odyssey                 Homer
455   3.85    1499          0140449140  book      https://www.goodreads.com/author/show/879.Plato    -380   /genres/philosophy|/genres/classics|/genres/no...  dir05/30289.The_Republic.html                      82022         The Republic                          Plato
596   3.77    1240          0679729526  book      https://www.goodreads.com/author/show/919.Virgil   -29    /genres/classics|/genres/poetry|/genres/fictio...  dir06/12914.The_Aeneid.html                        60308         The Aeneid                            Virgil
629   3.64    1231          1580495931  book      https://www.goodreads.com/author/show/1002.Sop...  -429   /genres/classics|/genres/plays|/genres/drama|/...  dir07/1554.Oedipus_Rex.html                        93192         Oedipus Rex                           Sophocles
674   3.92    3559          1590302257  book      https://www.goodreads.com/author/show/1771.Sun...  -512   /genres/non-fiction|/genres/politics|/genres/c...  dir07/10534.The_Art_of_War.html                    114619        The Art of War                        Sun_Tzu
746   4.06    1087          0140449183  book      https://www.goodreads.com/author/show/5158478....  -500   /genres/classics|/genres/spirituality|/genres/...  dir08/99944.The_Bhagavad_Gita.html                 31634         The Bhagavad Gita                     Anonymous
777   3.52    1038          1580493882  book      https://www.goodreads.com/author/show/1002.Sop...  -442   /genres/drama|/genres/fiction|/genres/classics...  dir08/7728.Antigone.html                           49084         Antigone                              Sophocles
1233  3.94    704           015602764X  book      https://www.goodreads.com/author/show/1002.Sop...  -400   /genres/classics|/genres/plays|/genres/drama|/...  dir13/1540.The_Oedipus_Cycle.html                  36008         The Oedipus Cycle                     Sophocles
1397  4.03    890           0192840509  book      https://www.goodreads.com/author/show/12452.Aesop  -560   /genres/classics|/genres/childrens|/genres/lit...  dir14/21348.Aesop_s_Fables.html                    71259         Aesop's Fables                        Aesop
1398  3.60    1644          0141026286  book      https://www.goodreads.com/author/show/5158478....  -1500  /genres/religion|/genres/literature|/genres/an...  dir14/19351.The_Epic_of_Gilgamesh.html             42026         The Epic of Gilgamesh                 Anonymous
1428  3.80    539           0486275485  book      https://www.goodreads.com/author/show/973.Euri...  -431   /genres/classics|/genres/plays|/genres/drama|/...  dir15/752900.Medea.html                            29858         Medea                                 Euripides
1815  3.96    493           0140443339  book      https://www.goodreads.com/author/show/990.Aesc...  -458   /genres/classics|/genres/plays|/genres/drama|/...  dir19/1519.The_Oresteia.html                       18729         The Oresteia                          Aeschylus
1882  4.02    377           0872205541  book      https://www.goodreads.com/author/show/879.Plato    -400   /genres/philosophy|/genres/classics|/genres/no...  dir19/22632.The_Trial_and_Death_of_Socrates.html   18712         The Trial and Death of Socrates       Plato
2078  3.84    399           0140440399  book      https://www.goodreads.com/author/show/957.Thuc...  -411   /genres/history|/genres/classics|/genres/non-f...  dir21/261243.The_History_of_the_Peloponnesian_...  17212         The History of the Peloponnesian War  Thucydides
2527  3.94    506           0140449086  book      https://www.goodreads.com/author/show/901.Hero...  -440   /genres/history|/genres/classics|/genres/non-f...  dir26/1362.The_Histories.html                      20570         The Histories                         Herodotus
3133  4.30    131           0872203492  book      https://www.goodreads.com/author/show/879.Plato    -400   /genres/philosophy|/genres/classics|/genres/no...  dir32/9462.Complete_Works.html                     7454          Complete Works                        Plato
3274  3.88    411           0140449493  book      https://www.goodreads.com/author/show/2192.Ari...  -350   /genres/philosophy|/genres/classics|/genres/no...  dir33/19068.The_Nicomachean_Ethics.html            16534         The Nicomachean Ethics                Aristotle
3757  3.82    364           0872206033  book      https://www.goodreads.com/author/show/1011.Ari...  -411   /genres/plays|/genres/classics|/genres/drama|/...  dir38/1591.Lysistrata.html                         18070         Lysistrata                            Aristophanes
4402  3.99    516           0140449272  book      https://www.goodreads.com/author/show/879.Plato    -370   /genres/non-fiction|/genres/classics|/genres/p...  dir45/81779.The_Symposium.html                     18457         The Symposium                         Plato
4475  4.11    281           0865163480  book      https://www.goodreads.com/author/show/879.Plato    -390   /genres/philosophy|/genres/classics|/genres/no...  dir45/73945.Apology.html                           11478         Apology                               Plato
5367  4.07    133           0872206335  book      https://www.goodreads.com/author/show/879.Plato    -360   /genres/philosophy|/genres/classics|/genres/no...  dir54/30292.Five_Dialogues.html                    9964          Five Dialogues                        Plato

To combine conditions, use the second form and put parentheses around each condition. The query below uses the elementwise boolean AND operator, &. Each condition creates a mask of Trues and Falses, and only the rows where the combined mask is True are kept.



In [19]:

    
df[(df.year < 0) & (df.rating > 4)]#there were none greater than 4.5!









    Out[19]:






  
    
      
	rating	review_count	isbn	booktype	author_url	year	genre_urls	dir	rating_count	name
246	4.01	365	0147712556	good_reads:book	https://www.goodreads.com/author/show/903.Homer	-800	/genres/classics|/genres/fantasy|/genres/mytho...	dir03/1375.The_Iliad_The_Odyssey.html	35123	The Iliad/The Odyssey
746	4.06	1087	0140449183	good_reads:book	https://www.goodreads.com/author/show/5158478....	-500	/genres/classics|/genres/spirituality|/genres/...	dir08/99944.The_Bhagavad_Gita.html	31634	The Bhagavad Gita
1397	4.03	890	0192840509	good_reads:book	https://www.goodreads.com/author/show/12452.Aesop	-560	/genres/classics|/genres/childrens|/genres/lit...	dir14/21348.Aesop_s_Fables.html	71259	Aesop's Fables
1882	4.02	377	0872205541	good_reads:book	https://www.goodreads.com/author/show/879.Plato	-400	/genres/philosophy|/genres/classics|/genres/no...	dir19/22632.The_Trial_and_Death_of_Socrates.html	18712	The Trial and Death of Socrates
3133	4.30	131	0872203492	good_reads:book	https://www.goodreads.com/author/show/879.Plato	-400	/genres/philosophy|/genres/classics|/genres/no...	dir32/9462.Complete_Works.html	7454	Complete Works
4475	4.11	281	0865163480	good_reads:book	https://www.goodreads.com/author/show/879.Plato	-390	/genres/philosophy|/genres/classics|/genres/no...	dir45/73945.Apology.html	11478	Apology
5367	4.07	133	0872206335	good_reads:book	https://www.goodreads.com/author/show/879.Plato	-360	/genres/philosophy|/genres/classics|/genres/no...	dir54/30292.Five_Dialogues.html	9964	Five Dialogues
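
To see what such a mask looks like, here is a minimal sketch on a small made-up frame (the toy DataFrame, its column values and the names old, good and mask are assumptions for illustration, not the Goodreads data). Each comparison yields a boolean series, and & combines the two elementwise. The parentheses matter because & binds more tightly than < and >.

```python
import pandas as pd

# Hypothetical toy data standing in for the Goodreads frame.
toy = pd.DataFrame({
    "title": ["A", "B", "C", "D"],
    "year": [-800, -400, 1960, -360],
    "rating": [4.01, 3.85, 4.60, 4.07],
})

old = toy.year < 0       # boolean series: True where the year is negative
good = toy.rating > 4    # boolean series: True where the rating exceeds 4
mask = old & good        # elementwise AND of the two masks

selected = toy[mask]     # keeps only the rows where the mask is True
print(mask.tolist())
print(selected["title"].tolist())
```

Indexing with toy[mask] is the same thing as writing both conditions inline inside the brackets; naming the masks just makes them easier to inspect.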

Cleaning

We first check the datatypes. Notice that review_count and rating_count are of type object (which means they are either strings or Pandas couldn't figure out what they are), while year is a float.



In [20]:

    
df.dtypes









    Out[20]:





rating          float64
review_count     object
isbn             object
booktype         object
author_url       object
year            float64
genre_urls       object
dir              object
rating_count     object
name             object
dtype: object
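
As a hedged illustration of how a numeric-looking column can end up as object, the sketch below builds a hypothetical series in which one entry is the text 'None'. The values are made up; the point is only that a single non-numeric entry is enough to stop Pandas from choosing a numeric type.

```python
import pandas as pd

# Hypothetical values: two integers and one missing entry spelled 'None'.
counts = pd.Series([136455, 16648, "None"])
print(counts.dtype)   # object, because the values are of mixed python types

# With only integers, pandas picks an integer dtype.
clean = pd.Series([136455, 16648, 85746])
print(clean.dtype)
```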

Suppose we try to fix this:



In [5]:

    
df['rating_count']=df.rating_count.astype(int)
df['review_count']=df.review_count.astype(int)
df['year']=df.year.astype(int)









    



---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-5-8bf38ae9d108> in <module>()
----> 1 df['rating_count']=df.rating_count.astype(int)
      2 df['review_count']=df.review_count.astype(int)
      3 df['year']=df.year.astype(int)

/Users/stevenydc/anaconda/lib/python2.7/site-packages/pandas/core/generic.pyc in astype(self, dtype, copy, raise_on_error, **kwargs)
   2409 
   2410         mgr = self._data.astype(
-> 2411             dtype=dtype, copy=copy, raise_on_error=raise_on_error, **kwargs)
   2412         return self._constructor(mgr).__finalize__(self)
   2413 

/Users/stevenydc/anaconda/lib/python2.7/site-packages/pandas/core/internals.pyc in astype(self, dtype, **kwargs)
   2502 
   2503     def astype(self, dtype, **kwargs):
-> 2504         return self.apply('astype', dtype=dtype, **kwargs)
   2505 
   2506     def convert(self, **kwargs):

/Users/stevenydc/anaconda/lib/python2.7/site-packages/pandas/core/internals.pyc in apply(self, f, axes, filter, do_integrity_check, **kwargs)
   2457                                                  copy=align_copy)
   2458 
-> 2459             applied = getattr(b, f)(**kwargs)
   2460 
   2461             if isinstance(applied, list):

/Users/stevenydc/anaconda/lib/python2.7/site-packages/pandas/core/internals.pyc in astype(self, dtype, copy, raise_on_error, values, **kwargs)
    371     def astype(self, dtype, copy=False, raise_on_error=True, values=None, **kwargs):
    372         return self._astype(dtype, copy=copy, raise_on_error=raise_on_error,
--> 373                             values=values, **kwargs)
    374 
    375     def _astype(self, dtype, copy=False, raise_on_error=True, values=None,

/Users/stevenydc/anaconda/lib/python2.7/site-packages/pandas/core/internals.pyc in _astype(self, dtype, copy, raise_on_error, values, klass, **kwargs)
    401             if values is None:
    402                 # _astype_nansafe works fine with 1-d only
--> 403                 values = com._astype_nansafe(self.values.ravel(), dtype, copy=True)
    404                 values = values.reshape(self.values.shape)
    405             newb = make_block(values,

/Users/stevenydc/anaconda/lib/python2.7/site-packages/pandas/core/common.pyc in _astype_nansafe(arr, dtype, copy)
   2729     elif arr.dtype == np.object_ and np.issubdtype(dtype.type, np.integer):
   2730         # work around NumPy brokenness, #1987
-> 2731         return lib.astype_intsafe(arr.ravel(), dtype).reshape(arr.shape)
   2732 
   2733     if copy:

pandas/lib.pyx in pandas.lib.astype_intsafe (pandas/lib.c:14844)()

pandas/src/util.pxd in util.set_value_at (pandas/lib.c:63086)()

ValueError: invalid literal for long() with base 10: 'None'

Oops, we got an error. Something is not right. It's trying to convert a 'None' (Python's marker for a missing value, stored here as text) into an int. This usually means data was missing. Was it?



In [10]:

    
df[df['year'].isnull()]









    Out[10]:






  
    
      
	rating	review_count	isbn	booktype	author_url	year	genre_urls	dir	rating_count	name

Aha, we had some incomplete data. Let's get rid of it:



In [9]:

    
df = df[df.year.notnull()]
df.shape









    Out[9]:





(5993, 10)

We removed those 7 rows. Let's try the type conversion again:



In [11]:

    
df['rating_count']=df.rating_count.astype(int)
df['review_count']=df.review_count.astype(int)
df['year']=df.year.astype(int)



In [12]:

    
df.dtypes









    Out[12]:





rating          float64
review_count      int64
isbn             object
booktype         object
author_url       object
year              int64
genre_urls       object
dir              object
rating_count      int64
name             object
dtype: object

Much cleaner now!
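
The cleaning steps above can be sketched end to end on a toy frame. The frame and its values are made up for illustration. The last two lines show pd.to_numeric with errors='coerce', which is not what this lab uses but is a common alternative when you would rather turn unparseable entries into NaN than drop rows first.

```python
import numpy as np
import pandas as pd

# Hypothetical toy frame: 'year' has a missing value, 'rating_count' holds text.
toy = pd.DataFrame({
    "year": [2008.0, np.nan, 1960.0],
    "rating_count": ["2958974", "None", "2078123"],
})

# Step 1: keep only the rows where year is present.
# The .copy() gives us an independent frame that is safe to modify.
toy = toy[toy.year.notnull()].copy()

# Step 2: every remaining value is convertible, so astype(int) succeeds.
toy["year"] = toy.year.astype(int)
toy["rating_count"] = toy.rating_count.astype(int)
print(toy.dtypes)

# Alternative (an assumption, not the lab's method): coerce bad entries to NaN.
coerced = pd.to_numeric(pd.Series(["7", "None"]), errors="coerce")
print(coerced.isnull().tolist())
```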



In [49]:

    
df.rating_count.unique()









    Out[49]:





array([2958974, 1284478, 2579564, ...,    2971,    3083,    3982])

Visualizing

Pandas has handy built-in visualization.



In [48]:

    
df.rating.hist();

We can do this in more detail, plotting the histogram against the mean, with a custom bin size or number of bins. Note how to label axes and create legends.



In [21]:

    
sns.set_context("poster")
meanrat=df.rating.mean()
#you can get means and medians in different ways
print meanrat, np.mean(df.rating), df.rating.median()
with sns.axes_style("ticks"):
    df.rating.hist(bins=30, alpha=0.9);
    plt.axvline(meanrat, 0.1, 1, color='r', label='Mean')
    plt.xlabel("average rating of book")
    plt.ylabel("Counts")
    plt.title("Ratings Histogram")
    plt.legend()
    #sns.despine()









    



80.8399466044 80.8399466044 81.0
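
The bins argument used above can be either a number of bins or an explicit sequence of bin edges, which is how one would fix the bin size. Here is a minimal sketch using np.histogram, which does the counting that a histogram plot displays. The ratings and the 0.5 bin width are made-up assumptions.

```python
import numpy as np

# Hypothetical ratings, not taken from the Goodreads data.
ratings = np.array([3.2, 3.8, 4.0, 4.1, 4.1, 4.4, 4.6])

# A number of bins: numpy splits [min, max] into 4 equal-width bins.
counts_n, edges_n = np.histogram(ratings, bins=4)

# Explicit edges: bins of width 0.5 running from 3.0 to 5.0.
edges = np.arange(3.0, 5.5, 0.5)
counts_w, _ = np.histogram(ratings, bins=edges)

print(counts_n, edges_n)
print(counts_w, edges)
```

Each bin includes its left edge and excludes its right edge, except the last bin, which includes both.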

One can see the sparseness of review counts. This will be important when we learn about recommendations: we'll have to regularize our models to deal with it.



In [24]:

    
df.review_count.hist(bins=100);

The structure may be easier to see if we rescale the x-axis to be logarithmic.



In [35]:

    
df.review_count.hist(bins=100)
plt.xscale("log");
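
One caveat: with equal-width bins, a log-scaled axis draws the low bins very wide and squeezes the high ones together. A common alternative, sketched below on made-up counts, is to make the bin edges themselves log-spaced with np.logspace. This assumes strictly positive values; books with zero reviews would need separate handling.

```python
import numpy as np

# Hypothetical review counts spanning several orders of magnitude.
review_counts = np.array([3, 8, 12, 40, 95, 150, 800, 2500, 9000, 60000])

# Six edges evenly spaced in log10 between 10**0 and 10**5: one bin per decade.
edges = np.logspace(0, 5, 6)
counts, _ = np.histogram(review_counts, bins=edges)

print(edges)
print(counts)
```

The same edges could be passed as bins to hist before setting the x-scale to log, so that every bar has the same width on screen.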

Here we make a scatterplot in matplotlib of rating against year. By setting the alpha transparency low we can see how the density of highly rated books on Goodreads has changed.



In [31]:

    
plt.scatter(df.year, df.rating, lw=2, alpha=.2)
plt.xlim([1900,2010])
plt.xlabel("Year")
plt.ylabel("Rating")









    Out[31]:





<matplotlib.text.Text at 0x10b221290>

Pythons and ducks

Notice that we passed Pandas series directly in the x-list and y-list slots of the scatter function in the plt module.

In working with python I always remember: a python is a duck.

What I mean is, python has a certain way of doing things. For example, let's call one of these ways listiness. Listiness works on lists, dictionaries, files, and a general notion of something called an iterator.
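
As a rough sketch of what listiness means in practice, the hypothetical helper first_two below uses only iter and next, the two built-ins behind the iterator idea, and so works on a list, a dictionary and a file-like object alike. The io.StringIO object stands in for a real file.

```python
import io

def first_two(listy):
    """Return the first two items of anything that supports iteration."""
    it = iter(listy)              # ask the object for an iterator
    return [next(it), next(it)]   # pull items off it one at a time

print(first_two([10, 20, 30]))                 # a list yields its elements
print(first_two({"a": 1, "b": 2}))             # a dictionary yields its keys
print(first_two(io.StringIO(u"one\ntwo\n")))   # a file-like object yields lines
```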

A Pandas series plays like a python list:



In [36]:

    
alist=[1,2,3,4,5]

We can construct another list by using the syntax below, also called a list comprehension.



In [29]:

    
asquaredlist=[i*i for i in alist]
asquaredlist









    Out[29]:





[1, 4, 9, 16, 25]
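
A list comprehension can also filter as it goes. The sketch below, with names chosen just for illustration, squares only the odd entries and shows the explicit loop that the comprehension abbreviates.

```python
alist = [1, 2, 3, 4, 5]

# Comprehension with a condition: square only the odd entries.
odd_squares = [i * i for i in alist if i % 2 == 1]

# The same result written as an explicit loop.
odd_squares_loop = []
for i in alist:
    if i % 2 == 1:
        odd_squares_loop.append(i * i)

print(odd_squares)
```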

And then we can again make a scatterplot:



In [30]:

    
plt.scatter(alist, asquaredlist);



In [31]:

    
print type(alist)









    



<type 'list'>

In other words, something is a duck if it quacks like a duck. A Pandas series quacks like a python list. They both support something called the iterator protocol, a notion of behaving in a "listy" way. And Python functions like plt.scatter will accept anything that behaves in a listy way. Indeed here's one more example:



In [34]:

    
plt.hist(df.rating_count.values, bins=100, alpha=0.5);



In [35]:

    
print type(df.rating_count), type(df.rating_count.values)









    



<class 'pandas.core.series.Series'> <type 'numpy.ndarray'>

Series and numpy arrays behave similarly as well.
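
The duck idea can be sketched with a function that never asks what type it was given. The helper total_of_squares is hypothetical; it only loops over its argument, so a list, a numpy array, a Pandas series and a generator all work.

```python
import numpy as np
import pandas as pd

def total_of_squares(listy):
    """Sum the squares of the items of any iterable."""
    return sum(x * x for x in listy)

values = [1, 2, 3]
print(total_of_squares(values))              # a python list
print(total_of_squares(np.array(values)))    # a numpy array
print(total_of_squares(pd.Series(values)))   # a Pandas series
print(total_of_squares(x for x in values))   # a generator
```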

Vectorization

Numpy arrays are a bit different from regular python lists, and are the bread and butter of data science. Pandas Series are built atop them.



In [37]:

    
alist + alist









    Out[37]:





[1, 2, 3, 4, 5, 1, 2, 3, 4, 5]



In [38]:

    
np.array(alist)









    Out[38]:





array([1, 2, 3, 4, 5])



In [39]:

    
np.array(alist)+np.array(alist)









    Out[39]:





array([ 2,  4,  6,  8, 10])



In [40]:

    
np.array(alist)**2









    Out[40]:





array([ 1,  4,  9, 16, 25])

In other words, operations on numpy arrays, and by extension, Pandas Series, are vectorized. You can add two numpy arrays elementwise by just using +, whereas for regular python lists + concatenates them, which isn't what you might expect. To add regular python lists elementwise, you will need to use a loop:



In [40]:

    
newlist=[]
for item in alist:
    newlist.append(item+item)
newlist









    Out[40]:





[2, 4, 6, 8, 10]
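
The loop above adds a list to itself. For two different lists, one way (a sketch with made-up lists xs and ys) is to pair the items up with zip; the numpy version needs no pairing at all.

```python
import numpy as np

xs = [1, 2, 3]
ys = [10, 20, 30]

# Plain python: pair the items up with zip and add each pair.
pairwise = [x + y for x, y in zip(xs, ys)]

# Numpy: + is already elementwise.
vectorized = np.array(xs) + np.array(ys)

print(pairwise)
print(vectorized)
print(xs + ys)   # for lists, + concatenates instead
```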

Vectorization is a powerful idiom, and we will use it a lot in this class. And, for almost all data intensive computing, we will use numpy arrays rather than python lists, as the python numerical stack is based on them.

You have seen this idea in spreadsheets, where you add an entire column to another one.

Two final examples



In [41]:

    
a=np.array([1,2,3,4,5])
print type(a)
b=np.array([1,2,3,4,5])

print a*b









    



<type 'numpy.ndarray'>
[ 1  4  9 16 25]



In [42]:

    
a+1









    Out[42]:





array([2, 3, 4, 5, 6])
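
The a+1 example works through what numpy calls broadcasting: the scalar is treated as if it were an array of the same shape as a. Below is a minimal sketch, with made-up arrays col and row, of the scalar case and of a column combined with a row.

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5])

# Adding a scalar gives the same result as adding an array of ones.
same = ((a + 1) == (a + np.ones_like(a))).all()
print(same)

col = np.array([[0], [10], [20]])   # shape (3, 1)
row = np.array([1, 2, 3])           # shape (3,)
grid = col + row                    # broadcasts to shape (3, 3)
print(grid)
```

Each entry of grid is the sum of one element of col and one element of row, so no explicit double loop is needed.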
