Title: What Factors Determine the Popularity of Computer Games?
Abstract: For our project, we were interested in whether certain computer-game metrics were related. Specifically, we wanted to see whether any correlations existed between a game's price, rating, total sales, and how much time the average user spends playing it.
Packages Imported: We imported several packages, including: (1) pandas, an open-source software library used for data manipulation and analysis within Python; (2) matplotlib, a plotting library for Python that allowed us to generate our graphs; and (3) Beautiful Soup, a Python library for parsing HTML documents and extracting data (i.e. web scraping). Beautiful Soup is what enabled us to gather data from each game's URL within each of the data sources listed below.
Data Sources: From metacritic.com, a website that aggregates user and professional critic reviews of computer games, we pulled (1) scores (i.e. user ratings) for 251 computer games we selected at random from the Metacritic database. We then gathered information on each game’s: (2) number of owners (i.e. copies sold), (3) price, and (4) total average playtime from steamspy.com, a stats service that automatically gathers and organizes data from Steam, a digital software platform used to distribute games and related media online.
In [43]:
from bs4 import BeautifulSoup
import pandas as pd
import matplotlib as mpl # graphics package
import matplotlib.pyplot as plt # pyplot module
from urllib.request import Request, urlopen
%matplotlib inline
In [44]:
df = pd.read_csv('Steam_DB_Info_Appsv2.csv', encoding='iso-8859-1', on_bad_lines='skip')
dfname = df['NAME']
# URL-encode each title (spaces -> %20) so it can be dropped into a Metacritic search URL
for n in range(len(dfname)):
    game_name = dfname[n]
    game_name = game_name.replace(' ', '%20')
df
Out[44]:
Creating the Data Set: In order to complete this project, we had to do a significant amount of web scraping. Using the data we scraped from metacritic.com and steamspy.com, we created a comprehensive dataframe consisting of each game and its score, number of owners, price, and total playtime. To do this, we wrote code that would scroll through our list of games, contained in a .csv file that we imported, and scrape data from each game's unique URL on both sites (e.g. 'http://www.metacritic.com/search/all/'+game_name+'/results'). One early issue was that some of the games in our .csv file had unrecognizable titles or a trademark symbol appended to them, so we had to remove those so as not to cause issues in Python.
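A minimal sketch of that cleanup step (assumed, not our exact code; it operates on the 'NAME' column from our .csv):

# Strip trademark/registered symbols from titles and drop rows whose
# titles failed to decode (a sketch, not the exact code we ran)
df['NAME'] = df['NAME'].str.replace('™', '', regex=False).str.replace('®', '', regex=False)
df = df.dropna(subset=['NAME'])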
Below is the code we used to complete the web scraping; however, we have included it in a text cell because of the tremendous amount of time it takes to scrape the data and build the dataframe. In short, our code loops through the list of games one metric at a time, and after each loop completes, that metric is appended as a new column of the dataframe, with each value placed in the row corresponding to the game being scraped. One obstacle we confronted was that some games did not have a Metacritic user rating, so those games are left with a "NaN" (i.e. not a number) in the score column. Additionally, the number of copies sold was returned as a range (e.g. ': 69,004,323 ± 176,095'), so we wrote code to remove everything from the "±" symbol onward. Similarly, the number of digits in the "copies sold" metric varied (some games sold in the hundreds of thousands, while others sold in the tens of millions), so we also stripped the commas and spaces so the number would parse correctly regardless of length.
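For illustration, here is that owners-string parsing applied to an example value (the same logic appears inside the loop below):

raw = ': 69,004,323 ± 176,095'                # example string scraped from SteamSpy
num = raw.replace(': ', '').replace(',', '')  # drop the leading ': ' and the commas
num, sep, tail = num.partition(' ±')          # discard the '± ...' margin of error
owners = int(num)                             # 69004323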
The "df" referred to in the cell below and from hereon out is the dataframe that was saved as a .csv on our desktop.
# Scrape each game's Metacritic score; games with no rating keep NaN in 'SCORE'
for n in range(len(dfname)):
    game_name = dfname[n].replace(' ', '%20')
    url_to_scrape = 'http://www.metacritic.com/search/all/' + game_name + '/results'
    try:
        req = Request(url_to_scrape, headers={'User-Agent': 'Mozilla/5.0'})
        webpage = urlopen(req).read()
    except Exception as e:
        print('something went wrong')
        print(e)
        print(url_to_scrape)
        continue
    html = BeautifulSoup(webpage, 'html.parser')
    for span in html.find_all('span'):
        # the first span whose class includes 'metascore_w' holds the score
        if span.get('class') is not None and 'metascore_w' in span.get('class'):
            score = int(span.text)
            print(score)
            df.loc[df.NAME == dfname[n], 'SCORE'] = score
            break
# Scrape the number of owners (copies sold) from each game's SteamSpy search result
for n in range(len(dfname)):
    game_name = dfname[n].replace(' ', '+')
    url_to_scrape = 'http://steamspy.com/search.php?s=' + game_name
    try:
        req = Request(url_to_scrape, headers={'User-Agent': 'Mozilla/5.0'})
        webpage = urlopen(req).read()
    except Exception as e:
        print('something went wrong')
        print(e)
        print(url_to_scrape)
        continue
    html = BeautifulSoup(webpage, 'html.parser')
    for strongowners in html.find_all('strong'):
        if strongowners.text == 'Owners':
            print(strongowners.next.next)
            ownersnum = strongowners.next.next
            # e.g. ': 69,004,323 ± 176,095' -> 69004323
            ownersnum = ownersnum.replace(': ', '').replace(',', '')
            ownersnum, sep, tail = ownersnum.partition(' ±')
            df.loc[df.NAME == dfname[n], '# of Owners'] = int(ownersnum)
            break
# Scrape each game's price from the SteamSpy search result
for n in range(len(dfname)):
    game_name = dfname[n].replace(' ', '+')
    url_to_scrape = 'http://steamspy.com/search.php?s=' + game_name
    try:
        req = Request(url_to_scrape, headers={'User-Agent': 'Mozilla/5.0'})
        webpage = urlopen(req).read()
    except Exception as e:
        print('something went wrong')
        print(e)
        print(url_to_scrape)
        continue
    html = BeautifulSoup(webpage, 'html.parser')
    for strongprice in html.find_all('strong'):
        if strongprice.text == 'Price:':
            print(strongprice.next.next)
            # e.g. '$19.99' -> 19.99
            price = float(strongprice.next.next.replace('$', ''))
            df.loc[df.NAME == dfname[n], 'Price USD'] = price
            break
# Scrape average total playtime from each game's SteamSpy app page.
# dfapp holds the Steam app IDs from our .csv (presumably dfapp = df['APPID'])
for n in range(len(dfapp)):
    url_to_scrape = 'http://steamspy.com/app/' + str(dfapp[n])
    try:
        req = Request(url_to_scrape, headers={'User-Agent': 'Mozilla/5.0'})
        webpage = urlopen(req).read()
    except Exception as e:
        print('something went wrong')
        print(e)
        print(url_to_scrape)
        continue
    html = BeautifulSoup(webpage, 'html.parser')
    play = 0
    for strongplay in html.find_all('strong'):
        if 'Playtime' in strongplay.text:
            print(strongplay.next.next)
            play = strongplay.next.next
            # e.g. '05:36 (average)' -> 5.36
            play = play.replace(':', '.')
            play, sep, tail = play.partition(' (average)')
            play = float(play)
            break
    df.loc[df.NAME == dfname[n], 'Total Playtime (hrs)'] = play
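The finished dataframe was then written out to disk, roughly as follows (a sketch; the filename is assumed from the read_csv call below):

# save the assembled dataframe as a .csv for later sessions
df.to_csv('Mandelup-Soffer-Data-1.2.csv')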
In [45]:
df = pd.read_csv('/Users/Jeff/Desktop/ Spring 2016 /Data Bootcamp/Mandelup-Soffer-Data-1.2.csv', encoding='iso-8859-1', on_bad_lines='skip')
df = df.rename(columns = {'Unnamed: 0':'Unnamed', 'APPID': 'AppID', 'NAME': 'Name', 'SCORE':'Score',
'Total Playtime (hrs)': 'Total_Playtime_hrs', '# of Owners': 'Number_of_Owners', 'Price USD': 'Price_USD'})
df = df.set_index(['Name'])
df_2 = df.drop(['Unnamed', 'AppID'], axis = 1)
# df_2 = df.drop(['Unnamed: 0', 'APPID', 'SCORE', 'Total Playtime (hrs)', '# of Owners', 'Price USD'], axis = 1)
# for df_2 where we'll be graphing, drop all but two of the variables to generate the desired graph
df
Out[45]:
We then wrote code to group our games into "Price Buckets," namely games priced: (a) x < $5; (b) $5 ≤ x < $15; (c) $15 ≤ x < $30; and (d) x ≥ $30. This gave us a useful means of organizing 251 individual games into 4 groups based on a common variable. We chose these cutoffs based on industry knowledge that they are typical points for market segmentation.
In [46]:
# Assign each game to a price bucket (half-open intervals, so boundary
# prices such as exactly $15 fall into exactly one bucket)
less_than_5 = df_2['Price_USD'] < 5
between_5_and_15 = (df_2['Price_USD'] >= 5) & (df_2['Price_USD'] < 15)
between_15_and_30 = (df_2['Price_USD'] >= 15) & (df_2['Price_USD'] < 30)
greater_than_30 = df_2['Price_USD'] >= 30
df_2.loc[less_than_5, 'Price_Bucket'] = "less_than_5"
df_2.loc[between_5_and_15, 'Price_Bucket'] = "between_5_and_15"
df_2.loc[between_15_and_30, 'Price_Bucket'] = "between_15_and_30"
df_2.loc[greater_than_30, 'Price_Bucket'] = "greater_than_30"
df_2
Out[46]:
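An equivalent, more compact way to build the same buckets would be pandas' cut function (a sketch, not the code we ran):

import numpy as np

labels = ['less_than_5', 'between_5_and_15', 'between_15_and_30', 'greater_than_30']
# half-open bins [0,5), [5,15), [15,30), [30,inf) matching the masks above
df_2['Price_Bucket'] = pd.cut(df_2['Price_USD'], bins=[0, 5, 15, 30, np.inf], right=False, labels=labels)

Because cut returns an ordered categorical, the buckets would already sort in price order, which is what we impose manually with CategoricalIndex in the next cell.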
In the cell below, we (1) set Price_Bucket as the index of our dataframe; (2) reordered the index so that the Price_Buckets are listed in ascending price order; and (3) created a new dataframe (means_df) consisting of the average Score, Total_Playtime_hrs, Number_of_Owners, and Price_USD for each Price_Bucket.
In [47]:
df_2_reset = df_2.reset_index() # keep an unindexed copy for later reference
df_2 = df_2.set_index(["Price_Bucket"])
df_2.index = pd.CategoricalIndex(df_2.index,['less_than_5', 'between_5_and_15', 'between_15_and_30', 'greater_than_30'])
df_2 = df_2.sort_index()
means_df = df_2.groupby(df_2.index).mean()
means_df
Out[47]:
In [48]:
means_df.Score.plot(kind = 'line', title = "Average Score")
Out[48]:
In [49]:
means_df.Total_Playtime_hrs.plot(kind='barh', title = "Average Playtime (in Hours)")
Out[49]:
In [50]:
means_df.Number_of_Owners.plot(kind='bar', title = "Average Number of Copies Sold")
Out[50]:
Conclusions:
First, it is important to note that our Price Buckets were different sizes: 56 games cost less than $5; 125 cost between $5 and $15; 58 cost between $15 and $30; and 12 games cost more than $30. In other words, about 72% of the games in our analysis were under $15.
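These counts can be read straight off the bucketed dataframe:

# number of games in each bucket (df_2 is indexed by Price_Bucket at this point)
df_2.groupby(level=0).size()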
That said, we found that there was indeed a correlation between price and score: the most expensive games (i.e. greater than $30) had the highest mean score, and the games in the least expensive bucket (i.e. less than $5) had the lowest. We were surprised to discover, however, that the relationship was not wholly linear, as games costing between $5 and $15 had a higher mean score than those costing between $15 and $30. Further analysis suggested an explanation: a separate pricing model for lower-cost games, whereby additional content is sold piecemeal within the game. The lower initial entry price serves to increase copies sold (exposure) and exploit a network effect, while purchasing all of the content would place the game's total cost near or above the $30+ price range.
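One quick way to check the overall price-score relationship directly is a correlation over the ungrouped data (a sketch using the df_2_reset copy saved above, not an analysis we report on here):

# Pearson correlation between price and score across all 251 games
df_2_reset[['Price_USD', 'Score']].corr()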
The games in the most expensive price bucket also had the highest average total playtime (8.63 hours), almost double that of the next highest group, the $15 ≤ x < $30 bucket (4.41 hours). However, we cannot determine from the data whether the most expensive games are played for more hours because they are better games, or whether users feel obligated to spend more time on games they have invested more money in. We would presume the former, but the data are not conclusive. We also noticed a bump in playtime for games below the $5 price point. We believe this is because a low price opens the product to a new segment of users with different values from those who participate in the $5-$30+ range: users who are less motivated by product quality, or who, as mentioned earlier, find value in the product and contribute to its success via the network effect but cannot afford more than the initial buy-in.
Finally, the average number of copies sold for games under $5 was two million, while none of the other price buckets had a mean above 700,000. Thus we concluded that a game priced under $5 would sell many copies, but a game was no more likely to sell copies at $10 than at $30+. So while prices between $5 and $30 roughly track quality in our data, games priced below $5 show a much wider range of quality: they rely on in-application purchases and the network effect to drive revenue, so the initial buy-in price is not indicative of the product's value.