Analyzing ShopStyle ties

This notebook shows how to build a pandas DataFrame from data pulled from the ShopStyle API.

To use the Shopsense API, you first need to sign up for an API key at http://shopsense.shopstyle.com/landing.

To understand the structure of the data the API returns, take a look at their documentation.

I saved my API key in mykeys.py.


In [1]:
import mykeys

In [2]:
import urllib2
import json

In [3]:
url = "http://api.shopstyle.com/api/v2/"
ties = "{}products?pid={}&cat=mens-ties&limit=100".format(url, mykeys.apiKey)
jsonResponse = urllib2.urlopen(ties)
data = json.load(jsonResponse)
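The query string above is assembled by hand with format(). As a minimal sketch, assuming the same parameter names used in the cell above (pid, cat, limit, offset, sort), the URL could also be built with urlencode so that values are escaped automatically. The helper name products_url and its defaults are hypothetical, not part of the ShopStyle API.

```python
try:
    from urllib.parse import urlencode  # Python 3
except ImportError:
    from urllib import urlencode        # Python 2

BASE_URL = "http://api.shopstyle.com/api/v2/"

def products_url(api_key, category, limit=50, offset=0, sort=None):
    """Build a products query URL (hypothetical helper, not an API call)."""
    params = [("pid", api_key), ("cat", category),
              ("limit", limit), ("offset", offset)]
    if sort is not None:
        params.append(("sort", sort))
    # urlencode escapes each value and joins the pairs with '&'
    return BASE_URL + "products?" + urlencode(params)
```

Calling products_url(mykeys.apiKey, "mens-ties", limit=100) would give a URL equivalent to the one in the cell above, with an explicit offset=0 added.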

The response metadata tells us how many products match in total and how many come back per request. We need both numbers to page through the full result set.


In [4]:
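total = data['metadata']['total']    # number of matching products
limit = data['metadata']['limit']    # page size the API actually returned
offset = data['metadata']['offset']  # offset of this first page
pages = (total / limit)              # integer division under Python 2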

print "{} total, {} per page. {} pages to process".format(total, limit, pages)


13131 total, 50 per page. 262 pages to process
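The integer division above yields 262, and the final partial page is only picked up because the loop that follows iterates over range(pages+1). As a small sketch of the same arithmetic, the offsets can be generated directly from the total and the page size. The function name page_offsets is hypothetical.

```python
def page_offsets(total, limit):
    """Return the starting offset of every page needed to cover `total` items."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    # Step through the result set one page at a time; the last page may be partial
    return list(range(0, total, limit))

offsets = page_offsets(13131, 50)
```

With the numbers printed above, this produces 263 offsets, the last one being 13100, which covers the remaining 31 products.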

In [5]:
import pandas as pd

dfs = {}

# Fetch one page per iteration. The offset advances by the page size the
# API reported (limit), not by the limit=100 requested in the URL.
for page in range(pages+1):
    allTies = "{}products?pid={}&cat=mens-ties&limit=100&offset={}&sort=popular".format(
        url, mykeys.apiKey, page*limit)
    jsonResponse = urllib2.urlopen(allTies)
    data = json.load(jsonResponse)
    dfs[page] = pd.DataFrame(data['products'])

# Stack the per-page frames into a single DataFrame
df = pd.concat(dfs, ignore_index=True)

In [6]:
# Clean the records: drop duplicate products, then strip the dollar sign
# from the price label and convert it to a float
df = df.drop_duplicates('id')
df['priceLabel'] = df['priceLabel'].str.replace('$', '')
df['priceLabel'] = df['priceLabel'].astype(float)
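One caveat with the cell above: '$' is also the end-of-string anchor in regular expressions, so in a pandas version that treats the pattern as a regex the replace may not remove anything, and a label containing a thousands separator would fail the float conversion. Here is a plain-Python sketch of a more defensive conversion. The function name parse_price and the example label formats are assumptions, not taken from the API documentation.

```python
import re

def parse_price(label):
    """Convert a label such as "$1,295.00" to a float, or None if it has no number."""
    if label is None:
        return None
    # Keep only digits and the decimal point (drops '$', ',' and whitespace)
    cleaned = re.sub(r"[^0-9.]", "", str(label))
    try:
        return float(cleaned)
    except ValueError:
        return None
```

It could be applied with df['priceLabel'].map(parse_price) in place of the replace and astype steps.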

In [7]:
df.dtypes


Out[7]:
alternateImages         object
badges                  object
brand                   object
brandedName             object
categories              object
clickUrl                object
colors                  object
currency                object
description             object
extractDate             object
id                       int64
image                   object
inStock                   bool
locale                  object
maxPrice               float64
maxPriceLabel           object
maxSalePrice           float64
maxSalePriceLabel       object
name                    object
pageUrl                 object
price                  float64
priceLabel             float64
priceRangeLabel         object
retailer                object
salePrice              float64
salePriceLabel          object
salePriceRangeLabel     object
seeMoreLabel            object
seeMoreUrl              object
sizes                   object
unbrandedName           object
dtype: object

In [8]:
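# Split the brand dict into two columns: brandId and brandName

def breakId(x, y=0):
    # Return the brand id, or the default if brand is missing or malformed
    try:
        y = x["id"]
    except (TypeError, KeyError):
        pass
    return y

def breakName(x, y=""):
    # Return the brand name, or the default if brand is missing or malformed
    try:
        y = x["name"]
    except (TypeError, KeyError):
        pass
    return y

df['brandId'] = df['brand'].map(breakId)
df['brandName'] = df['brand'].map(breakName)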

In [9]:
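# Pull three fields out of the first entry in the colors list

def breakCanC(x, y=""):
    # Name of the first canonical color (the color family)
    try:
        y = x[0]["canonicalColors"][0]["name"]
    except (TypeError, KeyError, IndexError):
        pass
    return y

def breakColorName(x, y=""):
    # Retailer's own name for the color
    try:
        y = x[0]["name"]
    except (TypeError, KeyError, IndexError):
        pass
    return y

def breakColorId(x, y=""):
    # Id of the first canonical color
    try:
        y = x[0]["canonicalColors"][0]["id"]
    except (TypeError, KeyError, IndexError):
        pass
    return y

df['colorId'] = df['colors'].map(breakColorId)
df['colorFamily'] = df['colors'].map(breakCanC)
df['colorNamed'] = df['colors'].map(breakColorName)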
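The five helpers above all follow the same pattern: walk into a nested dict or list and fall back to a default when a step is missing. A minimal sketch of one generic helper that covers all of them is shown here. The name dig is hypothetical, and the sample record is made up to mirror the shape of the colors column.

```python
def dig(value, path, default=""):
    """Follow `path` (dict keys and list indexes) into nested data.

    Returns `default` as soon as any step is missing or not subscriptable.
    """
    for step in path:
        try:
            value = value[step]
        except (KeyError, IndexError, TypeError):
            return default
    return value

# Made-up record shaped like one entry of the colors column
colors = [{"name": "Charcoal",
           "canonicalColors": [{"id": "14", "name": "Gray"}]}]
family = dig(colors, [0, "canonicalColors", 0, "name"])
```

With this helper, a column such as colorFamily could be filled with df['colors'].map(lambda c: dig(c, [0, "canonicalColors", 0, "name"])).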

In [10]:
df.head()


Out[10]:
alternateImages badges brand brandedName categories clickUrl colors currency description extractDate ... salePriceRangeLabel seeMoreLabel seeMoreUrl sizes unbrandedName brandId brandName colorId colorFamily colorNamed
0 [{u'id': u'aeb581bbc9337dbdb39125e9a3ca9cd1', ... [] {u'id': u'1041', u'name': u'David Donahue'} David Donahue Woven Silk & Cotton Tie [{u'shortName': u'Ties', u'localizedId': u'men... http://www.shopstyle.com/action/apiVisitRetail... [{u'canonicalColors': [{u'id': u'14', u'name':... USD Sharp floral medallions mark a handsome tie cu... 2014-09-08 ... NaN David Donahue Ties http://www.shopstyle.com/browse/mens-ties/Davi... [{u'name': u'Regular'}] Woven Silk & Cotton Tie 1041 David Donahue 14 Gray Charcoal
1 [{u'id': u'0ee4a1e11421f9383dbff62432555d1c', ... [] {u'id': u'870', u'name': u'Burberry'} Burberry Two-Tone Herringbone Silk Tie [{u'shortName': u'Ties', u'localizedId': u'men... http://www.shopstyle.com/action/apiVisitRetail... [{u'canonicalColors': [{u'id': u'8', u'name': ... USD <ul> <li>A skinny cut hand-finished silk tie w... 2014-08-30 ... NaN Burberry Ties http://www.shopstyle.com/browse/mens-ties/Burb... [] Two-Tone Herringbone Silk Tie 870 Burberry 8 Purple PURPLE
2 [] [] {u'id': u'2861', u'name': u'Drakes'} Drakes Drake's® wool tie [{u'shortName': u'Ties', u'localizedId': u'men... http://www.shopstyle.com/action/apiVisitRetail... [{u'canonicalColors': [{u'id': u'14', u'name':... USD This wool tie was handcrafted by the esteemed ... 2014-08-15 ... NaN Drakes Ties http://www.shopstyle.com/browse/mens-ties/Drak... [{u'name': u'ONE SIZE'}] Drake's® wool tie 2861 Drakes 14 Gray grey
3 [] [] {u'id': u'1948', u'name': u'Salvatore Ferragamo'} Salvatore Ferragamo plum floral and dog printe... [{u'shortName': u'Ties', u'localizedId': u'men... http://www.shopstyle.com/action/apiVisitRetail... [{u'canonicalColors': [{u'id': u'8', u'name': ... USD plum floral and dog printed 'Maltese' silk tie... 2014-09-01 ... NaN Salvatore Ferragamo Ties http://www.shopstyle.com/browse/mens-ties/Salv... [] plum floral and dog printed 'Maltese' silk tie 1948 Salvatore Ferragamo 8 Purple Plum / Coral
4 [{u'id': u'd037b00b5b197a7059d07ec78912b859', ... [] {u'id': u'580', u'name': u'Theory'} Theory Coupe Tie in Fordingbridge [{u'shortName': u'Ties', u'localizedId': u'men... http://www.shopstyle.com/action/apiVisitRetail... [{u'canonicalColors': [{u'id': u'10', u'name':... USD Our new staple for season-round suiting, the C... 2014-08-11 ... NaN Theory Ties http://www.shopstyle.com/browse/mens-ties/Theo... [{u'name': u'1SZ'}] Coupe Tie in Fordingbridge 580 Theory 10 Blue COTE

5 rows × 36 columns

Next, I save the relevant columns to a tab-separated values (.tsv) file so the data is easier to work with locally. Reading it back from disk is much quicker than waiting on the API.


In [11]:
df.to_csv("tieColors_cleaned.txt", sep='\t', encoding='utf-8', 
          columns=['id', 'priceLabel', 'name','brandId', 'brandName', 'colorId', 'colorFamily', 'colorNamed'])

Means, Min, Max

Alright! On to the fun stuff ^_^


In [13]:
import pandas

def openWithPandas(filename):
    tdf = pandas.read_table(filename, sep='\t')
    return tdf

In [14]:
df = openWithPandas('tieColors_cleaned.txt')
df.dtypes


Out[14]:
Unnamed: 0       int64
id               int64
priceLabel     float64
name            object
brandId          int64
brandName       object
colorId        float64
colorFamily     object
colorNamed      object
dtype: object

Grouping by color


In [16]:
bycolor = df.groupby('colorFamily')
byColorSummary = bycolor['priceLabel'].describe()
byColorSummary


Out[16]:
colorFamily       
Beige        count     29.000000
             mean      93.919310
             std       42.969727
             min       29.000000
             25%       61.000000
             50%       85.000000
             75%      128.000000
             max      195.000000
Black        count    773.000000
             mean     105.982238
             std       78.755806
             min        7.900000
             25%       55.000000
             50%       75.000000
             75%      150.000000
...
White        mean     130.888947
             std       69.360847
             min       17.000000
             25%       68.875000
             50%      145.000000
             75%      195.000000
             max      275.000000
Yellow       count     53.000000
             mean     152.081132
             std       66.710977
             min       36.500000
             25%       95.000000
             50%      155.000000
             75%      195.000000
             max      295.000000
Length: 112, dtype: float64

In [17]:
%pylab inline

bycolor = df.groupby('colorFamily')
p1 = bycolor['priceLabel'].mean().order()

plot1 = p1.plot(kind='bar',  figsize=(20, 10), title="Average Prices By Color of Ties", color='grey')
plot1.set_ylabel("Average Price ($)")
plot1.set_xlabel("Tie Color")
plt.savefig('colors.png', bbox_inches='tight')


Populating the interactive namespace from numpy and matplotlib

In [18]:
p1


Out[18]:
colorFamily
Gold            85.071852
Beige           93.919310
Silver          95.931364
Black          105.982238
Purple         110.520839
Green          110.581859
Pink           123.400313
Blue           125.310871
White          130.888947
Gray           134.296691
Red            135.302156
Brown          139.275105
Orange         141.330120
Yellow         152.081132
Name: priceLabel, dtype: float64

In [19]:
p2 = bycolor['priceLabel'].agg([np.max, np.mean]).sort('mean')
p2


Out[19]:
amax mean
colorFamily
Gold 235.00 85.071852
Beige 195.00 93.919310
Silver 215.00 95.931364
Black 700.00 105.982238
Purple 295.00 110.520839
Green 295.00 110.581859
Pink 375.00 123.400313
Blue 665.00 125.310871
White 275.00 130.888947
Gray 473.00 134.296691
Red 459.83 135.302156
Brown 665.00 139.275105
Orange 295.00 141.330120
Yellow 295.00 152.081132

In [20]:
plot2 = p2.plot(kind='bar',  figsize=(20, 10), title="Price info By Color of Ties")
plot2.set_ylabel("Price ($)")
plot2.set_xlabel("Tie Color")
plt.savefig('color-stats.png')



In [21]:
p2 = bycolor['priceLabel'].agg(['count', np.mean, np.std, np.min, np.max]).sort('count')
p2


Out[21]:
count mean std amin amax
colorFamily
Silver 22 95.931364 67.318273 15.99 215.00
Gold 27 85.071852 52.375583 15.99 235.00
Beige 29 93.919310 42.969727 29.00 195.00
Yellow 53 152.081132 66.710977 36.50 295.00
White 76 130.888947 69.360847 17.00 275.00
Orange 83 141.330120 85.815344 18.00 295.00
Pink 128 123.400313 71.052435 20.00 375.00
Brown 190 139.275105 84.477037 17.00 665.00
Green 199 110.581859 67.792116 19.00 295.00
Gray 272 134.296691 73.303216 16.00 473.00
Purple 298 110.520839 65.817954 20.00 295.00
Red 501 135.302156 77.169222 3.95 459.83
Black 773 105.982238 78.755806 7.90 700.00
Blue 1826 125.310871 74.663597 8.80 665.00
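To make clear what the groupby and agg calls above compute, here is a plain-Python sketch of the same idea: bucket the rows by a key, then report count, mean, min and max for each bucket. The function name summarize and the sample rows are made up for illustration and are not taken from the ShopStyle data.

```python
from collections import defaultdict

def summarize(rows, key, value):
    """Group dict rows by `key` and report count, mean, min, max of `value`."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row[value])
    return {
        name: {"count": len(vals),
               "mean": sum(vals) / float(len(vals)),
               "min": min(vals),
               "max": max(vals)}
        for name, vals in groups.items()
    }

# Made-up rows for illustration only
sample = [
    {"colorFamily": "Blue", "priceLabel": 100.0},
    {"colorFamily": "Blue", "priceLabel": 50.0},
    {"colorFamily": "Red", "priceLabel": 80.0},
]
stats = summarize(sample, "colorFamily", "priceLabel")
```

pandas does the same bucketing internally, which is why every row of the summary table is keyed by a colorFamily value.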

Grouping ties by brand, tallying colors, and describing prices


In [22]:
byBrand = df.groupby('brandName')
bb2 = byBrand['priceLabel'].agg(['count', np.mean, np.min, np.max])

In [23]:
# Brands averaging $250 or more, sorted by mean price in descending order
p4 = bb2[bb2['mean']>=250].sort('mean', ascending=False)
p4


Out[23]:
count mean amin amax
brandName
Christian Dior 6 414.666667 158.00 665.00
Yohji Yamamoto 5 369.220000 239.72 459.83
Kiton 103 295.679612 295.00 365.00
Stefano Ricci 44 286.590909 215.00 700.00
Brunello Cucinelli 22 260.107273 192.15 295.00
Jupe 1 260.000000 260.00 260.00
Boglioli 1 250.000000 250.00 250.00

In [24]:
p4.plot(kind='bar',  figsize=(12, 10), title="Prices of Luxury Ties")


Out[24]:
<matplotlib.axes._subplots.AxesSubplot at 0x11ee24e50>

In [38]:
LuxuryBrandIds = [957, 2258, 1452, 3297, 8296, 29961, 14635]
luxuryBrandTies = df[df['brandId'].isin(LuxuryBrandIds)]
luxuryBrandTies


Out[38]:
Unnamed: 0 id priceLabel name brandId brandName colorId colorFamily colorNamed
55 55 458283715 295.00 Kiton Donegal Tweed Tie, Brown 1452 Kiton 1 Brown BROWN
91 91 458283723 295.00 Kiton Multi-Stripe Silk Tie, Brown/Teal 1452 Kiton 10 Blue BRN W TEAL
97 97 458916530 225.21 Brunello Cucinelli woven tie 8296 Brunello Cucinelli 14 Gray GREY
124 124 458283728 295.00 Kiton Awning-Stripe Wool/Silk Tie, Brown 1452 Kiton 1 Brown BROWN
191 191 458430699 343.36 Yohji Yamamoto embroidered skull tie 2258 Yohji Yamamoto 16 Black BLACK
198 198 458283727 295.00 Kiton Burnout-Paisley Silk Tie, Blue 1452 Kiton 10 Blue BLUE
235 235 458283725 295.00 Kiton Neon-Neat Silk Tie, Gray 1452 Kiton 14 Gray GREY
264 264 458430891 459.83 Yohji Yamamoto camouflage print tie 2258 Yohji Yamamoto 7 Red RED
293 293 457115459 295.00 Kiton Mixed-Stripe Jacquard Neck Tie 1452 Kiton 10 Blue Blue
393 393 458430718 343.36 Yohji Yamamoto embroidered skull tie 2258 Yohji Yamamoto 7 Red RED
429 429 457661628 250.00 Stefano Ricci Neat Paisley Pattern Silk Tie, Blue 3297 Stefano Ricci 10 Blue BLUE 6
430 430 457017409 280.00 Brunello Cucinelli solid tie 8296 Brunello Cucinelli 1 Brown BROWN
440 440 457115456 295.00 Kiton Diamond Medallion Neck Tie 1452 Kiton 8 Purple Purple
571 571 457115479 365.00 Kiton Woven Cashmere Neck Tie 1452 Kiton 10 Blue Blue
583 583 457585879 245.00 Brunello Cucinelli pointed tie 8296 Brunello Cucinelli 14 Gray GREY
594 594 457115481 295.00 Kiton Medallion Repp Neck Tie 1452 Kiton 8 Purple Purple
602 602 456799492 295.00 Kiton Shadow-Stripe Jacquard Silk Neck Tie 1452 Kiton 10 Blue Blue
615 615 458430848 459.83 Yohji Yamamoto camouflage print tie 2258 Yohji Yamamoto 10 Blue BLUE
636 636 457017632 280.00 Brunello Cucinelli solid tie 8296 Brunello Cucinelli 10 Blue BLUE
704 704 457899789 295.00 Kiton Printed Track-Stripe Tie, Coral 1452 Kiton 3 Orange CORAL
732 732 457017561 265.00 Brunello Cucinelli striped tie 8296 Brunello Cucinelli 1 Brown BROWN
743 743 457115450 295.00 Kiton Paisley Ribbed Neck Tie 1452 Kiton 8 Purple Purple
802 802 457402950 250.00 Stefano Ricci Floral Medallion Pattern Silk Ti... 3297 Stefano Ricci NaN NaN BRG 8
814 814 456999571 265.00 Brunello Cucinelli woven tie 8296 Brunello Cucinelli 1 Brown BROWN
821 821 457115485 295.00 Kiton Diamond Medallion Neck Tie 1452 Kiton 10 Blue Blue
857 857 456601937 240.00 Brunello Cucinelli Wool Bow Tie 8296 Brunello Cucinelli 10 Blue NAVY
894 894 456799507 295.00 Kiton Dot Jacquard Silk Neck Tie 1452 Kiton NaN NaN NaN
914 914 457115472 295.00 Kiton Medallion Jacquard Neck Tie 1452 Kiton 10 Blue Blue
928 928 456601500 285.00 Brunello Cucinelli Wool & Silk Dot Tie 8296 Brunello Cucinelli 14 Gray GREY
951 951 456999171 265.00 Brunello Cucinelli dotted tie 8296 Brunello Cucinelli 1 Brown BROWN
... ... ... ... ... ... ... ... ... ...
4095 4095 456873768 275.00 Stefano Ricci Paisley-Print Woven Silk Tie, Blue 3297 Stefano Ricci 10 Blue BLU 14
4131 4131 456839484 215.00 Stefano Ricci Micro-Medallion Silk Tie, Pink 3297 Stefano Ricci 17 Pink PINK
4151 4151 456839526 295.00 Kiton Striped Silk Tie, Red 1452 Kiton 7 Red RED
4164 4164 456839130 295.00 Kiton Melange Grosgrain-Stripe Tie, Green 1452 Kiton 13 Green GREEN
4171 4171 456840306 250.00 Stefano Ricci Rope Striped Silk Tie, Red/Blue 3297 Stefano Ricci 10 Blue RED/BLUE
4178 4178 456839847 295.00 Kiton Large-Paisley Textured Tie, Blue/Red 1452 Kiton 10 Blue BLUE/RED
4228 4228 456839884 275.00 Stefano Ricci Med-Circle-Medallion Silk Tie, R... 3297 Stefano Ricci 10 Blue RED/BLUE
4245 4245 456886692 295.00 Kiton Floral Medallion-Print Neck Tie 1452 Kiton 10 Blue Blue
4304 4304 456839992 275.00 Stefano Ricci Paisley Silk Tie, Red/Blue 3297 Stefano Ricci 10 Blue RED/BLUE
4387 4387 456869933 250.00 Stefano Ricci Neat Paisley Pattern Silk Tie, Pink 3297 Stefano Ricci 17 Pink PINK 8
4398 4398 456869745 295.00 Kiton Cashmere/Silk Woven Tie, Burgundy 1452 Kiton 7 Red BURGUNDY
4401 4401 456799802 295.00 Kiton Floral Medallion-Pattern Neck Tie 1452 Kiton 10 Blue Blue
4458 4458 456510392 295.00 Kiton Paisley Tie 1452 Kiton 10 Blue Blue
4476 4476 456840346 295.00 Kiton Flower-Neat Tie, Orange 1452 Kiton 3 Orange ORANGE
4481 4481 457115501 295.00 Kiton Medallion Jacquard Neck Tie 1452 Kiton 8 Purple Purple
4492 4492 456869677 295.00 Kiton Square Medallion Pattern Tie, Orange 1452 Kiton 3 Orange ORANGE
4506 4506 456870873 295.00 Kiton Woven Snowflake-Neat Tie, Red 1452 Kiton 7 Red RED
4583 4583 456870074 250.00 Stefano Ricci Medallion Silk Tie, Teal 3297 Stefano Ricci 10 Blue TEAL
4668 4668 456845255 295.00 Kiton Striped Silk Tie, Gray/Blue 1452 Kiton 10 Blue GRAY/BLUE
4681 4681 457115504 295.00 Kiton Medallion Repp Neck Tie 1452 Kiton 1 Brown Brown
4683 4683 457115505 295.00 Kiton Medallion Repp Neck Tie 1452 Kiton 1 Brown Brown
4701 4701 452869925 250.00 Stefano Ricci Neat Paisley Pattern Silk Tie, Pink 3297 Stefano Ricci 17 Pink PINK 8
4708 4708 456840022 250.00 Stefano Ricci Square Micro-Flower Silk Tie, Red 3297 Stefano Ricci 7 Red RED
4748 4748 457115506 295.00 Kiton Paisley Jacquard Neck Tie 1452 Kiton 10 Blue Blue
4890 4890 456498076 239.72 Yohji Yamamoto floral print tie 2258 Yohji Yamamoto 10 Blue BLUE
4912 4912 448851075 260.00 Jupe Faille French-Knot Double Bow Tie 29961 Jupe NaN NaN NaN
4989 4989 456839890 295.00 Kiton Wide Rope-Stripe Woven Tie, Navy/Brown 1452 Kiton 10 Blue NAVY/BROWN
4990 4990 453818215 295.00 Kiton Wool-Silk Track-Stripe Tie, Gray/Pink 1452 Kiton 17 Pink GRAY/PINK
5025 5025 452869918 275.00 Stefano Ricci Floral-Pattern Woven Silk Tie, B... 3297 Stefano Ricci 7 Red BURGUNDY
5039 5039 447160694 295.00 Kiton Woven Polka-Dot Silk Tie, Turquoise 1452 Kiton 10 Blue TURQ

182 rows × 9 columns


In [44]:
luxuryColors = luxuryBrandTies.groupby('colorFamily')
lc2 = luxuryColors['priceLabel'].agg(['count', np.mean, np.min, np.max])
lc2


Out[44]:
count mean amin amax
colorFamily
Black 8 388.920000 158.00 700.00
Blue 82 291.214024 215.00 665.00
Brown 17 305.882353 245.00 665.00
Gray 10 269.021000 195.00 295.00
Green 5 275.000000 225.00 295.00
Orange 9 295.000000 295.00 295.00
Pink 8 266.250000 215.00 295.00
Purple 8 289.375000 250.00 295.00
Red 24 291.472500 192.15 459.83
Yellow 2 295.000000 295.00 295.00

In [45]:
lc2.plot(kind='bar',  figsize=(12, 10), title="Prices")


Out[45]:
<matplotlib.axes._subplots.AxesSubplot at 0x12725fe50>
