Ch10 Figure2


In [1]:
# That was exactly what happened with this team. Every few weeks they produced dozens of reports with small, uninteresting conclusions: people who bought paint were more likely to buy in the morning, people who made large purchases were more likely to buy an appliance, and customers who bought carpet were more likely to buy it on Friday.

import random as rd  # assumed: rd.randint is called below with an inclusive upper bound, as in the standard library
import pandas as pd

items = ['toaster', 'microwave', 'fan', 'cookware set', 'carpet', 'paint', 'decorations']
wkdays = ['Sun','Mon', 'Tue', 'Wed', 'Thur', 'Fri', 'Sat']
times = [0,1,2,3]
data = []

for i in range(1000):
    item = items[rd.randint(0, len(items)-1)]
    dollars = 10
    wkday = wkdays[rd.randint(0,len(wkdays)-1)]
    time = times[rd.randint(0,len(times)-1)]
    
    if item in items[:5]:
        if item == 'cookware set':
            dollars += 100*rd.random()
        elif item == 'toaster':
            dollars += 20*rd.random()
        elif item == 'microwave':
            dollars += 30*rd.random()
        else:
            dollars += 100*rd.random()
    elif item == 'paint':
        dollars += 10*rd.random()
        if rd.random() <= .6:
            time = times[0]
    else:
        dollars += 10*rd.random()
        if rd.random() <= .4:
            wkday = wkdays[5]  # 'Fri'
    data.append([i, item, round(dollars,2), wkday, time])

df = pd.DataFrame(data, columns = ['id', 'item', 'dollars', 'wkday', 'time'])
# df.to_csv('csv_output/ch10_fig2.csv', index=False)
df = pd.read_csv('csv_output/ch10_fig2.csv')
df.head()


Out[1]:
   id          item  dollars wkday  time
0   0  cookware set    25.71   Sat     2
1   1        carpet    20.38   Mon     2
2   2     microwave    18.60   Fri     3
3   3  cookware set    65.93   Fri     1
4   4           fan    32.31   Sat     3

In [15]:
items = ['toaster', 'microwave', 'fan', 'cookware set', 'carpet', 'paint', 'decorations']
wkdays = ['Sun','Mon', 'Tue', 'Wed', 'Thur', 'Fri', 'Sat']
times = [0,1,2,3]

df = pd.read_csv('csv_output/ch10_fig2.csv')
d2 = df.groupby(['item', 'wkday']).agg({'dollars': 'sum', 'time': 'mean'}).reset_index()

# Only the bokeh names this cell actually uses
from bokeh.models import ColumnDataSource
from bokeh.plotting import figure, show
from bokeh.io import output_notebook

source = ColumnDataSource(
    data=dict(
        item=list(d2['item']),
        days=[str(x) for x in d2['wkday']],
        # Scaled values stay numeric so they can drive the glyph height and alpha.
        # 2.11 and 3169.57 are dataset-specific scaling constants.
        times=[x/2.11 for x in d2['time']],
        count=[x/3169.57 for x in d2['dollars']],
        count_color=[(0,0,x/400) for x in d2['dollars']]
    )
)

p = figure(title="dollars spent by item and day of the week (shade: time of day)", tools="save",
           x_range=wkdays, y_range=items)

p.rect(x="days", y="item", width=0.9, height='count', source=source,
       fill_alpha='times', color='blue');

# Uncomment to show the plot
# output_notebook()
# show(p)

The larger the rectangle, the more dollars were spent on that item on that day of the week. The darker the color, the later in the day the sales tended to occur. For paint and decorations, total sales are quite small compared with carpet, cookware, or fans. Paint is mostly bought in the morning. Carpet has slightly higher sales on Friday, and cookware has slightly higher sales on Monday.
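
The readings above can also be checked numerically rather than by eye. The following is a minimal sketch, not part of the original notebook: `summarize_sales` is a hypothetical helper, the sample rows are made up for illustration, and it assumes `time == 0` marks the morning slot, as the data-generation cell suggests.

```python
import pandas as pd

def summarize_sales(df):
    """Return total dollars per item and weekday, and the morning share per item.

    Hypothetical helper; assumes columns 'item', 'dollars', 'wkday', 'time'
    and that time == 0 is the morning slot.
    """
    # Total dollars for each (item, weekday) pair; missing pairs become 0
    totals = df.pivot_table(index='item', columns='wkday', values='dollars',
                            aggfunc='sum', fill_value=0)
    # Fraction of each item's purchases that fall in the morning slot
    morning_share = (df['time'] == 0).groupby(df['item']).mean()
    return totals, morning_share

# Made-up rows for illustration only (not taken from ch10_fig2.csv)
sample = pd.DataFrame({
    'item':    ['paint', 'paint', 'paint', 'carpet', 'carpet'],
    'dollars': [12.0, 15.0, 11.0, 80.0, 60.0],
    'wkday':   ['Mon', 'Mon', 'Fri', 'Fri', 'Fri'],
    'time':    [0, 0, 2, 1, 3],
})

totals, morning_share = summarize_sales(sample)
print(totals)
print(morning_share)
```

Calling `summarize_sales(df)` on the DataFrame loaded from `csv_output/ch10_fig2.csv` should give the same kind of tables for the real data, which makes it easier to judge whether the Friday and Monday differences are large enough to matter.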