Think Bayes

Copyright 2018 Allen B. Downey

MIT License: https://opensource.org/licenses/MIT


In [1]:
# Configure Jupyter so figures appear in the notebook
%matplotlib inline

# Configure Jupyter to display the assigned value after an assignment
%config InteractiveShell.ast_node_interactivity='last_expr_or_assign'

import numpy as np
import pandas as pd

from scipy.stats import poisson

# import classes from thinkbayes2
from thinkbayes2 import Pmf, Cdf, Suite, Joint

import thinkbayes2
import thinkplot

import pymc3 as pm
import theano.tensor as T



Fake data


In [2]:
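n = 60       # total number of observations
t1 = 30      # number of observations before the change
t2 = n-t1    # number of observations after the change
lam1 = 4     # Poisson rate before the change
lam2 = 2     # Poisson rate after the change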


Out[2]:
2

In [3]:
before = poisson(lam1).rvs(t1)


Out[3]:
array([4, 4, 2, 6, 4, 4, 0, 5, 6, 5, 3, 5, 6, 1, 3, 3, 6, 3, 2, 5, 2, 9,
       2, 6, 5, 5, 6, 2, 1, 4])

In [4]:
after = poisson(lam2).rvs(t2)


Out[4]:
array([ 1,  5,  1,  1,  0,  4,  1,  3,  1,  2,  1,  1,  2,  2,  1,  3,  2,
        2,  3,  1,  0, 11,  1,  4,  3,  3,  1,  1,  0,  3])

In [5]:
data = np.concatenate([before, after])


Out[5]:
array([ 4,  4,  2,  6,  4,  4,  0,  5,  6,  5,  3,  5,  6,  1,  3,  3,  6,
        3,  2,  5,  2,  9,  2,  6,  5,  5,  6,  2,  1,  4,  1,  5,  1,  1,
        0,  4,  1,  3,  1,  2,  1,  1,  2,  2,  1,  3,  2,  2,  3,  1,  0,
       11,  1,  4,  3,  3,  1,  1,  0,  3])

Grid algorithm


In [6]:
class Change(Suite, Joint):
    
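    def Likelihood(self, data, hypo):
        """Likelihood of the data under a hypothesis.
        
        data: array of counts
        hypo: tuple of (t, lam1, lam2): the change point and the
              Poisson rates before and after it
        """
        # FILL THIS IN
        return 1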
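
One way to approach the grid algorithm, independent of the `Suite` and `Joint` classes, is sketched here with plain NumPy. It is not the notebook's solution. The function name `grid_posterior`, the uniform prior, and the grid ranges are all assumptions made for illustration. Because the `log(k!)` terms of the Poisson log-likelihood are the same for every hypothesis, the sketch drops them and normalizes at the end.

```python
import numpy as np

def grid_posterior(data, ts, lams):
    """Posterior over (t, lam1, lam2) on a grid, assuming a uniform prior.

    data: array of counts
    ts: candidate change points (number of observations before the change)
    lams: candidate Poisson rates, used for both lam1 and lam2
    Returns an array of shape (len(ts), len(lams), len(lams)) that sums to 1.
    """
    data = np.asarray(data)
    n = len(data)
    csum = np.concatenate([[0], np.cumsum(data)])
    k1 = csum[ts]            # total count in the first t observations
    k2 = csum[-1] - k1       # total count in the remaining n - t observations

    # Poisson log-likelihood of each segment, up to an additive constant
    ll1 = np.outer(k1, np.log(lams)) - np.outer(ts, lams)
    ll2 = np.outer(k2, np.log(lams)) - np.outer(n - ts, lams)

    # Combine into a 3-D grid indexed by (t, lam1, lam2)
    log_post = ll1[:, :, None] + ll2[:, None, :]
    post = np.exp(log_post - log_post.max())   # subtract the max to avoid underflow
    return post / post.sum()

# Simulated data in the style of the cells above (the seed is arbitrary)
rng = np.random.default_rng(17)
data = np.concatenate([rng.poisson(4, 30), rng.poisson(2, 30)])

ts = np.arange(1, len(data))
lams = np.linspace(0.5, 8, 31)
post = grid_posterior(data, ts, lams)

# Marginal posterior of the change point
marginal_t = post.sum(axis=(1, 2))
print(ts[marginal_t.argmax()])
```

Working in log space and using cumulative sums keeps the computation to a few array operations. The same arithmetic could go inside `Likelihood` by exponentiating the log-likelihood of a single hypothesis.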

In [ ]:


In [ ]:

MCMC

To implement this model in PyMC, see Chapter 1 of Bayesian Methods for Hackers and the related example in Computational Statistics in Python.
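
To make the idea of sampling this model concrete without depending on PyMC, here is a minimal hand-written Metropolis sampler in NumPy. It is a swapped-in technique, not the PyMC implementation from either reference. The flat priors, the proposal step sizes, and the names `log_posterior` and `metropolis` are assumptions chosen for illustration.

```python
import numpy as np

def log_posterior(csum, n, t, lam1, lam2):
    """Unnormalized log-posterior, assuming flat priors on t, lam1, and lam2."""
    if not (0 < t < n) or lam1 <= 0 or lam2 <= 0:
        return -np.inf
    k1 = csum[t]             # total count before the change
    k2 = csum[-1] - k1       # total count after the change
    return k1 * np.log(lam1) - t * lam1 + k2 * np.log(lam2) - (n - t) * lam2

def metropolis(data, n_iter=20000, seed=0):
    """Random-walk Metropolis over (t, lam1, lam2); returns an (n_iter, 3) array."""
    rng = np.random.default_rng(seed)
    data = np.asarray(data)
    n = len(data)
    csum = np.concatenate([[0], np.cumsum(data)])

    # Start in the middle of the series, with both rates at the sample mean
    state = (n // 2, data.mean(), data.mean())
    current = log_posterior(csum, n, *state)

    samples = np.empty((n_iter, 3))
    for i in range(n_iter):
        t, lam1, lam2 = state
        # Symmetric proposal: integer step for t, Gaussian steps for the rates
        proposal = (t + rng.integers(-3, 4),
                    lam1 + rng.normal(0, 0.3),
                    lam2 + rng.normal(0, 0.3))
        candidate = log_posterior(csum, n, *proposal)
        # Accept with probability min(1, posterior ratio)
        if np.log(rng.random()) < candidate - current:
            state, current = proposal, candidate
        samples[i] = state
    return samples

# Simulated data in the style of the cells above (the seed is arbitrary)
rng = np.random.default_rng(1)
data = np.concatenate([rng.poisson(4, 30), rng.poisson(2, 30)])

samples = metropolis(data)
burned = samples[5000:]      # discard an initial burn-in (length is a guess)
print(burned.mean(axis=0))   # posterior means of t, lam1, lam2
```

A library such as PyMC handles step-size tuning, convergence diagnostics, and discrete variables more carefully. This sketch only shows the accept/reject logic on the same likelihood.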


In [7]:
stop


---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-7-4f76a9dad686> in <module>()
----> 1 stop

NameError: name 'stop' is not defined

Real data

The cells below use real data, based on the Baltimore Sun's 2018 shootings analysis.


In [ ]:
# !wget https://raw.githubusercontent.com/baltimore-sun-data/2018-shootings-analysis/master/BPD_Part_1_Victim_Based_Crime_Data.csv

In [ ]:
df = pd.read_csv('BPD_Part_1_Victim_Based_Crime_Data.csv', parse_dates=[0])
df.head()

In [ ]:
df.shape

In [ ]:
shootings = df[df.Description.isin(['HOMICIDE', 'SHOOTING']) & (df.Weapon == 'FIREARM')]
shootings.shape

In [ ]:
grouped = shootings.groupby('CrimeDate')

In [ ]:
counts = grouped['Total Incidents'].sum()
counts.head()

In [ ]:
index = pd.date_range(counts.index[0], counts.index[-1])

In [ ]:
counts = counts.reindex(index, fill_value=0)
counts.head()
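
The reindexing step matters because days with no recorded incidents have no rows in the grouped data, so they would otherwise be missing from the series rather than counted as zero. The small example below illustrates the effect on made-up values. The dates and counts are hypothetical and are not taken from the Baltimore data.

```python
import pandas as pd

# Hypothetical daily totals with a gap: no rows for 2018-01-02 or 2018-01-03
toy = pd.Series([2, 1, 3],
                index=pd.to_datetime(['2018-01-01', '2018-01-04', '2018-01-05']))

# Build an index with one entry per calendar day, then fill missing days with 0
full_index = pd.date_range(toy.index[0], toy.index[-1])
filled = toy.reindex(full_index, fill_value=0)
print(filled)
```

This relies on the index being sorted by date, which holds for the grouped counts because `groupby` sorts its keys by default.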

In [ ]:
counts.plot()
thinkplot.decorate(xlabel='Date',
                   ylabel='Number of shootings')