In [26]:
import numpy as np
from changepoint.mean_shift_model import MeanShiftModel
ts = np.concatenate([np.random.normal(0, 0.1, 10), np.random.normal(1, 0.1, 10)])
model = MeanShiftModel()
stats_ts, pvals, nums = model.detect_mean_shift(ts, B=10000)
In [27]:
%matplotlib inline
In [28]:
import pylab as pl
In [29]:
pl.plot(ts)
Out[29]:
In [30]:
pl.plot(stats_ts)
Out[30]:
In [31]:
pvals
Out[31]:
In [37]:
np.argmin(pvals)
Out[37]:
In [43]:
np.where(np.array(stats_ts)>1.0)
Out[43]:
One strategy for choosing a change point is to pick a point that has both a low p-value and a large enough effect size. A change point depends on two things: (a) effect size and (b) significance (p-value). Very small effect sizes can also be statistically significant, which is why both criteria are needed to arrive at an estimate. Here I used an effect-size threshold of 1.0 (this depends on your data) and a significance level of 0.05.
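The two criteria can be combined into a single filter. The sketch below is only an illustration: `select_change_points` is a hypothetical helper, not part of the `changepoint` package, and the sample values are made up rather than taken from `detect_mean_shift`. It assumes the statistics and p-values are aligned index by index, as `stats_ts` and `pvals` are used in the cells above.

```python
import numpy as np


def select_change_points(stats, pvals, effect_threshold=1.0, alpha=0.05):
    """Return indices that pass both the effect-size and significance checks.

    Hypothetical helper for illustration; the thresholds default to the
    values used in this notebook (1.0 and 0.05) and should be tuned to
    your data.
    """
    stats = np.asarray(stats, dtype=float)
    pvals = np.asarray(pvals, dtype=float)
    if stats.shape != pvals.shape:
        raise ValueError("stats and pvals must have the same shape")
    # Keep a point only if the effect is large AND the p-value is small.
    mask = (stats > effect_threshold) & (pvals < alpha)
    return np.flatnonzero(mask)


# Made-up values for illustration only (not real detector output).
example_stats = [0.2, 0.4, 1.6, 1.9, 0.3]
example_pvals = [0.60, 0.01, 0.20, 0.001, 0.03]

candidates = select_change_points(example_stats, example_pvals)
print(candidates)
```

With real output, the same call would be `select_change_points(stats_ts, pvals)`. In the made-up data, index 1 is significant but has a small effect and index 2 has a large effect but is not significant, so neither is selected.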
In [44]:
import numpy as np
from changepoint.mean_shift_model import MeanShiftModel
ts = np.concatenate([np.random.normal(0, 0.1, 10), np.random.normal(0, 0.1, 10)])
model = MeanShiftModel()
stats_ts, pvals, nums = model.detect_mean_shift(ts, B=10000)
In [45]:
pl.plot(ts)
Out[45]:
In [46]:
pvals
Out[46]:
In [47]:
np.where(np.array(pvals)<0.05)
Out[47]:
In [48]:
np.where(np.array(stats_ts)>1.0)
Out[48]:
Note that here no point is significant, since all p-values are greater than 0.05.
Please cite the below if you use this package:
@inproceedings{Kulkarni:langchange,
  author    = {Kulkarni, Vivek and Al-Rfou, Rami and Perozzi, Bryan and Skiena, Steven},
  title     = {Statistically Significant Detection of Linguistic Change},
  booktitle = {Proceedings of the 24th International World Wide Web Conference},
  series    = {WWW '15},
  year      = {2015},
  location  = {Florence, Italy},
  numpages  = {11},
}
Other references: http://viveksck.github.io/langchangetrack/ and http://www.variation.com/cpa/tech/changepoint.html