In [56]:
from IPython.core.display import Image
Image(filename="lt5min.png", width=600)
Out[56]:
In [57]:
Image(filename="ExponentialDistExample.png", width=600)
Out[57]:
In the slide above, the rate (λ) is 1/10, which corresponds to an expected interval of 10 between events, and we want the probability that an event occurs by time 5, i.e. P(X <= 5).
Slide from: https://www.youtube.com/watch?v=IT-0oCOQrBY
A video with more graphics: https://www.youtube.com/watch?v=4BswLMKgXzU
More data with graphs: https://controls.engin.umich.edu/wiki/index.php/Continuous_Distributions:_normal_and_exponential
In the graph above, which shows the CDF of the exponential distribution, the probability that the event has occurred approaches 100% as the time x increases.
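The slide's example and the shape of the CDF can be reproduced with a few lines. This is a minimal sketch that assumes the standard exponential CDF, P(X <= x) = 1 - e^(-λx); the helper name exponential_cdf and the sample x values are illustrative and do not come from the slide.

```python
import math

def exponential_cdf(x, rate):
    """P(X <= x) for an exponential distribution with the given rate (lambda)."""
    return 1.0 - math.exp(-rate * x)

rate = 1.0 / 10.0                        # lambda = 1 / expected interval of 10
p_within_5 = exponential_cdf(5.0, rate)  # 1 - e^(-0.5), roughly 0.39
print('P(X <= 5):', p_within_5)

# The CDF climbs toward 1 as x grows (sample points chosen for illustration).
cdf_values = [(x, exponential_cdf(x, rate)) for x in (5, 10, 20, 50, 100)]
for x, p in cdf_values:
    print(x, round(p, 4))
```

So with an expected interval of 10, there is a bit less than a 40% chance of seeing the event within the first 5 time units.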
This formula is the CDF used by Cassandra's failure detector, which is based on http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.80.7427&rep=rep1&type=pdf
In that document, PHI at time NOW is defined as: PHI = -log10(p_later(Tnow - Tlast))
p_later is the probability that the next heartbeat will arrive later than the given time. It is defined as 1 - (1 - e ^ (-λx)), which simplifies to e ^ (-λx), where the rate λ = 1/MEAN and x is the time since the last heartbeat
The relationship between PHI and the likelihood of a false positive comes from the log10: each increase of 1 in PHI divides p_later by 10. At PHI = 1, p_later is 0.1, a 10% chance that we may still get a heartbeat. At PHI = 2, p_later is 0.01, a 1% chance that we may still get a heartbeat, etc.
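The mapping between PHI and p_later can be inverted directly. The sketch below assumes the exponential model described above (p_later = e^(-t/MEAN)); the function names and the PHI values 1, 2, 3 are hypothetical, chosen only to illustrate the relationship, and are not taken from Cassandra's code or configuration.

```python
import math

def p_later_from_phi(phi_value):
    """Invert PHI = -log10(p_later): the chance a heartbeat may still arrive."""
    return 10.0 ** (-phi_value)

def time_to_reach(phi_value, mean):
    """Time since the last heartbeat at which PHI reaches phi_value.

    Follows from PHI = -log10(e^(-t/mean)) = (t/mean) / ln(10).
    """
    return phi_value * mean * math.log(10.0)

MEAN = 1.0  # illustrative mean heartbeat interval
for phi_value in (1, 2, 3):
    print('phi:', phi_value,
          'p_later:', p_later_from_phi(phi_value),
          't:', time_to_reach(phi_value, MEAN))
```

Under this model, PHI grows linearly with the time since the last heartbeat, so each additional unit of PHI costs the same extra waiting time (MEAN * ln(10), about 2.3 mean intervals).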
In [8]:
import math
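MEAN = 1.0  # mean interval between heartbeats
t = 3.0     # time since the last heartbeat
# p_later = 1 - CDF(t), which equals e^(-t/MEAN)
p_later = 1 - (1 - math.exp(-t / MEAN))
PHI = -math.log10(p_later)
'phi: ' + str(PHI), 'p_later: ' + str(p_later)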
Out[8]:
In [2]:
import math
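# Closed form of the cell above: -log10(e^(-t/MEAN)) = (t/MEAN) / ln(10)
PHI_FACTOR = 1.0 / math.log(10.0)
MEAN = 1.0  # mean interval between heartbeats
# long t = tnow - tLast;
t = 3.0     # time since the last heartbeat
PHI = PHI_FACTOR * (t / MEAN)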
'phi: ' + str(PHI)
Out[2]:
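The two PHI calculations above should agree, since -log10(e^(-t/MEAN)) = (t/MEAN) / ln(10). The following sketch checks that numerically; the function names phi_direct and phi_linear and the sample times are illustrative assumptions.

```python
import math

PHI_FACTOR = 1.0 / math.log(10.0)

def phi_direct(t, mean):
    """PHI from the definition: -log10 of the probability of a later arrival."""
    p_later = math.exp(-t / mean)
    return -math.log10(p_later)

def phi_linear(t, mean):
    """PHI from the simplified form, with no exp/log at call time."""
    return PHI_FACTOR * (t / mean)

MEAN = 1.0
for t in (0.5, 1.0, 3.0, 10.0):
    a, b = phi_direct(t, MEAN), phi_linear(t, MEAN)
    print(t, a, b, math.isclose(a, b, rel_tol=1e-9))
```

Besides being cheaper, the linear form avoids a numerical edge case: for very large t/MEAN, math.exp underflows to 0.0 and math.log10(0.0) raises a ValueError, whereas the linear form keeps working.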