ACN 903 S3: Epidemic Competition

Céline Comte & Fabien Mathieu

Until now, we have dealt with contamination with only one type of virus. The only bottleneck was the ability of a node to contaminate another one. For this session, we propose a selection of competition models, where multiple viruses compete for a same resource. The goal is not to be exhaustive (there are basically as many competition models than there are researchers working on that field), but to let you see that a same idea can lead to multiple approaches and applications.

Remarks:

  • This practical will be evaluated. You will need to send your NoteBook to us before next Sunday.
  • Please respect the deadline, even if that means you do not answer all questions.
  • You can work together on the practical, but that does note mean you are allowed to copy/paste everything. In particular, when two of you propose the same code, please tell This is a code I wrote with XXXX, and try to write personal comments that show you understood what the code does. Do not try to hide common work by slight changes (e.g. change of the variable names), this shows.
  • Some of the questions below are open, with a lot of freedom on the way you address them. You can use bibliographic references to answer them, but in that case, you'll need to clearly quote your references. Plagiarism will not be tolerated.

1. The Mitochondrial Eve

The context, origin and details of the model will be explained during the course.

The model works as follows: there is a first generation of $ n $ individuals (the potential eves). At each next generation, there are $ n $ individuals as well. The parent of each individual is chosen uniformly i.i.d. among the individuals of the previous generation. The game stops when one individual of the first generation is the unique ancestor of the current generation: Eve has been found. We propose to study the age of Eve, i.e. the number of generations required for all individuals to share the same Eve. For example, in the picture below, the age of Eve is four.

Question 1

Propose a function age_of_eve that returns for a given $ n $ the age of Eve computed on one instance.

Hints:

  • It is easier to do it backward in time: start from the current generation (the bottom line in the picture above) and go up generation by generation until you found Eve.
  • You're not asked to build the whole family tree, just to compute the age of Eve. For this, you probably only need to track the number of distinct individuals that have a descendant in the current generation.

In [ ]:
%pylab inline

Answer:

Question 2

Compute and display the age of Eve as a function of $ n $ averaged over a few trials. Recommended values for a good display: 100:100:2000, 100 trials. When testing your code, you may choose lower values.

Answer:

Question 3

Try to propose some guesses for the expected age of Eve. This is an open question: you decide your methodology. For example:

  • You can try to propose a value according to your experimental results, or try to infer some bounds. When the number of distinct ancestors is large enough, you can estimate the proportion that disappears going up one generation. When only few distinct ancestors remain, you can estimate the number of generations required in average for having one distinct ancestor less.
  • You can Google some bibliography and explain what you found...

Answer:

Question 4

Mitochondrial Eve is estimated to have lived between 99,000 and 300,000 years ago. Population size at that time was about 500,000. Comment the Mitochondrial Eve model we exposed in this session.

Answer:

Question 5 (bonus)

Try to optimize your code.

Python is not designed to be the fastest language of the world, but that does not mean you shouldn't try to write fast functions. Ideally, your function is fast by design, but sometimes you overlooked a possible computational bottleneck and your execution time goes sky high.

The cProfile module is a good way to investigate the bottlenecks of your code.

For example, you may investigate your solution for Question 2 by doing something like that:


In [ ]:
import cProfile
cProfile.run('age_of_eve_trials()', sort = 'tottime')

Hum, it seems that randint and unique are quite time consuming. Maybe looking into that direction allows to produce a faster code.


In [ ]:
import cProfile
cProfile.run('age_of_eve_trials_2()', sort = 'tottime')

Well, randint and unique have been dealt with, and the code is now much faster. It's your turn!

You may not gain that much, or maybe you'll gain a lot more than that. This does not matter: the goal is for you to develop a methodology for writing efficient code.

Remark: in the correction of this practical, you'll see another way to accelerate your code with the Numba package. You'll learn more about Numba later in the course.

2. The Voter Model

We now propose another approach inspired by game theory: the Voter model. We consider an undirected, connected graph $ G = (V,E) $ with self-loops on all nodes. At time $ t=0 $, each node has a given opinion (distinct nodes can share the same opinion, so there are between 1 and $ n = |V| $ distinct opinions). At each time $ t>1 $, each node picks up a neighbor at random and adopts its opinion at time $ t-1 $. We want to study how the opinions evolve with time.

Question 1

Tell how the Voter model generalizes the Mitochondrial Eve model. Can you guess something about the evolution of opinions over time?

Answer:

Question 2

Let us do some math. Let $ P^ a_t $ a vector of size $ n $ that tells for each node the probability that it has opinion $ a $ at time $ t $: for each $i = 1,\ldots,n$, the $i$-th component $P^a_t(i)$ of $P^a_t$ is the probability that node $i$ has opinion $a$ at time $t$. In particular, the components of $ P^a_0 $ are 1 over all nodes that have initially opinion $ a $, 0 elsewhere.

2.1. Use conditional probabilities to express $P^a_{t+1}$ as a function of $P^a_t$. Does it remind you of something? The answer may be yes or no depending on your background. If it is no, a course will be provided at this point.

2.2. Deduce the limit $P^a_\infty$ of $P^a_t$ when $t \to +\infty$. To do this, you need to use the Perron-Frobenius Theorem recalled below. You can try to intuit the shape of the vector $Q$ by solving the linear system $QA = Q$ for a small value of $n$ (like $n = 3$) and then generalize you result for any value of $n$.

Hint: In case you have no background on Markov chains and stochastic process, you can admit the following result. Don't worry, we'll investigate this in more details later in the course and give some intuition of its meaning.

Right stochastic matrix Let $A = (a_{i,j})$ be a $n\times n$ matrix. We say that $A$ is a right stochastic matrix if:

  • The coefficients of $A$ are non-negative: $\forall i, j, a_{i,j}\geq 0$.
  • Each row sums to 1: $\forall i, \sum_j a_{i,j} = 1$.

Perron-Frobenius Theorem (a variant, actually) If $A$ is right stochastic, irreducible and aperiodic (the last two terms will not be investigated here; just admit it is the case for the considered situation), then $A^t\xrightarrow[t\to \infty]{} B$, where $B$ is the right stochastic matrix having all its rows equal to the same row vector $Q$ defined as the unique normalized solution to the equation $QA = Q$.

Answer of 2.1.:

Answer of 2.2.:

Question 3

You are a high manager from the company Lodi (Lodi is a sort of Apple). You have the budget to select 1000 customers and offer them the brand new Lodi anvil, hoping that this helps you dominate the strategic market of smart anvils against your competitors. You bought from the social network company FriendFace their database, which tells you who's friend with whom. According to the Voter model, what should you do?

Answer:

Question 4 (bonus)

Verify your results with simulations (for instance with some Erdös=Rényi graph and 2 opinions).

Answer:

3. Epidemic Live Streaming

We complete our tour with a completely different framework: P2P Live streaming. There are $ n $ peers that want to watch a Live event. The Live stream is produced by a source that cuts it into chunks of 1 second and can deliver one chunk per second. Each peer has the capacity to deliver 1 chunk per second as well.

This question is based on a part of the following article: https://hal.inria.fr/hal-00668529/document

Question 1

Is it possible for all peers to receive all chunks of the stream in a reasonable time? If no, say why. If yes, propose a scheme to do that and give a bound to the diffusion delay (time between the introduction of a chunk by the source and the completion of its broadcast).

Answer:

Question 2

In order to limit the complexity of the diffusion, we consider the use of a fully randomize distribution scheme. Each second: the source delivers the new chunk to a random uniform peer; each peer chooses the chunk it possesses (if any) with the most recent timestamp and delivers it to a random uniform peer. This strategy is called random peer, latest blind.

  • Using a simulation over $ T $ seconds (recommendation: $ n = T = 1000 $, a single run is enough): display for each chunk its percentage of diffusion; display the average percentage of diffusion of a chunk as a function of its age. Comment the results.
  • Propose a value for the asymptotic eventual percentage of diffusion when $ n $ is large enough. What becomes that value if all peers can deliver $ k $ chunks per second?
  • Discuss the potential interest of such a distribution scheme.

Answer: