When selecting a discrete distribution, either one- or two-dimensional, you can
configure a 'floor' option, which defaults to 'no'. The examples below use
a 2D distribution, but the 1D case works the same way. The example
also shows how to pass a matrix (an array of arrays) as extra data to the
simulation. The matrix is converted to a CSV file, and the path to that file is
passed to the simulation program.
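To make the matrix-to-CSV step concrete, here is a minimal sketch of how an array of arrays could be serialized as comma-separated rows. The helper name `matrix_to_csv` is hypothetical and is not part of pysimpactcyan; the exact file layout that pysimpactcyan writes may differ.

```python
import csv
import io

def matrix_to_csv(matrix):
    """Serialize a list of rows as CSV text, one line per row.

    Hypothetical helper for illustration only; pysimpactcyan performs its
    own conversion when data is passed through 'dataFiles'.
    """
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerows(matrix)
    return buf.getvalue()

example = [
    [1, 2, 3, 4],
    [5, 6, 7, 0],
]
print(matrix_to_csv(example))
```

Each inner list becomes one line of the file, so the matrix used below (two rows of four values) would yield a two-line file.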
In [1]:
# First, we'll load some modules that we're going to need
%matplotlib inline
import matplotlib.pyplot as plt
import pysimpactcyan
import pandas as pd
In [2]:
simpact = pysimpactcyan.PySimpactCyan()
In [3]:
# In this example we're going to (ab)use the geographic location of a person
# to show how to use a 2D discrete distribution. Upon initialization, the
# location of each person is drawn from the discrete distribution.
#
# We don't need many events, just one to make sure that the population is
# initialized. We'll create many men and women in the simulation, but turn
# off relationship formation (using the 'eyecap' setting) to avoid scheduling
# a large number of events that we don't need anyway.
cfg = { }
cfg["population.eyecap.fraction"] = 0
cfg["population.nummen"] = 100000
cfg["population.numwomen"] = 100000
cfg["population.maxevents"] = 1
# This matrix will be communicated to the simulation below, and will be the
# basis of our 2D distribution.
probabilities = [
[ 1, 2, 3, 4],
[ 5, 6, 7, 0]
]
# Here, we specify that we're going to use a discrete distribution for the
# location of each person
cfg["person.geo.dist2d.type"] = "discrete"
# By starting a field with "data:", we can refer to a CSV file that's
# generated by data passed to the simulation using the 'dataFiles' argument
# (see below). Here, 'probs' is the name of the data file
cfg["person.geo.dist2d.discrete.densfile"] = "data:probs"
cfg["person.geo.dist2d.discrete.width"] = 4
cfg["person.geo.dist2d.discrete.height"] = 2
# We're going to assign the name 'probs' to the probability matrix that was
# specified above, since that's the name we've already used in the config
# settings
data = { "probs": probabilities }
# Finally, we start the simulation, read the person log and plot the location
# of each person as a 2D histogram. This has the structure of the data file
# that was passed.
ret = simpact.run(cfg, "/tmp/simptest", dataFiles=data)
persons = pd.read_csv(ret["logpersons"])
plt.hist2d(persons["XCoord"], persons["YCoord"], bins=20);
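As a rough check on the histogram, the share of persons expected in each cell can be estimated by normalizing the matrix. This sketch assumes the values act as relative weights (the matrix above sums to 28, not 1); `expected_fractions` is a hypothetical helper, not a Simpact function.

```python
def expected_fractions(weights):
    """Normalize a matrix of relative weights so all entries sum to 1."""
    total = sum(sum(row) for row in weights)
    return [[w / total for w in row] for row in weights]

probabilities = [
    [1, 2, 3, 4],
    [5, 6, 7, 0],
]

fractions = expected_fractions(probabilities)
for row in fractions:
    # Print each cell's expected share of the population, rounded.
    print(["{:.3f}".format(f) for f in row])
```

Under this assumption, the cell with weight 0 should stay empty in the histogram, and the cell with weight 7 should be the most populated.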
In [4]:
# Here, the location of a few persons is displayed. As you can see, the location
# can be anywhere inside the specified region (but with the specified probabilities)
p2 = persons[persons["ID"] < 10]
p2[["ID","XCoord","YCoord"]]
In [5]:
# The default setting of the 'floor' parameter was 'no'. To see what the effect of
# this parameter is, let's now set it to 'yes'
cfg["person.geo.dist2d.discrete.floor"] = "yes"
ret = simpact.run(cfg, "/tmp/simptest", dataFiles=data)
persons = pd.read_csv(ret["logpersons"])
# When creating the 2D histogram again, you'll notice that only a few points are
# actually used. This is because the 'floor' setting restricts the possible
# coordinates to the corners of the bins.
plt.hist2d(persons["XCoord"], persons["YCoord"], bins=20);
In [6]:
# The effect of this 'floor' parameter is also clear when showing the location of
# a few persons. These coordinates can no longer vary continuously, but are restricted
# to the corners of the bins.
p2 = persons[persons["ID"] < 10]
p2[["ID","XCoord","YCoord"]]
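The difference between the two settings can be imitated in plain Python. The sketch below is not Simpact's implementation: `sample_discrete_2d` is a hypothetical function, and the mapping of matrix rows to y coordinates is an assumption. It picks a cell with probability proportional to its weight, then either adds a uniform offset inside the cell or, with `floor=True`, returns the cell's lower corner.

```python
import random

def sample_discrete_2d(weights, width, height, floor=False, rng=random):
    """Draw one (x, y) point from a grid of relative weights.

    Assumption: row r of 'weights' covers y in [r, r + 1) * cell height and
    column c covers x in [c, c + 1) * cell width.
    """
    rows, cols = len(weights), len(weights[0])
    cell_w = width / cols
    cell_h = height / rows
    cells = [(r, c) for r in range(rows) for c in range(cols)]
    flat = [weights[r][c] for r, c in cells]
    # Choose a cell with probability proportional to its weight.
    r, c = rng.choices(cells, weights=flat, k=1)[0]
    if floor:
        dx = dy = 0.0  # stay on the corner of the chosen cell
    else:
        dx, dy = rng.random(), rng.random()  # uniform inside the cell
    return (c + dx) * cell_w, (r + dy) * cell_h

probabilities = [
    [1, 2, 3, 4],
    [5, 6, 7, 0],
]
rng = random.Random(0)
print(sample_discrete_2d(probabilities, 4, 2, floor=False, rng=rng))
print(sample_discrete_2d(probabilities, 4, 2, floor=True, rng=rng))
```

With a 4 by 2 region and a 2 by 4 matrix, every cell is one unit wide and high, so the floored coordinates in this sketch are always whole numbers.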