Lab 4 for Computer Science 383: Multi-Agent and Robotic Systems at Allegheny College. By Hawk Weisman and SJ Guillame. See the associated source code and raw data for this assignment in our GitHub repository here.
In [2]:
# First, set up Matplotlib and Plot.ly
%matplotlib inline
import matplotlib.pyplot as plt # side-stepping mpl backend
import matplotlib.gridspec as gridspec # subplots
import numpy as np
In [3]:
import plotly.plotly as ply
import plotly.tools as tls
from plotly.graph_objs import *
ply.sign_in("hawk", "uknav5itni")
In [4]:
from pandas import read_csv
In [86]:
# 0.8 legitimacy
fig1, ax = plt.subplots()
read_csv("results/rebellion-150ticks-legitimacy0.8-cops0.04-people0.7-jail30.csv").drop(['tick'], 1).plot(ax=ax, legend=False)
ax.set_title("Legitimacy = 0.8")
fig = tls.mpl_to_plotly(fig1)
fig['layout'].update(showlegend=True)
fig.strip_style()
ply.iplot(fig)
Out[86]:
With government legitimacy high, the population is able to coexist peacefully. Running the program with its original parameters (0.8 legitimacy), the population goes from 985 active, 0 jailed, and 135 quiet people at the start of the run to 134 active, 844 jailed, and 142 quiet at the end.
In [85]:
# 0.9 legitimacy
fig1, ax = plt.subplots()
read_csv("results/rebellion-150ticks-legitimacy0.9-cops0.04-people0.7-jail30.csv").drop(['tick'], 1).plot(ax=ax, legend=False)
ax.set_title("Legitimacy = 0.9")
fig = tls.mpl_to_plotly(fig1)
fig['layout'].update(showlegend=True)
fig.strip_style()
ply.iplot(fig)
Out[85]:
When the legitimacy increases to 0.9, all 1120 agents remain quiet and none of them ever become active, so no rebellion occurs during the 150-tick timeframe. This represents a population that operates peacefully, without intervention from the cops, under a highly legitimate government, as the plot above shows.
What happens in situations of corruption, when the legitimacy variable is reduced?
With the original parameter of 0.8 legitimacy, the population goes from 985 active, 0 jailed, and 135 quiet people to 134 active, 844 jailed, and 142 quiet: a change of -851 active, +844 jailed, and +7 quiet, as seen in the previous plot.
In [84]:
# 0.7 legitimacy
fig1, ax = plt.subplots()
read_csv("results/rebellion-150ticks-legitimacy0.7-cops0.04-people0.7-jail30.csv").drop(['tick'], 1).plot(ax=ax, legend=False)
ax.set_title("Legitimacy = 0.7")
fig = tls.mpl_to_plotly(fig1)
fig['layout'].update(showlegend=True)
fig.strip_style()
ply.iplot(fig)
Out[84]:
When the legitimacy is decreased to 0.7, the numbers go from 984 active, 0 jailed, and 136 quiet to 148 active, 843 jailed, and 129 quiet: a change of -836 active, +843 jailed, and -7 quiet. Compared to the experiment using the original parameter of 0.8 legitimacy, the final numbers of active, jailed, and quiet people are comparable.
In [83]:
# 0.6 legitimacy
fig1, ax = plt.subplots()
read_csv("results/rebellion-150ticks-legitimacy0.6-cops0.04-people0.7-jail30.csv").drop(['tick'], 1).plot(ax=ax, legend=False)
ax.set_title("Legitimacy = 0.6")
fig = tls.mpl_to_plotly(fig1)
fig['layout'].update(showlegend=True)
fig.strip_style()
ply.iplot(fig)
Out[83]:
When the legitimacy is decreased to 0.6, the numbers go from 951 active, 0 jailed, and 169 quiet to 169 active, 827 jailed, and 142 quiet: a change of -782 active, +827 jailed, and -27 quiet. Compared to the experiment using the original parameter of 0.8 legitimacy, the outcome differs: the decreased legitimacy causes fewer people to remain quiet over the run, leaving more people active at the end.
In [82]:
# 0.5 legitimacy
fig1, ax = plt.subplots()
read_csv("results/rebellion-150ticks-legitimacy0.5-cops0.04-people0.7-jail30.csv").drop(['tick'], 1).plot(ax=ax, legend=False)
ax.set_title("Legitimacy = 0.5")
fig = tls.mpl_to_plotly(fig1)
fig['layout'].update(showlegend=True)
fig.strip_style()
ply.iplot(fig)
Out[82]:
When the legitimacy is decreased to 0.5, the numbers go from 930 active, 0 jailed, and 190 quiet to 120 active, 839 jailed, and 161 quiet: a change of -810 active, +839 jailed, and -29 quiet. Compared to the experiment using the original parameter of 0.8 legitimacy, there is a greater decrease in the number of quiet people over the course of the run.
What is the effect of lessening government oppression by reducing the number of cops?
The results show that in a government with perfect legitimacy, the number of quiet people remains constant at 1120 throughout the experiment. When the legitimacy of the government is lower, however, the number of cops becomes an important regulating factor. Looking at the results for a legitimacy of 0.7 with the cop density at 0.04, 0.02, and 0.00, the presence of cops clearly has an impact. At 0.04, there are 148 active, 843 jailed, and 129 quiet people at the end of the experiment, as seen in the graph above. At 0.02, there are 284 active, 760 jailed, and 76 quiet at the end of the experiment (plotted below). Finally, at 0.00 cops, the entire population is active (also plotted below). This demonstrates how cops regulate the population and constrain rebellion: in a society without cops there is no government oppression, and nobody stays quiet.
In [90]:
# 0.7 legitimacy, 0.02 cops
fig1, ax = plt.subplots()
read_csv("results/rebellion-150ticks-legitimacy0.7-cops0.02-people0.7-jail30.csv").drop(['tick'], 1).plot(ax=ax, legend=False)
ax.set_title("Legitimacy = 0.7, Cops = 0.02")
fig = tls.mpl_to_plotly(fig1)
fig['layout'].update(showlegend=True)
fig.strip_style()
ply.iplot(fig)
Out[90]:
In [89]:
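# 0.7 legitimacy, 0.00 cops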
fig1, ax = plt.subplots()
read_csv("results/rebellion-150ticks-legitimacy0.7-cops0.00-people0.7-jail30.csv").drop(['tick'], 1).plot(ax=ax, legend=False)
ax.set_title("Legitimacy = 0.7, Cops = 0.00")
fig = tls.mpl_to_plotly(fig1)
fig['layout'].update(showlegend=True)
fig.strip_style()
ply.iplot(fig)
Out[89]:
To add learning behaviour to the model, we modified the Person agent to learn the risk of arrest over time, based on the number of nearby active people.
In our implementation, every time a person chooses to rebel, the number of active agents within that agent's visibility radius is recorded. If the agent rebels and is not arrested, a positive reward is attached to that observed number of nearby active agents; if it is arrested, a negative reward is attached instead. Then, when the agent is later deciding whether or not to rebel, it searches its memory for the K observations nearest to the current number of nearby active agents and sums their rewards, choosing to rebel if the positive reinforcement outweighs the negative reinforcement.
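As a rough sketch of this bookkeeping (the Pair constructor shown here, the recordOutcome method name, and the +1.0/-1.0 reward values are illustrative assumptions rather than the exact code in our Person class), each remembered observation pairs a count of nearby active agents with a reinforcement value:
class Pair {
    private final int numActive;        // nearby active agents observed when rebelling
    private final double reinforcement; // positive if the agent escaped arrest, negative if jailed

    Pair(int numActive, double reinforcement) {
        this.numActive = numActive;
        this.reinforcement = reinforcement;
    }

    int getNumActive() { return this.numActive; }
    double getReinforcement() { return this.reinforcement; }
}

// Called after each tick on which this agent rebelled; the reward magnitude
// of 1.0 is an assumption made for this sketch.
private void recordOutcome(int numActiveNearby, boolean wasArrested) {
    this.prevNumActive.add(new Pair(numActiveNearby, wasArrested ? -1.0 : 1.0));
}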
The K-nearest neighbor search algorithm we implemented is presented below.
private double kNearest(int target) {
    Set<Pair> neighbors = new HashSet<>(); // the nearest neighbors to the target value
    for (int i = 0; i < NUM_NEIGHBORS; i++) { // repeat the search K times
        Pair best = null;        // current nearest neighbor
        Integer bestDist = null; // distance to the nearest neighbor
        for (Pair n : this.prevNumActive) { // for each previous observation
            if (!neighbors.contains(n) // skip observations already chosen as neighbors
                    // take this observation if it is the first candidate...
                    && (best == null
                    // ...or if it is closer to the target than the previous best
                    || Math.abs(target - n.getNumActive()) < bestDist)) {
                best = n; // the current element is the nearest neighbor so far
                bestDist = Math.abs(target - n.getNumActive()); // update the best distance
            }
        }
        if (best != null) { // guard against having fewer than K observations
            neighbors.add(best);
        }
    }
    // return the sum of the rewards attached to each neighbor
    double sum = 0.0;
    for (Pair n : neighbors) {
        sum += n.getReinforcement();
    }
    return sum;
}
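As a usage sketch (the chooseToRebel name, the countActiveNeighbors helper, and the zero threshold are assumptions made for illustration; see the Person source for how the learned estimate actually enters the rebellion decision), the sum returned by kNearest gates the decision to rebel:
// Decide whether to rebel this tick, based only on the learned reinforcement.
// countActiveNeighbors() is assumed to return the number of active Person
// agents within this agent's visibility radius.
private boolean chooseToRebel() {
    int nearbyActive = countActiveNeighbors();
    // rebel only if past experience with a similar number of active
    // neighbors was, on balance, positive
    return kNearest(nearbyActive) > 0;
}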
For more information on our implementation, please consult the comments in the source code for Person.
In order to demonstrate that this method works, a set of trials was run for 500 ticks, varying the visibility radius of the Person agents. The default simulation parameters of 0.04 cops, 0.7 people, and 0.8 legitimacy were held constant throughout these trials. In these runs, the Person agents searched for the 50 nearest neighbors to the currently observed number of active agents.
In [5]:
fig1, ax = plt.subplots()
read_csv("results/learning-500ticks-visibility2-50neighbors.csv").drop(['tick'], 1).plot(ax=ax, legend=False)
ax.set_title("Visibility = 2")
fig = tls.mpl_to_plotly(fig1)
fig['layout'].update(showlegend=True)
fig.strip_style()
ply.iplot(fig)
Out[5]:
When the visibility radius is set to 2, patterns similar to those displayed in runs without the learning modification are observed.
In [6]:
fig1, ax = plt.subplots()
read_csv("results/learning-500ticks-visibility3-50neighbors.csv").drop(['tick'], 1).plot(ax=ax, legend=False)
ax.set_title("Visibility = 3")
fig = tls.mpl_to_plotly(fig1)
fig['layout'].update(showlegend=True)
fig.strip_style()
ply.iplot(fig)
Out[6]:
However, when the visibility radius is increased to three, the number of quiet agents eventually surpasses the number of active agents and then holds steady. This demonstrates that the Person agents have learned from their experiences and changed their behaviour.
In [7]:
fig1, ax = plt.subplots()
read_csv("results/learning-500ticks-visibility5-50neighbors.csv").drop(['tick'], 1).plot(ax=ax, legend=False)
ax.set_title("Visibility = 5")
fig = tls.mpl_to_plotly(fig1)
fig['layout'].update(showlegend=True)
fig.strip_style()
ply.iplot(fig)
Out[7]:
When the visibility radius is increased to 5, the numbers of jailed and quiet people show clear trends. Eventually, at 321 ticks, the number of quiet people actually surpasses the number of jailed people. This demonstrates a clear change in behaviour based on previous observation of the environment.
The primary thing we learned from this assignment was how to use a learning strategy to change an agent's behaviour and improve its performance over time. To implement this, the agents were given a reward when they were active and not arrested, and a punishment when they were active and arrested. Each agent then uses this reinforcement, stored in its memory, to decide what action to take in the future. Working on this implementation taught us a great deal about how learning works in autonomous systems.
Additionally, we found the predictions made by the model very interesting. In light of current events, the model's prediction that reducing the number of police leads to widespread rebellion is certainly an interesting one.
The primary challenge encountered over the course of this lab assignment involved our implementation of the rebellion strategy for the civilian population. We had a great deal of difficulty choosing which variables should inform our learning strategy. Since we wanted the Person agents to learn to predict the risk of rebelling, we originally had them remember the number of nearby cops. However, this proved to be ineffective, so we attempted predicting the risk based on the number of jailed agents, which was also not very effective. Finally, with advice from Dr. Jumadinova, we chose to base our prediction on the number of active Person agents nearby, which provided accurate predictions, as seen in the plots presented above.
Furthermore, the learning strategy was difficult to implement because, with the earlier strategies, the agents appeared to learn not to rebel for a short period of time, but the eventual trends indicated that they had not actually learned: their performance remained constant within a range.