Lab 4 for Computer Science 383: Multi-Agent and Robotic Systems at Allegheny College. By Hawk Weisman and SJ Guillame. See the associated source code and raw data for this assignment in our GitHub repository here.
In [2]:
# First, set up Matplotlib and Plot.ly
%matplotlib inline
import matplotlib.pyplot as plt # side-stepping mpl backend
import matplotlib.gridspec as gridspec # subplots
import numpy as np
In [3]:
import plotly.plotly as ply
import plotly.tools as tls
from plotly.graph_objs import *
ply.sign_in("hawk", "uknav5itni")
In [4]:
from pandas import read_csv
In [86]:
# 0.8 legitimacy
fig1, ax = plt.subplots()
read_csv("results/rebellion-150ticks-legitimacy0.8-cops0.04-people0.7-jail30.csv").drop(['tick'], 1).plot(ax=ax, legend=False)
ax.set_title("Legitimacy = 0.8")
fig = tls.mpl_to_plotly(fig1)
fig['layout'].update(showlegend=True)
fig.strip_style()
ply.iplot(fig)
Out[86]:
With government legitimacy high, the population is able to coexist peacefully. Running the program with its original parameters (0.8 legitimacy), the population goes from 985 active, 0 jailed, and 135 quiet people at the start of the run to 134 active, 844 jailed, and 142 quiet at the end.
In [85]:
# 0.9 legitimacy
fig1, ax = plt.subplots()
read_csv("results/rebellion-150ticks-legitimacy0.9-cops0.04-people0.7-jail30.csv").drop(['tick'], 1).plot(ax=ax, legend=False)
ax.set_title("Legitimacy = 0.9")
fig = tls.mpl_to_plotly(fig1)
fig['layout'].update(showlegend=True)
fig.strip_style()
ply.iplot(fig)
Out[85]:
When the legitimacy increases to 0.9, all 1120 agents remain quiet and none of them ever become active, so no rebellion occurs during the 150-tick timeframe. This represents a population that operates peacefully, without intervention from the cops, under a highly legitimate government, as the plot above shows.
What happens in situations of corruption, when the legitimacy variable is reduced?
With the original parameter of 0.8 legitimacy, the population goes from 985 active, 0 jailed, and 135 quiet people to 134 active, 844 jailed, and 142 quiet: a change of -851 active, +844 jailed, and +7 quiet, as seen in the previous plot.
In [84]:
# 0.7 legitimacy
fig1, ax = plt.subplots()
read_csv("results/rebellion-150ticks-legitimacy0.7-cops0.04-people0.7-jail30.csv").drop(['tick'], 1).plot(ax=ax, legend=False)
ax.set_title("Legitimacy = 0.7")
fig = tls.mpl_to_plotly(fig1)
fig['layout'].update(showlegend=True)
fig.strip_style()
ply.iplot(fig)
Out[84]:
When the legitimacy is decreased to 0.7, the numbers go from 984 active, 0 jailed, and 136 quiet to 148 active, 843 jailed, and 129 quiet: a change of -836 active, +843 jailed, and -7 quiet. Compared to the experiment using the original parameter of 0.8 legitimacy, the final numbers of active, jailed, and quiet people are comparable.
In [83]:
# 0.6 legitimacy
fig1, ax = plt.subplots()
read_csv("results/rebellion-150ticks-legitimacy0.6-cops0.04-people0.7-jail30.csv").drop(['tick'], 1).plot(ax=ax, legend=False)
ax.set_title("Legitimacy = 0.6")
fig = tls.mpl_to_plotly(fig1)
fig['layout'].update(showlegend=True)
fig.strip_style()
ply.iplot(fig)
Out[83]:
When the legitimacy is decreased to 0.6, the numbers go from 951 active, 0 jailed, and 169 quiet to 169 active, 827 jailed, and 142 quiet: a change of -782 active, +827 jailed, and -27 quiet. Compared to the experiment using the original parameter of 0.8 legitimacy, the outcome differs: the decreased legitimacy causes fewer people to remain quiet over the run, leaving more people active at the end.
In [82]:
# 0.5 legitimacy
fig1, ax = plt.subplots()
read_csv("results/rebellion-150ticks-legitimacy0.5-cops0.04-people0.7-jail30.csv").drop(['tick'], 1).plot(ax=ax, legend=False)
ax.set_title("Legitimacy = 0.5")
fig = tls.mpl_to_plotly(fig1)
fig['layout'].update(showlegend=True)
fig.strip_style()
ply.iplot(fig)
Out[82]:
When the legitimacy is decreased to 0.5, the numbers go from 930 active, 0 jailed, and 190 quiet to 120 active, 839 jailed, and 161 quiet: a change of -810 active, +839 jailed, and -29 quiet. Compared to the experiment using the original parameter of 0.8 legitimacy, there is a greater decrease in the number of quiet people over the course of the run.
What is the effect of lessening government oppression by reducing the number of cops?
The results show that in a government with perfect legitimacy, the number of quiet people remains constant at 1120 throughout the experiment. When the legitimacy of the government is lower, however, the number of cops becomes an important regulating factor. Looking at the results for a legitimacy of 0.7 with the cop density at 0.04, 0.02, and 0.00, the presence of cops clearly has an impact. At 0.04, there are 148 active, 843 jailed, and 129 quiet people at the end of the experiment, as seen in the graph above. At 0.02, there are 284 active, 760 jailed, and 76 quiet at the end of the experiment (plotted below). Finally, at 0.00 cops, the entire population is active (also plotted below). This demonstrates how cops regulate the population and constrain rebellion: in a society without cops there is no government oppression, and nobody stays quiet.
In [90]:
# 0.7 legitimacy, 0.02 cops
fig1, ax = plt.subplots()
read_csv("results/rebellion-150ticks-legitimacy0.7-cops0.02-people0.7-jail30.csv").drop(['tick'], 1).plot(ax=ax, legend=False)
ax.set_title("Legitimacy = 0.7, Cops = 0.02")
fig = tls.mpl_to_plotly(fig1)
fig['layout'].update(showlegend=True)
fig.strip_style()
ply.iplot(fig)
Out[90]:
In [89]:
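# 0.7 legitimacy, 0.00 cops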
fig1, ax = plt.subplots()
read_csv("results/rebellion-150ticks-legitimacy0.7-cops0.00-people0.7-jail30.csv").drop(['tick'], 1).plot(ax=ax, legend=False)
ax.set_title("Legitimacy = 0.7, Cops = 0.00")
fig = tls.mpl_to_plotly(fig1)
fig['layout'].update(showlegend=True)
fig.strip_style()
ply.iplot(fig)
Out[89]:
To add learning behaviour to the model, we modified the Person agent to learn the risk of arrest over time, based on the number of nearby active people.
In our implementation, every time a person chooses to rebel, the number of active agents within that agent's visibility radius is recorded. If the agent rebels and is not arrested, a positive reward is attached to that observed number of nearby active agents; if it is arrested, a negative reward is attached instead. Then, when the agent is later deciding whether or not to rebel, it searches its memory for the K observations nearest to the current number of nearby active agents and sums their rewards, choosing to rebel if the positive reinforcement outweighs the negative reinforcement.
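As a rough sketch of this bookkeeping (the Pair constructor shown here, the recordOutcome method name, and the +1.0/-1.0 reward values are illustrative assumptions rather than the exact code in our Person class), each remembered observation pairs a count of nearby active agents with a reinforcement value:
class Pair {
    private final int numActive;        // nearby active agents observed when rebelling
    private final double reinforcement; // positive if the agent escaped arrest, negative if jailed

    Pair(int numActive, double reinforcement) {
        this.numActive = numActive;
        this.reinforcement = reinforcement;
    }

    int getNumActive() { return this.numActive; }
    double getReinforcement() { return this.reinforcement; }
}

// Called after each tick on which this agent rebelled; the reward magnitude
// of 1.0 is an assumption made for this sketch.
private void recordOutcome(int numActiveNearby, boolean wasArrested) {
    this.prevNumActive.add(new Pair(numActiveNearby, wasArrested ? -1.0 : 1.0));
}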
The K-nearest neighbor search algorithm we implemented is presented below.
private double kNearest(int target) {
    Set<Pair> neighbors = new HashSet<>(); // the nearest neighbors to the target value
    for (int i = 0; i < NUM_NEIGHBORS; i++) { // repeat the search K times
        Pair best = null;        // current nearest neighbor
        Integer bestDist = null; // distance to the nearest neighbor
        for (Pair n : this.prevNumActive) { // for each previous observation
            if (!neighbors.contains(n) // skip observations already chosen as neighbors
                    // take this observation if it is the first candidate...
                    && (best == null
                    // ...or if it is closer to the target than the previous best
                    || Math.abs(target - n.getNumActive()) < bestDist)) {
                best = n; // the current element is the nearest neighbor so far
                bestDist = Math.abs(target - n.getNumActive()); // update the best distance
            }
        }
        if (best != null) { // guard against having fewer than K observations
            neighbors.add(best);
        }
    }
    // return the sum of the rewards attached to each neighbor
    double sum = 0.0;
    for (Pair n : neighbors) {
        sum += n.getReinforcement();
    }
    return sum;
}
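As a usage sketch (the chooseToRebel name, the countActiveNeighbors helper, and the zero threshold are assumptions made for illustration; see the Person source for how the learned estimate actually enters the rebellion decision), the sum returned by kNearest gates the decision to rebel:
// Decide whether to rebel this tick, based only on the learned reinforcement.
// countActiveNeighbors() is assumed to return the number of active Person
// agents within this agent's visibility radius.
private boolean chooseToRebel() {
    int nearbyActive = countActiveNeighbors();
    // rebel only if past experience with a similar number of active
    // neighbors was, on balance, positive
    return kNearest(nearbyActive) > 0;
}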
For more information on our implementation, please consult the comments in the source code for Person.
In order to demonstrate that this method works, a set of trials was run for 500 ticks, varying the visibility radius of the Person agents. The default simulation parameters of 0.04 cops, 0.7 people, and 0.8 legitimacy were held constant throughout these trials. In these runs, the Person agents searched for the 50 nearest neighbors to the currently observed number of active agents.
In [5]:
fig1, ax = plt.subplots()
read_csv("results/learning-500ticks-visibility2-50neighbors.csv").drop(['tick'], 1).plot(ax=ax, legend=False)
ax.set_title("Visibility = 2")
fig = tls.mpl_to_plotly(fig1)
fig['layout'].update(showlegend=True)
fig.strip_style()
ply.iplot(fig)
Out[5]:
When the visibility radius is set to 2, patterns similar to those displayed in runs without the learning modification are observed.
In [6]:
fig1, ax = plt.subplots()
read_csv("results/learning-500ticks-visibility3-50neighbors.csv").drop(['tick'], 1).plot(ax=ax, legend=False)
ax.set_title("Visibility = 3")
fig = tls.mpl_to_plotly(fig1)
fig['layout'].update(showlegend=True)
fig.strip_style()
ply.iplot(fig)
Out[6]:
However, when the visibility radius is increased to three, the number of quiet agents eventually surpasses the number of active agents and then holds steady. This demonstrates that the Person agents have learned from their experiences and changed their behaviour.
In [7]:
fig1, ax = plt.subplots()
read_csv("results/learning-500ticks-visibility5-50neighbors.csv").drop(['tick'], 1).plot(ax=ax, legend=False)
ax.set_title("Visibility = 5")
fig = tls.mpl_to_plotly(fig1)
fig['layout'].update(showlegend=True)
fig.strip_style()
ply.iplot(fig)
Out[7]:
When the visibility radius is increased to 5, the numbers of jailed and quiet people show clear trends. Eventually, at 321 ticks, the number of quiet people actually surpasses the number of jailed people. This demonstrates a clear change in behaviour based on previous observation of the environment.
The primary thing we learned from this assignment was how to use a learning strategy to change an agent's behaviour and improve its performance over time. To implement this, the agents were given a reward when they were active and not arrested, and a punishment when they were active and arrested. Each agent then uses this reinforcement, stored in its memory, to decide what action to take in the future. Working on this implementation taught us a great deal about how learning works in autonomous systems.
Additionally, we found the predictions made by the model very interesting. In light of current events, the model's prediction that reducing the number of police leads to widespread rebellion is certainly an interesting one.
The primary challenge encountered over the course of this lab assignment involved our implementation of the rebellion strategy for the civilian population. We had a great deal of difficulty choosing which variables should inform our learning strategy. Since we wanted the Person agents to learn to predict the risk of rebelling, we originally had them remember the number of nearby cops. However, this proved to be ineffective, so we attempted predicting the risk based on the number of jailed agents, which was also not very effective. Finally, with advice from Dr. Jumadinova, we chose to base our prediction on the number of active Person agents nearby, which provided accurate predictions, as seen in the plots presented above.
Furthermore, the learning strategy was difficult to implement because, with the earlier strategies, the agents appeared to learn not to rebel for a short period of time, but the eventual trends indicated that they had not actually learned: their performance remained constant within a range.