CooperationAnalysis


Rand 2011 Cooperation Study

This notebook outlines how to recreate the analysis from the Rand et al. 2011 study, "Dynamic social networks promote cooperation in experiments with humans".

The analysis uses the publicly available data published with the paper. It requires either a local or remote copy of Bedrock with the following Opals installed: Spreadsheet, logit2, select-from-dataframe, and summarize.

This notebook also requires that bedrock-core be installed in the Python kernel running this notebook. It can be installed from the command line using:

pip install git+https://github.com/Bedrock-py/bedrock-core.git

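The other requirements to run this notebook are the Python packages imported below (requests and pandas) and a local copy of the data file Rand2011PNAS_cooperation_data.csv.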

Step 1: Check Environment

First, check that Bedrock is installed locally. If the following cell raises an error, review the install procedure above and try again. Also make sure that the selected kernel is the one where bedrock-core is installed.


In [ ]:
from bedrock.client.client import BedrockAPI

Test Connection to Bedrock Server

This code assumes a local Bedrock server is hosted at localhost on port 81. Change the SERVER variable to match your server's URL and port.


In [ ]:
import requests
import pandas
import pprint
SERVER = "http://localhost:81/"
api = BedrockAPI(SERVER)

Check for Spreadsheet Opal

The following code block checks the Bedrock server for the Spreadsheet Opal. This Opal is used to load .csv, .xls, and other such files into a Bedrock matrix format. The code below calls the Bedrock /dataloaders/ingest endpoint to check if the opals.spreadsheet.Spreadsheet.Spreadsheet opal is installed.

If the code below shows the Opal is not installed, there are two options:

  1. If you are running a local Bedrock or are the administrator of the Bedrock server, install the Spreadsheet Opal with pip on the server
  2. If you are not the administrator of the Bedrock server, email the Bedrock administrator and request that the Opal be installed

In [ ]:
resp = api.ingest("opals.spreadsheet.Spreadsheet.Spreadsheet")
if resp.json():
    print("Spreadsheet Opal Installed!")
else:
    print("Spreadsheet Opal Not Installed!")

Check for logit2 Opal

The following code block checks the Bedrock server for the logit2 Opal.

If the code below shows the Opal is not installed, there are two options:

  1. If you are running a local Bedrock or are the administrator of the Bedrock server, install the logit2 Opal with pip on the server
  2. If you are not the administrator of the Bedrock server, email the Bedrock administrator and request that the Opal be installed

In [ ]:
resp = api.analytic('opals.logit2.Logit2.Logit2')
if resp.json():
    print("Logit2 Opal Installed!")
else:
    print("Logit2 Opal Not Installed!")

Check for select-from-dataframe Opal

The following code block checks the Bedrock server for the select-from-dataframe Opal. This allows you to filter by row and reduce the columns in a dataframe loaded by the server.

If the code below shows the Opal is not installed, there are two options:

  1. If you are running a local Bedrock or are the administrator of the Bedrock server, install the select-from-dataframe Opal with pip on the server
  2. If you are not the administrator of the Bedrock server, email the Bedrock administrator and request that the Opal be installed

In [ ]:
resp = api.analytic('opals.select-from-dataframe.SelectByCondition.SelectByCondition')
if resp.json():
    print("Select-from-dataframe Opal Installed!")
else:
    print("Select-from-dataframe Opal Not Installed!")

Check for summarize Opal

The following code block checks the Bedrock server for the summarize Opal. This allows you to summarize a matrix with an optional groupby clause.

If the code below shows the Opal is not installed, there are two options:

  1. If you are running a local Bedrock or are the administrator of the Bedrock server, install the summarize Opal with pip on the server
  2. If you are not the administrator of the Bedrock server, email the Bedrock administrator and request that the Opal be installed

In [ ]:
resp = api.analytic('opals.summarize.Summarize.Summarize')
if resp.json():
    print("Summarize Opal Installed!")
else:
    print("Summarize Opal Not Installed!")
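
The four checks above share one pattern, so they can be collapsed into a single loop. The following is a minimal sketch, assuming `api.ingest` and `api.analytic` behave as in the cells above (they return a response whose `.json()` is truthy when the Opal is installed). The names `check_opals` and `REQUIRED_OPALS` are illustrative and are not part of bedrock-core.

```python
# Opal identifiers copied from the check cells above, paired with the
# client method the notebook uses to look each one up.
REQUIRED_OPALS = [
    ("ingest", "opals.spreadsheet.Spreadsheet.Spreadsheet"),
    ("analytic", "opals.logit2.Logit2.Logit2"),
    ("analytic", "opals.select-from-dataframe.SelectByCondition.SelectByCondition"),
    ("analytic", "opals.summarize.Summarize.Summarize"),
]


def check_opals(api, required=REQUIRED_OPALS):
    """Return a dict mapping each Opal id to True if the server reports it."""
    status = {}
    for method_name, opal_id in required:
        lookup = getattr(api, method_name)  # api.ingest or api.analytic
        resp = lookup(opal_id)
        # Same truthiness test as the individual check cells
        status[opal_id] = bool(resp.json())
    return status
```

Calling `check_opals(api)` and printing the entries whose value is False gives the list of Opals that still need to be installed.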

Step 2: Upload Data to Bedrock and Create Matrix

Now that everything is installed, begin the workflow by uploading the csv data and creating a matrix. To understand this fully, it is useful to understand how a data loading workflow occurs in Bedrock.

  1. Create a datasource that points to the original source file
  2. Generate a matrix from the data source (filters can be applied during this step to pre-filter the data source on load)
  3. Analytics work on the generated matrix

Note: Each time a matrix is generated from a data source, Bedrock creates a new copy with a new UUID to represent that matrix.

Check for csv file locally

The following code opens the file and prints the first 10 rows. The file must be a comma-delimited csv file with a header row that labels each column.


In [ ]:
filepath = 'Rand2011PNAS_cooperation_data.csv'
datafile = pandas.read_csv(filepath)  # reuse filepath so the upload cell reads the same file
datafile.head(10)
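
Before uploading, it can help to confirm that the header contains the columns that the later filters and formulas in this notebook refer to. The sketch below uses only the standard library; the column list is collected from the cells further down, and `missing_columns` is a hypothetical helper, not a Bedrock function.

```python
import csv

# Columns referenced by the filters and formulas later in this notebook.
REQUIRED_COLUMNS = [
    "sessionnum", "playerid", "round_num", "condition",
    "decision0d1c", "num_neighbors", "fluid_dummy",
]


def missing_columns(csv_file, required=REQUIRED_COLUMNS):
    """Return the required column names that are absent from the header row."""
    header = next(csv.reader(csv_file), [])  # empty file -> empty header
    present = {name.strip() for name in header}
    return [name for name in required if name not in present]
```

To use it, open the file with `open(filepath, newline="")` and pass the file object to `missing_columns`; an empty list means every expected column is present.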

Now Upload the source file to the Bedrock Server

This code block uses the Spreadsheet ingest module to upload the source file to Bedrock. Note: This only copies the file to the server; it does not create a Bedrock matrix.

If the upload fails, check that the csv file is in the correct comma-delimited format with headers.


In [ ]:
ingest_id = 'opals.spreadsheet.Spreadsheet.Spreadsheet'
resp = api.put_source('Rand2011', ingest_id, 'default', {'file': open(filepath, "rb")})

if resp.status_code == 201:
    source_id = resp.json()['src_id']
    print('Source {0} successfully uploaded'.format(filepath))
else:
    try:
        print("Error in Upload: {}".format(resp.json()['msg']))
    except Exception:
        pass
    
    try:
        source_id = resp.json()['src_id']
        print("Using existing source.  If this is not the desired behavior, upload with a different name.")
    except Exception:
        print("No existing source id provided")

Check available data sources for the CSV file

Call the Bedrock sources list to see the available data sources. The Rand2011 data source should now be listed.


In [ ]:
available_sources = api.list("dataloader", "sources").json()
# Find the uploaded source by id; fall back to None if it is not listed
s = next((source for source in available_sources if source['src_id'] == source_id), None)
if s is not None:
    pp = pprint.PrettyPrinter()
    pp.pprint(s)
else:
    print("Could not find source")

Create a Bedrock Matrix from the CSV Source

In order to use the data, the data source must be converted to a Bedrock matrix. The following code steps through that process. Here we are doing a simple transform of csv to matrix. There are options to apply filters (like renaming columns, excluding colum


In [ ]:
resp = api.create_matrix(source_id, 'rand_mtx')
mtx = resp[0]
matrix_id = mtx['id']
print(mtx)
resp

Look at basic statistics on the source data

Here we can see that Bedrock has computed some basic statistics on the source data.

For numeric data

The quartiles, max, mean, min, and standard deviation are provided.

For non-numeric data

The label values and counts for each label are provided.

For both types

The tags and data type that Bedrock proposes are provided.


In [ ]:
analytic_id = "opals.summarize.Summarize.Summarize"
inputData = {
    'matrix.csv': mtx,
    'features.txt': mtx
}

paramsData = []

summary_mtx = api.run_analytic(analytic_id, mtx, 'rand_mtx_summary', input_data=inputData, parameter_data=paramsData)
output = api.download_results_matrix(matrix_id, summary_mtx['id'], 'matrix.csv')
output
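
As a local cross-check of the statistics listed above, the same quantities can be computed for a single column with the standard library. This is a sketch under assumptions: Bedrock's exact quartile interpolation and its choice of sample versus population standard deviation are not stated here, so small differences from the server output are possible. The function names are illustrative.

```python
import statistics
from collections import Counter


def summarize_numeric(values):
    """Quartiles, max, mean, min, and sample standard deviation of a list."""
    # "inclusive" uses linear interpolation between the data's min and max
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "min": min(values),
        "25%": q1,
        "50%": median,
        "75%": q3,
        "max": max(values),
        "mean": statistics.mean(values),
        "std": statistics.stdev(values),  # sample standard deviation (n - 1)
    }


def summarize_labels(values):
    """Count how often each label occurs in a non-numeric column."""
    return dict(Counter(values))
```

For example, `summarize_labels(datafile["condition"].tolist())` would give the per-condition row counts to compare with the summary matrix.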

Step 3: Filter the data based on a condition

This step reproduces step 3 of the original analysis, which compares the decision to defect or cooperate across game conditions (Fluid, Viscous, Static, Random). The cell below first filters the matrix to the rows where round_num equals 1.


In [ ]:
analytic_id = "opals.select-from-dataframe.SelectByCondition.SelectByCondition"
inputData = {
    'matrix.csv': mtx,
    'features.txt': mtx
}

paramsData = [
    {"attrname":"colname","value":"round_num"},
    {"attrname":"comparator","value":"=="},
    {"attrname":"value","value":"1"}
]

filtered_mtx = api.run_analytic(analytic_id, mtx, 'rand_round1_only', input_data=inputData, parameter_data=paramsData)

filtered_mtx

Check that Matrix is filtered


In [ ]:
output = api.download_results_matrix('rand_mtx', 'rand_round1_only', 'matrix.csv', remote_header_file='features.txt')
output
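
To make the filter parameters concrete, here is a small local sketch of a row filter driven by the same three settings (colname, comparator, value). It assumes the Opal keeps the rows for which the comparison holds; the actual SelectByCondition implementation may differ, for example in how it converts types. `select_by_condition` is a hypothetical name.

```python
import operator

# Comparator strings mapped to the corresponding Python operators.
COMPARATORS = {
    "==": operator.eq, "!=": operator.ne,
    ">": operator.gt, ">=": operator.ge,
    "<": operator.lt, "<=": operator.le,
}


def _coerce(text):
    """Convert numeric-looking strings to float; leave other values as-is."""
    try:
        return float(text)
    except (TypeError, ValueError):
        return text


def select_by_condition(rows, colname, comparator, value):
    """Keep the rows (dicts) whose colname entry satisfies the comparison."""
    compare = COMPARATORS[comparator]
    target = _coerce(value)
    # Ordering comparisons assume the column and the value have the same type
    return [row for row in rows if compare(_coerce(row[colname]), target)]
```

With rows read through `csv.DictReader`, `select_by_condition(rows, "round_num", "==", "1")` returns the round 1 rows, which can be compared against the downloaded matrix.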

Step 4: Run Logit2 Analysis

Now call the Logit2 analytic on the filtered matrix. It fits a logistic regression of decision0d1c on condition, with standard errors clustered on sessionnum and playerid.


In [ ]:
analytic_id = "opals.logit2.Logit2.Logit2"
inputData = {
    'matrix.csv': filtered_mtx,
    'features.txt': filtered_mtx
}

paramsData = [
    {"attrname":"formula","value":"decision0d1c ~ condition"},
    {"attrname":"family","value":"binomial"},
    {"attrname":"clustered_rse","value":"sessionnum,playerid"}
]

result_mtx = api.run_analytic(analytic_id, mtx, 'rand_logit2_step3', input_data=inputData, parameter_data=paramsData)

result_mtx
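
Every analytic call in this notebook passes its settings as a list of attrname/value records. A small helper can build that list from keyword arguments; this is only a convenience sketch, and `make_params` is a hypothetical name rather than part of the Bedrock client.

```python
def make_params(**settings):
    """Convert keyword arguments into attrname/value records.

    Values are converted to strings because the cells in this notebook
    pass every value, including numbers, as a string.
    """
    return [
        {"attrname": name, "value": str(value)}
        for name, value in settings.items()
    ]
```

For instance, `make_params(formula="decision0d1c ~ condition", family="binomial", clustered_rse="sessionnum,playerid")` yields the same list as the paramsData literal in the cell above.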

Visualize the output of the analysis

Here the output of the analysis is downloaded; from here it can be visualized or exported.


In [ ]:
coef_table = api.download_results_matrix('rand_mtx', 'rand_logit2_step3', 'matrix.csv')
coef_table

In [ ]:
summary_table = api.download_results_matrix(result_mtx['src_id'], result_mtx['id'], 'summary.csv')
summary_table

Analysis

The output of this analysis shows how the game condition relates to the decision to defect or cooperate. The coefficients are reported as log-odds, along with Pr(z) values that indicate statistical significance. Only rows with round_num == 1 are included.
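
Log-odds are easier to interpret after conversion. The sketch below shows the two standard conversions: exponentiating a coefficient gives an odds ratio, and the inverse logit of a linear predictor gives a probability. It assumes decision0d1c is coded 0 for defect and 1 for cooperate, as the column name suggests; the function names are illustrative.

```python
import math


def odds_ratio(log_odds):
    """Exponentiate a log-odds coefficient to get an odds ratio."""
    return math.exp(log_odds)


def probability(log_odds):
    """Inverse logit: convert a log-odds value to a probability."""
    return 1.0 / (1.0 + math.exp(-log_odds))


# A log-odds of 0 corresponds to even odds, i.e. a probability of 0.5.
even_odds = odds_ratio(0.0)
even_probability = probability(0.0)
```

Applied to a coefficient table, `odds_ratio` of a condition coefficient gives the multiplicative change in the odds of cooperating relative to the reference condition, and `probability` of the intercept gives the estimated cooperation rate in the reference condition.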

The referenced paper used several other comparisons to evaluate different interactions. The following code repeats the procedure above for the remaining analyses.

Apply method to complete Rand2011 Analysis

The following cells replicate the remaining analyses from the Rand2011 study.

Summarize decision grouped on condition and round_num


In [ ]:
analytic_id = "opals.summarize.Summarize.Summarize"
inputData = {
    'matrix.csv': mtx,
    'features.txt': mtx
}

paramsData = [
    {"attrname":"groupby","value":"condition,round_num"},
    {"attrname":"columns","value":"decision0d1c"}
]

base_mtx = api.get_matrix_metadata('Rand2011','rand_mtx')

summary_mtx = api.run_analytic(analytic_id, base_mtx,'summarize_grouped', input_data=inputData, parameter_data=paramsData)
output = api.download_results_matrix(base_mtx['id'], summary_mtx['id'], 'matrix.csv')
output
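
The grouped summary can be cross-checked locally. Because decision0d1c appears to be a 0/1 indicator, its mean within each (condition, round_num) group is the cooperation rate for that group. The following standard-library sketch computes that mean; it assumes the mean is among the statistics the summarize Opal reports, and `cooperation_rate` is a hypothetical name.

```python
from collections import defaultdict


def cooperation_rate(rows, group_cols=("condition", "round_num"),
                     value_col="decision0d1c"):
    """Mean of value_col for each combination of group_cols."""
    totals = defaultdict(lambda: [0.0, 0])  # group key -> [sum, count]
    for row in rows:
        key = tuple(row[col] for col in group_cols)
        totals[key][0] += float(row[value_col])
        totals[key][1] += 1
    return {key: total / count for key, (total, count) in totals.items()}
```

Rows can come from `csv.DictReader` over the original csv file; the keys of the result are (condition, round_num) tuples.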

Compare round_num effect on decision


In [ ]:
analytic_id = "opals.logit2.Logit2.Logit2"
inputData = {
    'matrix.csv': base_mtx,
    'features.txt': base_mtx
}

paramsData = [
    {"attrname":"formula","value":"decision0d1c ~ round_num"},
    {"attrname":"family","value":"binomial"},
    {"attrname":"clustered_rse","value":"sessionnum,playerid"}
]

result_mtx = api.run_analytic(analytic_id, base_mtx, 'rand_logit2_step1', input_data=inputData, parameter_data=paramsData)
coef_table = api.download_results_matrix(base_mtx['id'], result_mtx['id'], 'matrix.csv')
coef_table

In [ ]:
summary_table = api.download_results_matrix(result_mtx['src_id'], result_mtx['id'], 'summary.csv')
summary_table
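
The remaining logit cells all follow the same three calls, so they could be wrapped in one function. This sketch mirrors the calls exactly as they appear in this notebook (`api.run_analytic` and `api.download_results_matrix` with the same arguments); `run_logit` itself is a hypothetical helper, and it has not been checked against the bedrock-core source.

```python
def run_logit(api, matrix, base_matrix, name, formula,
              clusters="sessionnum,playerid"):
    """Run Logit2 on `matrix` and return (coefficient table, summary table)."""
    input_data = {"matrix.csv": matrix, "features.txt": matrix}
    params = [
        {"attrname": "formula", "value": formula},
        {"attrname": "family", "value": "binomial"},
        {"attrname": "clustered_rse", "value": clusters},
    ]
    result = api.run_analytic("opals.logit2.Logit2.Logit2", matrix, name,
                              input_data=input_data, parameter_data=params)
    # Same download calls as the cells above
    coef = api.download_results_matrix(base_matrix["id"], result["id"],
                                       "matrix.csv")
    summary = api.download_results_matrix(result["src_id"], result["id"],
                                          "summary.csv")
    return coef, summary
```

With this helper, the cell above reduces to `coef_table, summary_table = run_logit(api, base_mtx, base_mtx, 'rand_logit2_step1', 'decision0d1c ~ round_num')`.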

Consider only num_neighbors > 0


In [ ]:
analytic_id = "opals.select-from-dataframe.SelectByCondition.SelectByCondition"
inputData = {
    'matrix.csv': base_mtx,
    'features.txt': base_mtx
}

paramsData = [
    {"attrname":"colname","value":"num_neighbors"},
    {"attrname":"comparator","value":">"},
    {"attrname":"value","value":"0"}
]

filtered_mtx = api.run_analytic(analytic_id, base_mtx, 'rand_has_neighbors', input_data=inputData, parameter_data=paramsData)

Summarize on filtered matrix


In [ ]:
analytic_id = "opals.summarize.Summarize.Summarize"
inputData = {
    'matrix.csv': filtered_mtx,
    'features.txt': filtered_mtx
}

paramsData = [
    {"attrname":"groupby","value":"condition,round_num"},
    {"attrname":"columns","value":"decision0d1c"}
]

summary_mtx = api.run_analytic(analytic_id, filtered_mtx,'summarize_grouped', input_data=inputData, parameter_data=paramsData)
output = api.download_results_matrix(base_mtx['id'], summary_mtx['id'], 'matrix.csv')
output

Compare round_num effect on decision only when there are neighbors


In [ ]:
analytic_id = "opals.logit2.Logit2.Logit2"
inputData = {
    'matrix.csv': filtered_mtx,
    'features.txt': filtered_mtx
}

paramsData = [
    {"attrname":"formula","value":"decision0d1c ~ round_num"},
    {"attrname":"family","value":"binomial"},
    {"attrname":"clustered_rse","value":"sessionnum,playerid"}
]

result_mtx = api.run_analytic(analytic_id, filtered_mtx, 'rand_logit2_step2', input_data=inputData, parameter_data=paramsData)
coef_table = api.download_results_matrix(base_mtx['id'], result_mtx['id'], 'matrix.csv')
coef_table

In [ ]:
summary_table = api.download_results_matrix(result_mtx['src_id'], result_mtx['id'], 'summary.csv')
summary_table

Compare effect of round_num and Fluid

Look at the effect of the round number and of whether the game is Fluid, including their interaction.


In [ ]:
analytic_id = "opals.logit2.Logit2.Logit2"
inputData = {
    'matrix.csv': base_mtx,
    'features.txt': base_mtx
}

paramsData = [
    {"attrname":"formula","value":"decision0d1c ~ fluid_dummy*round_num"},
    {"attrname":"family","value":"binomial"},
    {"attrname":"clustered_rse","value":"sessionnum,playerid"}
]

result_mtx = api.run_analytic(analytic_id, base_mtx, 'rand_logit2_step4', input_data=inputData, parameter_data=paramsData)
coef_table = api.download_results_matrix(base_mtx['id'], result_mtx['id'], 'matrix.csv')
coef_table

In [ ]:
summary_table = api.download_results_matrix(result_mtx['src_id'], result_mtx['id'], 'summary.csv')
summary_table
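
In R-style formula syntax, `a*b` is shorthand for `a + b + a:b`, so the formula above includes both main effects and their interaction. Assuming Logit2 follows that convention (the `C(..., Treatment(...))` syntax used elsewhere in this notebook suggests it does), the fitted model can be written as:

```latex
\operatorname{logit}\,\Pr(\text{decision0d1c} = 1)
  = \beta_0
  + \beta_1\,\text{fluid\_dummy}
  + \beta_2\,\text{round\_num}
  + \beta_3\,(\text{fluid\_dummy} \times \text{round\_num})
```

Here \beta_2 is the per-round change in log-odds when fluid_dummy is 0, and \beta_2 + \beta_3 is the per-round change when fluid_dummy is 1.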

Condition effect on decision for Round >= 7


In [ ]:
analytic_id = "opals.select-from-dataframe.SelectByCondition.SelectByCondition"
inputData = {
    'matrix.csv': base_mtx,
    'features.txt': base_mtx
}

paramsData = [
    {"attrname":"colname","value":"round_num"},
    {"attrname":"comparator","value":">="},
    {"attrname":"value","value":"7"}
]

filtered_mtx = api.run_analytic(analytic_id, base_mtx, 'rand_round7', input_data=inputData, parameter_data=paramsData)

In [ ]:
analytic_id = "opals.logit2.Logit2.Logit2"
inputData = {
    'matrix.csv': filtered_mtx,
    'features.txt': filtered_mtx
}

paramsData = [
    {"attrname":"formula","value":"decision0d1c ~ condition"},
    {"attrname":"family","value":"binomial"},
    {"attrname":"clustered_rse","value":"sessionnum,playerid"}
]

result_mtx = api.run_analytic(analytic_id, filtered_mtx, 'rand_logit2_step5', input_data=inputData, parameter_data=paramsData)
coef_table = api.download_results_matrix(base_mtx['id'], result_mtx['id'], 'matrix.csv')
coef_table

In [ ]:
summary_table = api.download_results_matrix(result_mtx['src_id'], result_mtx['id'], 'summary.csv')
summary_table

Fluid Effect on decision for Round >= 7


In [ ]:
analytic_id = "opals.logit2.Logit2.Logit2"
inputData = {
    'matrix.csv': filtered_mtx,
    'features.txt': filtered_mtx
}

paramsData = [
    {"attrname":"formula","value":"decision0d1c ~ C(fluid_dummy)"},
    {"attrname":"family","value":"binomial"},
    {"attrname":"clustered_rse","value":"sessionnum,playerid"}
]

result_mtx = api.run_analytic(analytic_id, filtered_mtx, 'rand_logit2_step6', input_data=inputData, parameter_data=paramsData)
coef_table = api.download_results_matrix(base_mtx['id'], result_mtx['id'], 'matrix.csv')
coef_table

In [ ]:
summary_table = api.download_results_matrix(result_mtx['src_id'], result_mtx['id'], 'summary.csv')
summary_table

Relevel on Random and Compare condition effect on decision


In [ ]:
analytic_id = "opals.logit2.Logit2.Logit2"
inputData = {
    'matrix.csv': base_mtx,
    'features.txt': base_mtx
}

paramsData = [
    {"attrname":"formula","value":"decision0d1c ~ C(condition, Treatment(reference='Random'))"},
    {"attrname":"family","value":"binomial"},
    {"attrname":"clustered_rse","value":"sessionnum,playerid"}
]

result_mtx = api.run_analytic(analytic_id, base_mtx, 'rand_logit2_step7', input_data=inputData, parameter_data=paramsData)
coef_table = api.download_results_matrix(base_mtx['id'], result_mtx['id'], 'matrix.csv')
# None removes the column-width limit (pandas >= 1.0); older releases used -1
pandas.set_option('display.max_colwidth', None)
coef_table

In [ ]:
summary_table = api.download_results_matrix(result_mtx['src_id'], result_mtx['id'], 'summary.csv')
summary_table
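
Releveling changes which condition serves as the baseline. Under treatment coding, the reference level gets no indicator column, and each remaining level gets a 0/1 column whose coefficient is measured relative to the reference. The sketch below illustrates that coding in plain Python; it is not how Logit2 builds its design matrix internally, and `treatment_code` is a hypothetical name.

```python
def treatment_code(values, reference):
    """Dummy-code `values`, dropping `reference` as the baseline level."""
    levels = sorted(set(values))
    if reference not in levels:
        raise ValueError("reference level %r not found" % (reference,))
    # One indicator column per non-reference level
    columns = [level for level in levels if level != reference]
    rows = [[1 if value == level else 0 for level in columns]
            for value in values]
    return columns, rows
```

With `reference='Random'`, a row in the Random condition is all zeros, so the intercept describes Random and every other coefficient is a contrast against it.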

Relevel on Static and Compare condition effect on decision


In [ ]:
analytic_id = "opals.logit2.Logit2.Logit2"
inputData = {
    'matrix.csv': base_mtx,
    'features.txt': base_mtx
}

paramsData = [
    {"attrname":"formula","value":"decision0d1c ~ C(condition, Treatment(reference='Static'))"},
    {"attrname":"family","value":"binomial"},
    {"attrname":"clustered_rse","value":"sessionnum,playerid"}
]

result_mtx = api.run_analytic(analytic_id, base_mtx, 'rand_logit2_step8', input_data=inputData, parameter_data=paramsData)
coef_table = api.download_results_matrix(base_mtx['id'], result_mtx['id'], 'matrix.csv')
# None removes the column-width limit (pandas >= 1.0); older releases used -1
pandas.set_option('display.max_colwidth', None)
coef_table

In [ ]:
summary_table = api.download_results_matrix(result_mtx['src_id'], result_mtx['id'], 'summary.csv')
summary_table

Relevel on Random and round_num >= 7


In [ ]:
analytic_id = "opals.logit2.Logit2.Logit2"
inputData = {
    'matrix.csv': filtered_mtx,
    'features.txt': filtered_mtx
}

paramsData = [
    {"attrname":"formula","value":"decision0d1c ~ C(condition, Treatment(reference='Random'))"},
    {"attrname":"family","value":"binomial"},
    {"attrname":"clustered_rse","value":"sessionnum,playerid"}
]

result_mtx = api.run_analytic(analytic_id, filtered_mtx, 'rand_logit2_step9', input_data=inputData, parameter_data=paramsData)
coef_table = api.download_results_matrix(base_mtx['id'], result_mtx['id'], 'matrix.csv')
coef_table

In [ ]:
summary_table = api.download_results_matrix(result_mtx['src_id'], result_mtx['id'], 'summary.csv')
summary_table

Relevel on Static and round_num >= 7


In [ ]:
analytic_id = "opals.logit2.Logit2.Logit2"
inputData = {
    'matrix.csv': filtered_mtx,
    'features.txt': filtered_mtx
}

paramsData = [
    {"attrname":"formula","value":"decision0d1c ~ C(condition, Treatment(reference='Static'))"},
    {"attrname":"family","value":"binomial"},
    {"attrname":"clustered_rse","value":"sessionnum,playerid"}
]

result_mtx = api.run_analytic(analytic_id, filtered_mtx, 'rand_logit2_step10', input_data=inputData, parameter_data=paramsData)
coef_table = api.download_results_matrix(base_mtx['id'], result_mtx['id'], 'matrix.csv')
coef_table

In [ ]:
summary_table = api.download_results_matrix(result_mtx['src_id'], result_mtx['id'], 'summary.csv')
summary_table

Subset on Fluid Condition and look at effect of num_neighbors on decision


In [ ]:
analytic_id = "opals.select-from-dataframe.SelectByCondition.SelectByCondition"
inputData = {
    'matrix.csv': base_mtx,
    'features.txt': base_mtx
}

paramsData = [
    {"attrname":"colname","value":"condition"},
    {"attrname":"comparator","value":"=="},
    {"attrname":"value","value":"Fluid"}
]

filtered_mtx = api.run_analytic(analytic_id, base_mtx, 'rand_fluid_only', input_data=inputData, parameter_data=paramsData)

In [ ]:
analytic_id = "opals.logit2.Logit2.Logit2"
inputData = {
    'matrix.csv': filtered_mtx,
    'features.txt': filtered_mtx
}

paramsData = [
    {"attrname":"formula","value":"decision0d1c ~ C(num_neighbors)"},
    {"attrname":"family","value":"binomial"},
    {"attrname":"clustered_rse","value":"sessionnum,playerid"}
]

result_mtx = api.run_analytic(analytic_id, filtered_mtx, 'rand_logit2_step11', input_data=inputData, parameter_data=paramsData)
coef_table = api.download_results_matrix(base_mtx['id'], result_mtx['id'], 'matrix.csv')
coef_table

In [ ]:
summary_table = api.download_results_matrix(result_mtx['src_id'], result_mtx['id'], 'summary.csv')
summary_table
