In this notebook, we investigate how frequently personal attacks occur on Wikipedia. We aim to provide insight into the following questions:
What fraction of comments are personal attacks?
What fraction of users have made a personal attack?
What fraction of users have been attacked on their user page?
Are there any temporal trends in the frequency of attacks?
We have two separate types of data at our disposal. First, we have a random sample of roughly 100k human-labeled comments; each comment was annotated by 10 separate people as to whether it is a personal attack, and we take the majority annotation to get a single label. Second, we have the full history of comments with scores generated by a machine learning model. Due to the construction of the model, these scores can be interpreted as the probability that the majority of annotators would label the comment as an attack.
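To make the labeling concrete: for the human-annotated sample, a single per-comment label can be derived by majority vote over the 10 annotations. The sketch below shows the idea on a toy long-format table; the column names (rev_id, attack) are hypothetical and may differ from the schema behind load_utils.

# Toy sketch of majority-vote labeling; rev_id / attack are hypothetical column names.
import pandas as pd

annotations = pd.DataFrame({
    'rev_id': [101, 101, 101, 202, 202, 202],
    'attack': [1, 1, 0, 0, 0, 1],
})
# A comment is labeled an attack if more than half of its annotators flagged it.
majority_label = (annotations.groupby('rev_id')['attack'].mean() > 0.5).astype(int)
print(majority_label)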
In [1]:
%load_ext autoreload
%autoreload 2
%matplotlib inline
import warnings
warnings.filterwarnings('ignore')
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np
import pandas as pd
from load_utils import *
from analysis_utils import compare_groups
In [2]:
d = load_diffs()
df_events, df_blocked_user_text = load_block_events_and_users()
In [3]:
# decision threshold on the uncalibrated attack score; comments scoring above it are flagged as attacks
er_t = 0.425
d['2015']['pred_attack'] = (d['2015']['pred_attack_score_uncalibrated'] > er_t).astype(int)
d['sample']['pred_attack'] = (d['sample']['pred_attack_score_uncalibrated'] > er_t).astype(int)
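The cutoff er_t = 0.425 is taken as given here. For context, one common way to choose such a cutoff is to pick the score at which the false positive and false negative rates on the labeled sample are roughly equal; the sketch below illustrates that idea on hypothetical y_true / scores arrays and is not necessarily how er_t was actually chosen.

import numpy as np

def equal_error_threshold(y_true, scores, grid=np.linspace(0, 1, 201)):
    # Scan candidate cutoffs and return the one minimizing |FPR - FNR| (a sketch).
    y_true = np.asarray(y_true)
    scores = np.asarray(scores)
    best_t, best_gap = 0.5, np.inf
    for t in grid:
        pred = scores > t
        fpr = pred[y_true == 0].mean()       # false positive rate
        fnr = 1 - pred[y_true == 1].mean()   # false negative rate
        if abs(fpr - fnr) < best_gap:
            best_t, best_gap = t, abs(fpr - fnr)
    return best_t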
In [4]:
### calibration based
In [5]:
100 * d['2015']['pred_attack_score_calibrated'].mean()
Out[5]:
In [6]:
100 * d['2015'].groupby('ns')['pred_attack_score_calibrated'].mean()
Out[6]:
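Averaging the calibrated scores is a reasonable prevalence estimate because each calibrated score approximates the probability that the majority of annotators would call the comment an attack, so its average over comments estimates the attack fraction without committing to a hard threshold. A toy illustration with made-up probabilities:

# Toy illustration (made-up numbers): the mean of calibrated per-comment
# probabilities estimates the expected % of attacking comments.
calibrated = np.array([0.01, 0.02, 0.90, 0.05, 0.02])
print(100 * calibrated.mean())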
In [7]:
### threshold based
In [8]:
100 * d['2015']['pred_attack'].mean()
Out[8]:
In [9]:
100 * d['2015'].groupby('ns')['pred_attack'].mean()
Out[9]:
In [10]:
ks = [1, 3, 5]
attacker_ys = []
receiver_ys = []
# % of authors who made at least k comments classified as attacks in 2015
for k in ks:
    attacker_ys.append((d['2015'].groupby('user_text')['pred_attack'].sum() >= k).mean() * 100)
# % of users whose user page received at least k attacking comments from someone else
for k in ks:
    receiver_ys.append((d['2015'].query("ns == 'user' and user_text != page_title").groupby('page_title')['pred_attack'].sum() >= k).mean() * 100)
In [11]:
df_sns = pd.DataFrame()
df_sns['k'] = ks
df_sns['attackers'] = attacker_ys
df_sns['victims'] = receiver_ys
In [12]:
plt.figure()
sns.set(font_scale=1.5)
f, (ax1, ax2) = plt.subplots(2, sharex=True)
sns.barplot("k", y="attackers", data=df_sns, ax=ax1, color = 'darkblue' , label = "% of users who made \n at least k attacks")
sns.barplot("k", y="victims", data=df_sns, ax=ax2, color = 'darkred', label = "% of users who received \n at least k attacks")
ax1.set( xlabel = '' , ylabel = '%')
ax2.set( ylabel = '%')
ax1.legend()
ax2.legend()
plt.savefig('../../paper/figs/attacker_and_victim_prevalence.png')
In [13]:
plt.figure()
sns.set(font_scale=1.5)
f, (ax1, ax2) = plt.subplots(1, 2, figsize=(8,4))
sns.barplot("k", y="attackers", data=df_sns, ax=ax1, color = 'darkblue' )
sns.barplot("k", y="victims", data=df_sns, ax=ax2, color = 'darkred')
ax1.set( ylabel = '%')
ax2.set( ylabel = '')
plt.text(- 3.0, 1.65,'% of users \nwho made \nat least k attacks',fontsize = 15)
plt.text(0.6, 1.65,'% of users \nwho received \nat least k attacks',fontsize = 15)
ax2.legend()
plt.savefig('../../paper/figs/attacker_and_victim_prevalence.png')
In [14]:
def simulate_num_attacks_within_group(df, group_col='user_text'):
    # Monte Carlo draw: treat each comment's calibrated score as the probability
    # that it is an attack, sample a Bernoulli outcome per comment, and count
    # the simulated attacks within each group (e.g. per author or per user page).
    return df.assign(uniform=np.random.rand(df.shape[0]))\
             .assign(is_attack=lambda x: (x.pred_attack_score_calibrated >= x.uniform).astype(int))\
             .groupby(group_col)['is_attack']\
             .sum()
In [15]:
def get_within_group_metric_interval(df, group_col='user_text', metric=lambda x: (x >= 1).astype(int).mean() * 100, iters=2):
    # Repeat the simulation `iters` times and report the 2.5th and 97.5th
    # percentiles of the chosen metric as an uncertainty interval.
    results = []
    for i in range(iters):
        result = simulate_num_attacks_within_group(df, group_col=group_col)
        result = metric(result)
        results.append(result)
    return np.percentile(results, [2.5, 97.5])
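The resampling idea can be seen in isolation on a toy example: draw a Bernoulli outcome per comment from made-up calibrated probabilities, compute a summary metric, and take percentiles over repeated draws.

# Toy, self-contained version of the simulation interval (made-up probabilities).
probs = np.array([0.02, 0.10, 0.70, 0.05])
sims = [(np.random.rand(probs.size) < probs).sum() for _ in range(1000)]
print(np.percentile(sims, [2.5, 97.5]))  # 95% simulation interval for the attack count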
In [16]:
def get_intervals(df, group_col='user_text', iters=10):
    # For k = 1..5, estimate the % of groups with at least k simulated attacks,
    # along with a 95% simulation interval.
    ks = range(1, 6)
    y = []
    lower = []
    upper = []
    intervals = []
    for k in ks:
        metric = lambda x, k=k: (x >= k).astype(int).mean() * 100
        interval = get_within_group_metric_interval(df, group_col=group_col, iters=iters, metric=metric)
        intervals.append(interval)
        y.append(interval.mean())
        lower.append(interval.mean() - interval[0])
        upper.append(interval[1] - interval.mean())
    return pd.DataFrame({'k': ks, 'y': y, 'interval': intervals, 'lower': lower, 'upper': upper})
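The lower and upper columns are shaped for asymmetric error bars. One way to visualize the intervals (a sketch, not part of the original analysis) would be:

# Sketch: plot simulated prevalence estimates with 95% intervals as error bars.
res = get_intervals(d['2015'])
plt.errorbar(res['k'], res['y'], yerr=[res['lower'], res['upper']], fmt='o')
plt.xlabel('k')
plt.ylabel('% of users with at least k simulated attacks')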
In [17]:
get_intervals(d['2015'])
Out[17]:
In [18]:
# ignore anon users
get_intervals(d['2015'].query('not author_anon and not recipient_anon'))
Out[18]:
In [19]:
get_intervals(d['2015'].query("ns=='user'"), group_col = 'page_title')
Out[19]:
In [20]:
# ignore anon users
get_intervals(d['2015'].query("not author_anon and not recipient_anon and ns=='user'"), group_col = 'page_title')
Out[20]:
In [21]:
# restrict the random sample to comments made from 2004 through 2015
df_span = d['sample'].query('year > 2003 & year < 2016')
In [22]:
plt.figure(figsize=(8,4))
sns.set(font_scale=1.5)
x = 'year'
s = df_span.groupby(x)['pred_attack'].mean() * 100
plt.plot(s.index, s.values)
plt.xlabel(x)
plt.ylabel('Percent of comments that are attacks')
plt.savefig('../../paper/figs/prevalence_by_year.png')
In [23]:
plt.figure(figsize=(8,4))
sns.set(font_scale=1.5)
x = 'year'
s = df_span.groupby(x)['pred_attack_score_calibrated'].mean() * 100
plt.plot(s.index, s.values)
plt.xlabel('')
plt.ylabel('Percent of comments that are attacks')
#plt.savefig('../../paper/figs/prevalence_by_year.png')
Out[23]:
There is a clear temporal trend: the fraction of comments that are personal attacks peaked in 2008, which is also when overall participation peaked.