Prevalence of Personal Attacks

In this notebook, we do some basic investigation into the frequency of personal attacks on Wikipedia. We will attempt to provide some insight into the following questions:

What fraction of comments are personal attacks?
What fraction of users have made a personal attack?
What fraction of users have been attacked on their user page?
Are there any temporal trends in the frequency of attacks?

We have 2 separate types of data at our disposal. First, we have a random sample of roughly 100k human-labeled comments. Each comment was annotated by 10 separate people as to whether the comment is a personal attack. We take the majority annotation class to get a single label. Second, we have the full history of comments with scores generated by a machine learning model. Due to the construction of the model, these scores can be interpreted as the probability that the majority of annotators would label the comment as an attack.



In [1]:

    
%load_ext autoreload
%autoreload 2
%matplotlib inline
import warnings
warnings.filterwarnings('ignore')

import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np
import pandas as pd
from load_utils import *
from analysis_utils import compare_groups



In [2]:

    
d = load_diffs()
df_events, df_blocked_user_text = load_block_events_and_users()



In [3]:

    
er_t = 0.425
d['2015']['pred_attack'] = (d['2015']['pred_attack_score_uncalibrated'] > er_t).astype(int)
d['sample']['pred_attack'] = (d['sample']['pred_attack_score_uncalibrated'] > er_t).astype(int)

Q: What fraction of comments are personal attacks?



In [4]:

    
# calibration based



In [5]:

    
100 * d['2015']['pred_attack_score_calibrated'].mean()









    Out[5]:





0.7219126397993685



In [6]:

    
100 * d['2015'].groupby('ns')['pred_attack_score_calibrated'].mean()









    Out[6]:





ns
article    0.420763
user       1.042068
Name: pred_attack_score_calibrated, dtype: float64



In [7]:

    
### threshold based



In [8]:

    
100 * d['2015']['pred_attack'].mean()









    Out[8]:





0.7142112371277175



In [9]:

    
100 * d['2015'].groupby('ns')['pred_attack'].mean()









    Out[9]:





ns
article    0.397500
user       1.050909
Name: pred_attack, dtype: float64

Q: What fraction of users have made/received at least k personal attacks?

1. threshold based



In [10]:

    
ks = [1,3,5]
attacker_ys = []
receiver_ys = []

for k in ks:
    attacker_ys.append(d['2015'].groupby('user_text')['pred_attack'].sum().apply(lambda x: x >= k).mean() * 100)
    
for k in ks:
    receiver_ys.append(d['2015'].query("ns=='user'and user_text != page_title").groupby('page_title')['pred_attack'].sum().apply(lambda x: x >= k).mean() * 100)



In [11]:

    
df_sns = pd.DataFrame()
df_sns['k'] = ks
df_sns['attackers'] = attacker_ys
df_sns['victims'] = receiver_ys



In [12]:

    
plt.figure()

sns.set(font_scale=1.5)
f, (ax1, ax2) = plt.subplots(2, sharex=True)
sns.barplot("k", y="attackers", data=df_sns, ax=ax1, color =  'darkblue' , label = "% of users who made     \n at least k attacks")
sns.barplot("k", y="victims", data=df_sns, ax=ax2,  color =  'darkred', label =    "% of users who received \n at least k attacks")
ax1.set( xlabel = '' , ylabel = '%')
ax2.set( ylabel = '%')

ax1.legend()
ax2.legend()
plt.savefig('../../paper/figs/attacker_and_victim_prevalence.png')









    





<matplotlib.figure.Figure at 0x157fd77b8>



In [13]:

    
plt.figure()

sns.set(font_scale=1.5)
f, (ax1, ax2) = plt.subplots(1, 2, figsize=(8,4))
sns.barplot("k", y="attackers", data=df_sns, ax=ax1, color =  'darkblue' )
sns.barplot("k", y="victims", data=df_sns, ax=ax2,  color =  'darkred')
ax1.set( ylabel = '%')
ax2.set( ylabel = '')
plt.text(- 3.0, 1.65,'% of users \nwho made \nat least k attacks',fontsize = 15)
plt.text(0.6, 1.65,'% of users \nwho received \nat least k attacks',fontsize = 15)

ax2.legend()
plt.savefig('../../paper/figs/attacker_and_victim_prevalence.png')









    





<matplotlib.figure.Figure at 0x1a4c405c0>

2. calibration based

Take unsampled data. For each comment, let it be an attack with probability equal to the model prediction. Count the number of users that have made at least 1 attack. Repeat.



In [14]:

    
def simulate_num_attacks_within_group(df, group_col = 'user_text'):
    return df.assign( uniform = np.random.rand(df.shape[0], 1))\
        .assign(is_attack = lambda x: (x.pred_attack_score_calibrated >= x.uniform).astype(int))\
        .groupby(group_col)['is_attack']\
        .sum()



In [15]:

    
def get_within_group_metric_interval(df, group_col = 'user_text', metric = lambda x: (x>=1).astype(int).mean() * 100, iters = 2):
    results = []
    for i in range(iters):
        result = simulate_num_attacks_within_group(df, group_col = group_col)
        result = metric(result)
        results.append(result)
    return np.percentile(results, [2.5, 97.5])



In [16]:

    
def get_intervals(df, group_col = 'user_text', iters = 10):

    ks = range(1,6)
    y =[]
    lower = []
    upper = []
    intervals = []

    for k in ks:
        metric = lambda x: (x>=k).astype(int).mean() * 100
        interval = get_within_group_metric_interval(d['2015'], group_col = group_col, iters = iters, metric=metric)
        intervals.append(interval)
        y.append(interval.mean())
        lower.append(interval.mean() - interval[0])
        upper.append(interval[1] - interval.mean())
        
    return pd.DataFrame({'k': ks, 'y': y, 'interval': intervals, 'lower': lower, 'upper': upper})



In [17]:

    
get_intervals(d['2015'])









    Out[17]:






  
    
      
      interval
      k
      lower
      upper
      y
    
  
  
    
      0
      [3.85717147436, 3.90753777473]
      1
      0.025183
      0.025183
      3.882355
    
    
      1
      [0.855170470555, 0.885989010989]
      2
      0.015409
      0.015409
      0.870580
    
    
      2
      [0.4048015286, 0.41891201747]
      3
      0.007055
      0.007055
      0.411857
    
    
      3
      [0.234760231755, 0.246933555227]
      4
      0.006087
      0.006087
      0.240847
    
    
      4
      [0.153465324739, 0.165781734291]
      5
      0.006158
      0.006158
      0.159624



In [18]:

    
# ignore anon users
get_intervals(d['2015'].query('not author_anon and not recipient_anon'))









    Out[18]:






  
    
      
      interval
      k
      lower
      upper
      y
    
  
  
    
      0
      [3.85692932868, 3.90649214567]
      1
      0.024781
      0.024781
      3.881711
    
    
      1
      [0.857867092843, 0.899615208509]
      2
      0.020874
      0.020874
      0.878741
    
    
      2
      [0.403095502254, 0.41863685193]
      3
      0.007771
      0.007771
      0.410866
    
    
      3
      [0.236576324317, 0.254219938715]
      4
      0.008822
      0.008822
      0.245398
    
    
      4
      [0.155666649056, 0.165880793886]
      5
      0.005107
      0.005107
      0.160774



In [19]:

    
get_intervals(d['2015'].query("ns=='user'"), group_col = 'page_title')









    Out[19]:






  
    
      
      interval
      k
      lower
      upper
      y
    
  
  
    
      0
      [1.90727792059, 1.93320765496]
      1
      0.012965
      0.012965
      1.920243
    
    
      1
      [0.437063696087, 0.455115681234]
      2
      0.009026
      0.009026
      0.446090
    
    
      2
      [0.217766352471, 0.223427592117]
      3
      0.002831
      0.002831
      0.220597
    
    
      3
      [0.131648100543, 0.138423307626]
      4
      0.003388
      0.003388
      0.135036
    
    
      4
      [0.0878263353328, 0.0956640959726]
      5
      0.003919
      0.003919
      0.091745



In [20]:

    
# ignore anon users
get_intervals(d['2015'].query("not author_anon and not recipient_anon and ns=='user'"), group_col = 'page_title')









    Out[20]:






  
    
      
      interval
      k
      lower
      upper
      y
    
  
  
    
      0
      [1.90000571265, 1.9395944016]
      1
      0.019794
      0.019794
      1.919800
    
    
      1
      [0.434264495858, 0.449848614682]
      2
      0.007792
      0.007792
      0.442057
    
    
      2
      [0.214721508141, 0.225524135961]
      3
      0.005401
      0.005401
      0.220123
    
    
      3
      [0.130025706941, 0.140451299629]
      4
      0.005213
      0.005213
      0.135239
    
    
      4
      [0.0875007140817, 0.095332762068]
      5
      0.003916
      0.003916
      0.091417

Q: Has the proportion of attacks changed year over year?



In [21]:

    
df_span = d['sample'].query('year > 2003 & year < 2016')



In [22]:

    
plt.figure(figsize=(8,4))
sns.set(font_scale=1.5)
x = 'year'
s = df_span.groupby(x)['pred_attack'].mean() * 100
plt.plot(s.index, s.values)
plt.xlabel(x)
plt.ylabel('Percent of comments that are attacks')
plt.savefig('../../paper/figs/prevalence_by_year.png')



In [23]:

    
plt.figure(figsize=(8,4))
sns.set(font_scale=1.5)
x = 'year'
s = df_span.groupby(x)['pred_attack_score_calibrated'].mean() * 100
plt.plot(s.index, s.values)
plt.xlabel('')
plt.ylabel('Percent of comments that are attacks')
#plt.savefig('../../paper/figs/prevalence_by_year.png')









    Out[23]:





<matplotlib.text.Text at 0x17ead9a90>

There is a strong yearly pattern. The fraction of attacks peaked in 2008, which is when participation peaked as well.

	interval	k	lower	upper	y
0	[3.85717147436, 3.90753777473]	1	0.025183	0.025183	3.882355
1	[0.855170470555, 0.885989010989]	2	0.015409	0.015409	0.870580
2	[0.4048015286, 0.41891201747]	3	0.007055	0.007055	0.411857
3	[0.234760231755, 0.246933555227]	4	0.006087	0.006087	0.240847
4	[0.153465324739, 0.165781734291]	5	0.006158	0.006158	0.159624

	interval	k	lower	upper	y
0	[3.85692932868, 3.90649214567]	1	0.024781	0.024781	3.881711
1	[0.857867092843, 0.899615208509]	2	0.020874	0.020874	0.878741
2	[0.403095502254, 0.41863685193]	3	0.007771	0.007771	0.410866
3	[0.236576324317, 0.254219938715]	4	0.008822	0.008822	0.245398
4	[0.155666649056, 0.165880793886]	5	0.005107	0.005107	0.160774

	interval	k	lower	upper	y
0	[1.90727792059, 1.93320765496]	1	0.012965	0.012965	1.920243
1	[0.437063696087, 0.455115681234]	2	0.009026	0.009026	0.446090
2	[0.217766352471, 0.223427592117]	3	0.002831	0.002831	0.220597
3	[0.131648100543, 0.138423307626]	4	0.003388	0.003388	0.135036
4	[0.0878263353328, 0.0956640959726]	5	0.003919	0.003919	0.091745

	interval	k	lower	upper	y
0	[1.90000571265, 1.9395944016]	1	0.019794	0.019794	1.919800
1	[0.434264495858, 0.449848614682]	2	0.007792	0.007792	0.442057
2	[0.214721508141, 0.225524135961]	3	0.005401	0.005401	0.220123
3	[0.130025706941, 0.140451299629]	4	0.005213	0.005213	0.135239
4	[0.0875007140817, 0.095332762068]	5	0.003916	0.003916	0.091417