Fact-Checking Facebook Politics Pages — Analysis

See this page for context.

Prepare data


In [1]:
import pandas as pd

In [2]:
# Format fractions (0 to 1) as percentage strings rounded to one decimal place.
percentify = lambda x: (x * 100).round(1).astype(str) + "%"

In [3]:
posts = pd.read_csv("../data/facebook-fact-check.csv")

In [4]:
len(posts)


Out[4]:
2282

In [5]:
ENGAGEMENT_COLS = [
    "share_count",
    "reaction_count",
    "comment_count"
]

In [6]:
RATINGS = ["mostly false", "mixture of true and false", "mostly true", "no factual content"]
FACTUAL_RATINGS = ["mostly false", "mixture of true and false", "mostly true"]

In [7]:
category_grp = posts.groupby("Category")
page_grp = posts.groupby([ "Category", "Page" ])
type_grp = posts.groupby([ "Category", "Page", "Post Type" ])

Rating by category

Counts:


In [8]:
rating_by_category = category_grp["Rating"].value_counts().unstack()[RATINGS].fillna(0)
rating_by_category["total"] = rating_by_category.sum(axis=1)
rating_by_category


Out[8]:
Rating      mostly false  mixture of true and false  mostly true  no factual content  total
Category
left                  22                         68          265                 116    471
mainstream             0                          8         1085                  52   1145
right                 82                        169          319                  96    666

Percentages of all posts in each category:


In [9]:
(rating_by_category[RATINGS].T / rating_by_category[RATINGS].sum(axis=1)).T\
    .pipe(percentify)


Out[9]:
            mostly false  mixture of true and false  mostly true  no factual content
Category
left                4.7%                      14.4%        56.3%               24.6%
mainstream          0.0%                       0.7%        94.8%                4.5%
right              12.3%                      25.4%        47.9%               14.4%
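
As a cross-check on the table above, here is a minimal sketch that rebuilds the row percentages from the counts shown in Out[8]. It uses `DataFrame.div(..., axis=0)` instead of the notebook's double transpose; the names `counts`, `row_shares`, and `row_pct` are illustrative and do not appear in the original notebook.

```python
import pandas as pd

RATINGS = ["mostly false", "mixture of true and false", "mostly true", "no factual content"]

# Counts copied from the Out[8] table above.
counts = pd.DataFrame(
    [[22, 68, 265, 116], [0, 8, 1085, 52], [82, 169, 319, 96]],
    index=pd.Index(["left", "mainstream", "right"], name="Category"),
    columns=RATINGS,
)

# Divide each row by its own total; axis=0 aligns the divisor on the row index.
row_shares = counts.div(counts.sum(axis=1), axis=0)

# Express the shares as percentages rounded to one decimal place.
row_pct = (row_shares * 100).round(1)
print(row_pct)
```

Each row of `row_shares` sums to 1, which is a quick way to confirm that the normalization ran across ratings within a category rather than across categories.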

Percentages of posts in each category, excluding those rated "no factual content":


In [10]:
(rating_by_category[FACTUAL_RATINGS].T / rating_by_category[FACTUAL_RATINGS].sum(axis=1)).T\
    .pipe(percentify)


Out[10]:
            mostly false  mixture of true and false  mostly true
Category
left                6.2%                      19.2%        74.6%
mainstream          0.0%                       0.7%        99.3%
right              14.4%                      29.6%        56.0%

Rating by page

Counts:


In [11]:
rating_by_page = page_grp["Rating"].value_counts().unstack()[RATINGS].fillna(0)
rating_by_page["total"] = rating_by_page.sum(axis=1)
rating_by_page


Out[11]:
Rating                        mostly false  mixture of true and false  mostly true  no factual content  total
Category   Page
left       Addicting Info                8                         25           96                  11    140
           Occupy Democrats              9                         33          102                  65    209
           The Other 98%                 5                         10           67                  40    122
mainstream ABC News Politics             0                          2          172                  26    200
           CNN Politics                  0                          4          385                  20    409
           Politico                      0                          2          528                   6    536
right      Eagle Rising                 30                         54          121                  81    286
           Freedom Daily                26                         26           56                   4    112
           Right Wing News              26                         89          142                  11    268

Percentages of all posts on each page:


In [12]:
(rating_by_page[RATINGS].T / rating_by_page[RATINGS].sum(axis=1)).T\
    .pipe(percentify)


Out[12]:
                              mostly false  mixture of true and false  mostly true  no factual content
Category   Page
left       Addicting Info             5.7%                      17.9%        68.6%                7.9%
           Occupy Democrats           4.3%                      15.8%        48.8%               31.1%
           The Other 98%              4.1%                       8.2%        54.9%               32.8%
mainstream ABC News Politics          0.0%                       1.0%        86.0%               13.0%
           CNN Politics               0.0%                       1.0%        94.1%                4.9%
           Politico                   0.0%                       0.4%        98.5%                1.1%
right      Eagle Rising              10.5%                      18.9%        42.3%               28.3%
           Freedom Daily             23.2%                      23.2%        50.0%                3.6%
           Right Wing News            9.7%                      33.2%        53.0%                4.1%

Percentages of posts on each page, excluding those rated "no factual content":


In [13]:
(rating_by_page[FACTUAL_RATINGS].T / rating_by_page[FACTUAL_RATINGS].sum(axis=1)).T\
    .pipe(percentify)


Out[13]:
                              mostly false  mixture of true and false  mostly true
Category   Page
left       Addicting Info             6.2%                      19.4%        74.4%
           Occupy Democrats           6.2%                      22.9%        70.8%
           The Other 98%              6.1%                      12.2%        81.7%
mainstream ABC News Politics          0.0%                       1.1%        98.9%
           CNN Politics               0.0%                       1.0%        99.0%
           Politico                   0.0%                       0.4%        99.6%
right      Eagle Rising              14.6%                      26.3%        59.0%
           Freedom Daily             24.1%                      24.1%        51.9%
           Right Wing News           10.1%                      34.6%        55.3%

Number of posts by date

Counts:


In [14]:
posts_by_date_by_category = category_grp["Date Published"].value_counts().unstack()
posts_by_date_by_category["Avg. Per Day"] = posts_by_date_by_category.mean(axis=1).round(0)
posts_by_date_by_category


Out[14]:
Date Published  2016-09-19  2016-09-20  2016-09-21  2016-09-22  2016-09-23  2016-09-26  2016-09-27  Avg. Per Day
Category
left                    55          70          58          54          66          80          88            67
mainstream             154         156         151         146         135         223         180           164
right                   97          91          97          93          93         100          95            95

In [15]:
posts_by_date_by_page = page_grp["Date Published"].value_counts().unstack()
posts_by_date_by_page["Avg. Per Day"] = posts_by_date_by_page.mean(axis=1).round(0)
posts_by_date_by_page


Out[15]:
Date Published                2016-09-19  2016-09-20  2016-09-21  2016-09-22  2016-09-23  2016-09-26  2016-09-27  Avg. Per Day
Category   Page
left       Addicting Info             22          18          17          21          22          17          23            20
           Occupy Democrats           20          30          20          19          29          47          44            30
           The Other 98%              13          22          21          14          15          16          21            17
mainstream ABC News Politics          36          22          23          21          22          47          29            29
           CNN Politics               54          61          53          62          48          66          65            58
           Politico                   64          73          75          63          65         110          86            77
right      Eagle Rising               41          41          42          41          41          41          39            41
           Freedom Daily              19          16          17          15          15          15          15            16
           Right Wing News            37          34          38          37          37          44          41            38

Rating by post type


In [16]:
rating_by_post_type = type_grp["Rating"].value_counts().unstack()[RATINGS].fillna(0)
rating_by_post_type["total"] = rating_by_post_type.sum(axis=1)
rating_by_post_type


Out[16]:
Rating                                  mostly false  mixture of true and false  mostly true  no factual content  total
Category   Page              Post Type
left       Addicting Info    link                  8                         25           94                   7    134
                             photo                 0                          0            1                   3      4
                             video                 0                          0            1                   1      2
           Occupy Democrats  link                  7                         26           60                   1     94
                             photo                 2                          2           25                  49     78
                             video                 0                          5           17                  15     37
           The Other 98%     link                  1                          7           40                   3     51
                             photo                 4                          0           11                  26     41
                             video                 0                          3           16                  11     30
mainstream ABC News Politics link                  0                          2          104                   2    108
                             photo                 0                          0           13                   0     13
                             text                  0                          0            1                   0      1
                             video                 0                          0           54                  24     78
           CNN Politics      link                  0                          4          316                  10    330
                             photo                 0                          0            3                   5      8
                             text                  0                          0            1                   0      1
                             video                 0                          0           65                   5     70
           Politico          link                  0                          2          458                   3    463
                             photo                 0                          0            5                   2      7
                             text                  0                          0            1                   0      1
                             video                 0                          0           64                   1     65
right      Eagle Rising      link                 27                         50          117                  37    231
                             photo                 3                          3            3                  37     46
                             video                 0                          1            1                   7      9
           Freedom Daily     link                 26                         25           56                   4    111
                             text                  0                          1            0                   0      1
           Right Wing News   link                 26                         88          140                   4    258
                             photo                 0                          1            2                   7     10

Engagement

Count of missing engagement figures:


In [17]:
posts[ENGAGEMENT_COLS].isnull().sum()


Out[17]:
share_count       70
reaction_count     2
comment_count      2
dtype: int64
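
Since some engagement figures are missing, it may help to recall how pandas treats them in the tables that follow: `median()` and `mean()` skip missing values by default, `count()` counts only non-null values, and `size()` counts every row. The sketch below illustrates this on hypothetical toy data; the page names and numbers are made up and are not drawn from the dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: page "A" has one missing share_count.
toy = pd.DataFrame({
    "Page": ["A", "A", "A", "B", "B"],
    "share_count": [10.0, np.nan, 30.0, 5.0, 15.0],
})

grouped = toy.groupby("Page")["share_count"]

medians = grouped.median()     # NaN values are skipped when computing the median
non_missing = grouped.count()  # number of non-null values per group
group_sizes = grouped.size()   # number of rows per group, including NaN rows

print(pd.DataFrame({"median": medians, "count": non_missing, "size": group_sizes}))
```

A practical consequence is that a count produced with `size()` can be larger than the number of posts that actually contributed to a median or average for the same group.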

Median engagement by page


In [18]:
page_grp[ENGAGEMENT_COLS].median().round()


Out[18]:
                              share_count  reaction_count  comment_count
Category   Page
left       Addicting Info             563            2230            271
           Occupy Democrats         10931           22360           1205
           The Other 98%             3942           12083            521
mainstream ABC News Politics           13              80             28
           CNN Politics                50             340            194
           Politico                    33             314             95
right      Eagle Rising                92             186             22
           Freedom Daily              947            2245            214
           Right Wing News            266             913             91

Average engagement by page


In [19]:
page_grp[ENGAGEMENT_COLS].mean().round()


Out[19]:
                              share_count  reaction_count  comment_count
Category   Page
left       Addicting Info            1270            3120            392
           Occupy Democrats         29205           34669           2858
           The Other 98%            18007           20971            915
mainstream ABC News Politics           44             177             71
           CNN Politics               183             678            322
           Politico                   182             900            170
right      Eagle Rising               616             520             79
           Freedom Daily             2474            3685            516
           Right Wing News           1398            2454            360
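
The averages above are generally well above the medians in Out[18] (for Occupy Democrats, 29205 vs. 10931 shares), which is the pattern a small number of very widely shared posts would produce. Below is a minimal sketch on hypothetical toy data that puts both statistics side by side; the frame `toy` and the `mean_to_median` column are illustrative additions, not something the notebook computes.

```python
import pandas as pd

# Hypothetical toy data: page "A" has one outlier post, page "B" has none.
toy = pd.DataFrame({
    "Page": ["A"] * 5 + ["B"] * 5,
    "share_count": [10, 20, 30, 40, 5000, 10, 20, 30, 40, 50],
})

# Compute median and mean per page in a single pass.
summary = toy.groupby("Page")["share_count"].agg(["median", "mean"])

# A ratio far above 1 suggests the mean is being pulled up by a few large values.
summary["mean_to_median"] = summary["mean"] / summary["median"]
print(summary)
```

Both toy pages have the same median, so the difference in their means comes entirely from the single outlier on page "A".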

Engagement by truthfulness


In [20]:
grp = posts.groupby([ "Category", "Page", "Rating" ])

Counts:


In [21]:
grp[ENGAGEMENT_COLS].size().unstack().fillna(0)


Out[21]:
Rating                        mixture of true and false  mostly false  mostly true  no factual content
Category   Page
left       Addicting Info                            25             8           96                  11
           Occupy Democrats                          33             9          102                  65
           The Other 98%                             10             5           67                  40
mainstream ABC News Politics                          2             0          172                  26
           CNN Politics                               4             0          385                  20
           Politico                                   2             0          528                   6
right      Eagle Rising                              54            30          121                  81
           Freedom Daily                             26            26           56                   4
           Right Wing News                           89            26          142                  11

Medians:


In [22]:
grp[ENGAGEMENT_COLS].median().round()


Out[22]:
                                                        share_count  reaction_count  comment_count
Category   Page              Rating
left       Addicting Info    mixture of true and false         1132            3087            402
                             mostly false                       285            1910            394
                             mostly true                        523            1966            235
                             no factual content                 399            2351            153
           Occupy Democrats  mixture of true and false        10654           17085           1461
                             mostly false                      5541           17525            638
                             mostly true                       7755           15951           1090
                             no factual content               18345           37326           1396
           The Other 98%     mixture of true and false         4749            9040            742
                             mostly false                     11571           19682            930
                             mostly true                       2896            7082            372
                             no factual content               10337           25951            638
mainstream ABC News Politics mixture of true and false           76             479             59
                             mostly true                         12              78             28
                             no factual content                  38              78             23
           CNN Politics      mixture of true and false          270            1374            315
                             mostly true                         48             343            194
                             no factual content                  64             245            184
           Politico          mixture of true and false         7325           20344           1510
                             mostly true                         33             310             96
                             no factual content                  48             309             48
right      Eagle Rising      mixture of true and false          110             222             32
                             mostly false                       534             551             55
                             mostly true                         46             111             17
                             no factual content                 250             300             17
           Freedom Daily     mixture of true and false          342             922            140
                             mostly false                      1623            2455            276
                             mostly true                        908            2476            191
                             no factual content                2025            4264            262
           Right Wing News   mixture of true and false          457            1267            131
                             mostly false                       790            1772            201
                             mostly true                         91             400             50
                             no factual content                4933            8539            201

Averages:


In [23]:
grp[ENGAGEMENT_COLS].mean().round()


Out[23]:
                                                        share_count  reaction_count  comment_count
Category   Page              Rating
left       Addicting Info    mixture of true and false         1516            3451            460
                             mostly false                      1891            4004            704
                             mostly true                       1005            2800            368
                             no factual content                2766            4520            227
           Occupy Democrats  mixture of true and false        26036           28933           2924
                             mostly false                     10603           22854           1426
                             mostly true                      16215           25000           1624
                             no factual content               55171           54388           4959
           The Other 98%     mixture of true and false         9544           11270            942
                             mostly false                     13738           24557           1051
                             mostly true                      10765           14053            793
                             no factual content               32588           34363           1092
mainstream ABC News Politics mixture of true and false           76             479             59
                             mostly true                         42             176             67
                             no factual content                  63             159            103
           CNN Politics      mixture of true and false          239            1327            262
                             mostly true                        181             672            321
                             no factual content                 215             656            346
           Politico          mixture of true and false         7325           20344           1510
                             mostly true                        156             832            166
                             no factual content                  83             420             54
right      Eagle Rising      mixture of true and false         1182             707            149
                             mostly false                      1300             926             89
                             mostly true                        263             300             67
                             no factual content                 506             573             49
           Freedom Daily     mixture of true and false         1377            2489            553
                             mostly false                      3390            4651            576
                             mostly true                       2543            3692            480
                             no factual content                2712            5089            390
           Right Wing News   mixture of true and false         1472            2465            448
                             mostly false                      2432            3999            434
                             mostly true                        668            1602            291
                             no factual content                7713            9705            360

Engagement by post type

Medians:


In [24]:
type_grp[ENGAGEMENT_COLS].median().round()


Out[24]:
                                        share_count  reaction_count  comment_count
Category   Page              Post Type
left       Addicting Info    link               563            2195            262
                             photo             3532            5814            534
                             video              275            1302            117
           Occupy Democrats  link              5130           10862           1020
                             photo            18294           34730           1290
                             video            26648           30011           2287
           The Other 98%     link              3391            8836            529
                             photo            12441           26990            604
                             video             1598            4751            364
mainstream ABC News Politics link                 9              68             27
                             photo                6              30             14
                             text                 2               5              2
                             video               37             138             32
           CNN Politics      link                36             284            177
                             photo               51             118             88
                             text                 4              95            103
                             video              125             691            456
           Politico          link                32             290             92
                             photo                8              62             27
                             text                 2              58             26
                             video               48             428            132
right      Eagle Rising      link                70             154             20
                             photo              569             750             24
                             video               38              73             14
           Freedom Daily     link               964            2263            224
                             text                 3              47              7
           Right Wing News   link               246             824             87
                             photo             8474           12115            180

Averages:


In [25]:
type_grp[ENGAGEMENT_COLS].mean().round()


Out[25]:
                                        share_count  reaction_count  comment_count
Category   Page              Post Type
left       Addicting Info    link              1168            3028            391
                             photo             6021            7128            578
                             video              275            1302            117
           Occupy Democrats  link              7463           16094           1272
                             photo            25268           43735           1604
                             video            97767           62746           9531
           The Other 98%     link             10341           13847            882
                             photo            28875           38285            924
                             video            16712            9021            959
mainstream ABC News Politics link                23             124             40
                             photo                9              51             21
                             text                 2               5              2
                             video               90             273            125
           CNN Politics      link               176             604            256
                             photo               77             212            112
                             text                 4              95            103
                             video              237            1088            663
           Politico          link               181             879            168
                             photo               77             299             31
                             text                 2              58             26
                             video              201            1132            203
right      Eagle Rising      link               454             411             81
                             photo             1461            1151             80
                             video               38              94             31
           Freedom Daily     link              2497            3718            521
                             text                 3              47              7
           Right Wing News   link               999            2048            358
                             photo            11664           12928            408

Shares by factual vs. no factual content


In [26]:
# Group by page and by a boolean key: True where the post was rated "no factual content".
grp = posts.groupby([ "Category", "Page", posts["Rating"] == "no factual content" ])
pd.DataFrame({
    "median": grp["share_count"].median(),
    "average": grp["share_count"].mean()
}).round()\
    .unstack().stack(level=0).rename(columns={True: "no factual content", False: "factual content"})


Out[26]:
Rating                                factual content  no factual content
Category   Page
left       Addicting Info    average             1150                2766
                             median               578                 399
           Occupy Democrats  average            18155               55171
                             median              7997               18345
           The Other 98%     average            10820               32588
                             median              3484               10337
mainstream ABC News Politics average               43                  63
                             median                13                  38
           CNN Politics      average              182                 215
                             median                48                  64
           Politico          average              183                  83
                             median                33                  48
right      Eagle Rising      average              656                 506
                             median                75                 250
           Freedom Daily     average             2466                2712
                             median               947                2025
           Right Wing News   average             1127                7713
                             median               246                4933
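
The grouping in In [26] passes a boolean Series (`posts["Rating"] == "no factual content"`) as a groupby key alongside column names. Here is a minimal sketch of the same pattern on hypothetical toy data; the frame `toy`, its values, and the name `is_nfc` are made up for illustration.

```python
import pandas as pd

# Hypothetical toy data, not drawn from the dataset.
toy = pd.DataFrame({
    "Page": ["A", "A", "A", "B", "B", "B"],
    "Rating": [
        "mostly true", "no factual content", "mostly false",
        "no factual content", "mostly true", "no factual content",
    ],
    "share_count": [100, 900, 300, 50, 10, 70],
})

# A boolean Series aligned with the frame's index can serve as a grouping key.
is_nfc = toy["Rating"] == "no factual content"

# Group by page and by the boolean key, then pivot the key into columns.
medians = toy.groupby(["Page", is_nfc])["share_count"].median().unstack()

# Replace the True/False column labels with readable names.
medians = medians.rename(columns={True: "no factual content", False: "factual content"})
print(medians)
```

Grouping on the mask avoids adding a temporary column to the frame, at the cost of a slightly less self-describing key; giving the Series a name with `.rename(...)` is one way to label the resulting index level.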

Shares for mostly-true vs. others for partisan pages


In [27]:
# Group by page and by a boolean key: True where the post was rated "mostly true".
grp = posts.groupby([ "Category", "Page", posts["Rating"] == "mostly true" ])
pd.DataFrame({
    "median": grp["share_count"].median(),
    "average": grp["share_count"].mean()
}).round()\
    .unstack().stack(level=0).rename(columns={True: "mostly true", False: "everything else"})\
    [[ "mostly true", "everything else" ]].loc[["left", "right"]]


Out[27]:
Rating                                mostly true  everything else
Category   Page
left       Addicting Info    average         1005             1894
                             median           523              882
           Occupy Democrats  average        16215            41812
                             median          7755            13330
           The Other 98%     average        10765            26432
                             median          2896             8236
right      Eagle Rising      average          263              886
                             median            46              201
           Freedom Daily     average         2543             2407
                             median           908             1142
           Right Wing News   average          668             2215
                             median            91              568