Find articles with PubPeer comments in your Mendeley database

PubPeer is a tool that allows researchers to comment on published articles. It's not about denunciation! At least I don't see it that way: it's about post-publication review, and it should be welcomed by all researchers.

I think post-publication review of articles is a healthy exercise and should be encouraged. According to this blog post from PubPeer, some articles definitely need to be reviewed after publication (see the blog post for more details; some cases are really surprising).

Moreover, PubPeer is not only about pointing out unintentional mistakes or intentional bad behaviour; it can also be a starting point for a fruitful discussion between authors and readers.

Now let's get to the fun part: how can I find the articles in my Mendeley database that have comments on PubPeer?

Note 1: I wrote this as a Python script, but it would be easy to build a small web app and make the whole process much easier for everyone.

Note 2: I just discovered https://peerlibrary.org/, which seems to do a similar job to PubPeer, and they also have an API!

Any feedback is welcome!

@hadim_


In [1]:
%matplotlib inline

import requests
from lxml import html

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

import tqdm

from mendeley import Mendeley
from requests_oauthlib import OAuth2Session
from mendeley.session import MendeleySession

Get Mendeley API auth parameters

First, you will need to generate a client ID and client secret here: http://dev.mendeley.com/myapps.html.

Then put your client ID and client secret here:


In [2]:
client_id = "1988"
client_secret = "CXhCJQKZ8HUrtFtg"

Note that these are my personal credentials; by the time you read this, they will have been revoked.
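
By the way, instead of hard-coding credentials in the notebook, you could read them from environment variables. A minimal sketch (the MENDELEY_CLIENT_ID and MENDELEY_CLIENT_SECRET names are my own convention, not something the API requires):

import os

# Export these in your shell before starting the notebook (hypothetical names).
client_id = os.environ["MENDELEY_CLIENT_ID"]
client_secret = os.environ["MENDELEY_CLIENT_SECRET"]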

Now let's start the authentication process with the Mendeley API.


In [3]:
redirect_uri = 'https://localhost'

authorization_base_url = "https://api.mendeley.com/oauth/authorize"
token_url = "https://api.mendeley.com/oauth/token"

# Build an OAuth2 session and generate an authorization URL to visit in the browser.
oauth = OAuth2Session(client_id, redirect_uri=redirect_uri, scope=['all'])

authorization_url, state = oauth.authorization_url(authorization_base_url,
                                                   access_type="offline",
                                                   approval_prompt="force")

print('Please go to {} and authorize access.'.format(authorization_url))


Please go to https://api.mendeley.com/oauth/authorize?response_type=code&client_id=1988&redirect_uri=https%3A%2F%2Flocalhost&scope=all&state=3sX7ggAfEip4OxnGff9pOfYqNb1BTM&access_type=offline&approval_prompt=force and authorize access.

Now paste the URL you were redirected to (the callback URL) here:


In [5]:
redirect_response = "https://localhost/?code=6fBBP91iqtnu-xPdTlsqCDVroYA&state=3sX7ggAfEip4OxnGff9pOfYqNb1BTM"

Authenticate


In [6]:
token = oauth.fetch_token(token_url, authorization_response=redirect_response, client_secret=client_secret)

mendeley = Mendeley(client_id, client_secret, redirect_uri=redirect_uri)
session = MendeleySession(mendeley, token=token)
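
If you don't want to redo the browser round-trip every time you run the notebook, you can store the token and reload it on the next run. A minimal sketch, with a mendeley_token.json filename of my own choosing:

import json

# Save the OAuth token dictionary for reuse across sessions (hypothetical filename).
with open("mendeley_token.json", "w") as f:
    json.dump(token, f)

# On a later run, reload it instead of re-authorizing:
# with open("mendeley_token.json") as f:
#     token = json.load(f)
# session = MendeleySession(mendeley, token=token)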

Iterate over all your articles and record them in a Pandas DataFrame


In [19]:
articles = []

# Fetch the library once to get the total count, then iterate with a progress bar.
all_documents = session.documents.list()
for doc in tqdm.tqdm(session.documents.iter(), total=all_documents.count):

    # Only keep documents with identifiers, since we match on DOI/PMID later.
    if doc.identifiers:
        d = {}
        d['title'] = doc.title
        d['year'] = doc.year
        d['source'] = doc.source
        d['doi'] = doc.identifiers.get('doi')
        d['pmid'] = doc.identifiers.get('pmid')

        if doc.authors:
            authors = ["{}, {}".format(author.first_name, author.last_name) for author in doc.authors]
            d['authors'] = " - ".join(authors)

        articles.append(d)

articles = pd.DataFrame(articles)

print("You have {} articles with correct identifiers (DOI or PMID)".format(articles.shape[0]))


You have 181 articles with correct identifiers (DOI or PMID)
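
Before going further, it's worth having a quick look at the table and maybe keeping a copy on disk. A small optional step (the CSV filename is my own choice):

# Show the first few collected articles as a sanity check.
print(articles.head())

# Optionally save the table for later reuse.
articles.to_csv("mendeley_articles.csv", index=False)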

Let's find matches with PubPeer


In [20]:
#articles.loc[0, 'doi'] = "10.5772/22496"  # uncomment to test the matching with a specific DOI
articles['comments'] = 0
articles['comments_link'] = None

# The PubPeer dump endpoint is paginated; there were 178 pages at the time of writing.
url = "http://api.pubpeer.com/v1/publications/dump/{page}?devkey=9bb8f08ebef172ec518f5a4504344ceb"

old_n = -1
for i in tqdm.tqdm(range(1, 179)):

    r = requests.get(url.format(page=i))

    all_pub = r.json()['publications']
    if all_pub:
        for pp in all_pub:
            if 'doi' in pp:
                # Match PubPeer publications against the library by DOI.
                mask = articles['doi'] == pp['doi']
                articles.loc[mask, 'comments'] += 1
                articles.loc[mask, 'comments_link'] = pp['link']

                # Report each time a new commented article shows up.
                n_comm = (articles['comments'] >= 1).sum()
                if n_comm > 0 and n_comm > old_n:
                    print("Commented articles = {}".format(n_comm))
                    old_n = n_comm


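
A caveat about the matching above: it's an exact string comparison, and DOIs are case-insensitive, so different capitalization between Mendeley and PubPeer would make a match slip through. Assuming both sides store plain DOI strings, lowercasing them first is safer; inside the loop, the mask line would become:

# Normalize case before comparing (my own addition; assumes both sides are plain DOI strings).
mask = articles['doi'].str.lower() == pp['doi'].lower()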

In [21]:
articles[articles['comments'] >= 1]


Out[21]:
authors | doi | pmid | source | title | year | comments | comments_link
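
From there it's easy to print each commented article with its PubPeer discussion link, for example:

# List every matched article together with its PubPeer link.
commented = articles[articles['comments'] >= 1]
for _, row in commented.iterrows():
    print("{} -> {}".format(row['title'], row['comments_link']))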