Find articles with PubPeer comments in your Mendeley database

PubPeer is a tool that allows researchers to comment on published articles. It's not about denunciation! At least I don't see it that way: it's about post-publication review, and it should be welcomed by all researchers.

I think post-publication review of articles is a healthy exercise and should be encouraged. According to this blog post from PubPeer, some articles definitely need to be reviewed after publication (see the blog post for more details; some cases are really surprising).

Moreover, PubPeer is not only about pointing out unintentional mistakes or intentional bad behaviour; it can also be a starting point for a fruitful discussion between authors and readers.

Now let's get to the fun part: how can I find the articles in my Mendeley database that have comments on PubPeer?

Note 1: I wrote this as a Python script, but it would be easy to build a small web app and make the whole process much easier for everyone.

Note 2: I just discovered https://peerlibrary.org/, which seems to do a similar job to PubPeer, and they also have an API!

Any feedback is welcome!

@hadim_


In [1]:
%matplotlib inline

import requests
from lxml import html

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

import tqdm

from mendeley import Mendeley
from requests_oauthlib import OAuth2Session
from mendeley.session import MendeleySession

Get Mendeley API auth parameters

First, you will need to generate a client ID and client secret here: http://dev.mendeley.com/myapps.html.

Then put your client ID and client secret here:


In [2]:
client_id = "1988"
client_secret = "CXhCJQKZ8HUrtFtg"

Note that these are my personal credentials; by the time you read this, they will have been revoked.
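
By the way, instead of hard-coding credentials in the notebook, you could read them from environment variables. A minimal sketch (the MENDELEY_CLIENT_ID and MENDELEY_CLIENT_SECRET names are my own convention, not something the API requires):

import os

# Export these in your shell before starting the notebook (hypothetical names).
client_id = os.environ["MENDELEY_CLIENT_ID"]
client_secret = os.environ["MENDELEY_CLIENT_SECRET"]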

Now let's start the authentication process with the Mendeley API.


In [3]:
redirect_uri = 'https://localhost'

authorization_base_url = "https://api.mendeley.com/oauth/authorize"
token_url = "https://api.mendeley.com/oauth/token"

# Build an OAuth2 session and generate an authorization URL to visit in the browser.
oauth = OAuth2Session(client_id, redirect_uri=redirect_uri, scope=['all'])

authorization_url, state = oauth.authorization_url(authorization_base_url,
                                                   access_type="offline",
                                                   approval_prompt="force")

print('Please go to {} and authorize access.'.format(authorization_url))


Please go to https://api.mendeley.com/oauth/authorize?response_type=code&client_id=1988&redirect_uri=https%3A%2F%2Flocalhost&scope=all&state=3sX7ggAfEip4OxnGff9pOfYqNb1BTM&access_type=offline&approval_prompt=force and authorize access.

Now paste the URL you were redirected to (the callback URL) here:


In [5]:
redirect_response = "https://localhost/?code=6fBBP91iqtnu-xPdTlsqCDVroYA&state=3sX7ggAfEip4OxnGff9pOfYqNb1BTM"

Authenticate


In [6]:
token = oauth.fetch_token(token_url, authorization_response=redirect_response, client_secret=client_secret)

mendeley = Mendeley(client_id, client_secret, redirect_uri=redirect_uri)
session = MendeleySession(mendeley, token=token)
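
If you don't want to redo the browser round-trip every time you run the notebook, you can store the token and reload it on the next run. A minimal sketch, with a mendeley_token.json filename of my own choosing:

import json

# Save the OAuth token dictionary for reuse across sessions (hypothetical filename).
with open("mendeley_token.json", "w") as f:
    json.dump(token, f)

# On a later run, reload it instead of re-authorizing:
# with open("mendeley_token.json") as f:
#     token = json.load(f)
# session = MendeleySession(mendeley, token=token)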

Iterate over all your articles and record them in a Pandas DataFrame


In [19]:
articles = []

# Fetch the library once to get the total count, then iterate with a progress bar.
all_documents = session.documents.list()
for doc in tqdm.tqdm(session.documents.iter(), total=all_documents.count):

    # Only keep documents with identifiers, since we match on DOI/PMID later.
    if doc.identifiers:
        d = {}
        d['title'] = doc.title
        d['year'] = doc.year
        d['source'] = doc.source
        d['doi'] = doc.identifiers.get('doi')
        d['pmid'] = doc.identifiers.get('pmid')

        if doc.authors:
            authors = ["{}, {}".format(author.first_name, author.last_name) for author in doc.authors]
            d['authors'] = " - ".join(authors)

        articles.append(d)

articles = pd.DataFrame(articles)

print("You have {} articles with correct identifiers (DOI or PMID)".format(articles.shape[0]))


You have 181 articles with correct identifiers (DOI or PMID)
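
Before going further, it's worth having a quick look at the table and maybe keeping a copy on disk. A small optional step (the CSV filename is my own choice):

# Show the first few collected articles as a sanity check.
print(articles.head())

# Optionally save the table for later reuse.
articles.to_csv("mendeley_articles.csv", index=False)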

Let's find matches with PubPeer


In [20]:
#articles.loc[0, 'doi'] = "10.5772/22496"  # uncomment to test the matching with a specific DOI
articles['comments'] = 0
articles['comments_link'] = None

# The PubPeer dump endpoint is paginated; there were 178 pages at the time of writing.
url = "http://api.pubpeer.com/v1/publications/dump/{page}?devkey=9bb8f08ebef172ec518f5a4504344ceb"

old_n = -1
for i in tqdm.tqdm(range(1, 179)):

    r = requests.get(url.format(page=i))

    all_pub = r.json()['publications']
    if all_pub:
        for pp in all_pub:
            if 'doi' in pp:
                # Match PubPeer publications against the library by DOI.
                mask = articles['doi'] == pp['doi']
                articles.loc[mask, 'comments'] += 1
                articles.loc[mask, 'comments_link'] = pp['link']

                # Report each time a new commented article shows up.
                n_comm = (articles['comments'] >= 1).sum()
                if n_comm > 0 and n_comm > old_n:
                    print("Commented articles = {}".format(n_comm))
                    old_n = n_comm


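
A caveat about the matching above: it's an exact string comparison, and DOIs are case-insensitive, so different capitalization between Mendeley and PubPeer would make a match slip through. Assuming both sides store plain DOI strings, lowercasing them first is safer; inside the loop, the mask line would become:

# Normalize case before comparing (my own addition; assumes both sides are plain DOI strings).
mask = articles['doi'].str.lower() == pp['doi'].lower()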

In [21]:
articles[articles['comments'] >= 1]


Out[21]:
authors | doi | pmid | source | title | year | comments | comments_link
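
From there it's easy to print each commented article with its PubPeer discussion link, for example:

# List every matched article together with its PubPeer link.
commented = articles[articles['comments'] >= 1]
for _, row in commented.iterrows():
    print("{} -> {}".format(row['title'], row['comments_link']))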