In [2]:
# Generate a list of PubMed IDs for all studies included in the CDSR
# (Cochrane Database of Systematic Reviews) that have associated
# quality scores AND direct quotes

import biviewer
import re
import progressbar

In [3]:
bv = biviewer.BiViewer()

In [4]:
len(bv)


Out[4]:
52454

In [5]:
# Search all the Cochrane data for quality assessments in which the
# review authors give direct quotes justifying their decision, and print them.
# The PubMed ID and domain name of every assessment are recorded along the way.

p = progressbar.ProgressBar(len(bv), timer=True)

pmids = []
domain_names = []

for i, study in enumerate(bv):
    p.tap()
    quality_data = study.cochrane["QUALITY"]
    for domain in quality_data:
        pmids.append(study.pubmed["pmid"])
        domain_names.append(domain['DOMAIN'])
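        # descriptions starting with "Quote:" and a quoted passage whose
        # closing mark is followed by another quote mark
        if re.match(r"Quote\:\s*[\'\"](.*?)[\'\"]\s*[\'\"]", domain['DESCRIPTION']):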
            print domain['DESCRIPTION']


[                    ] 0% - Calculating time
Quote: "Randomisation was conducted by an investigator not involved in assessment or intervention …" "Once baseline assessments were completed by the research assistant (RA), participants were then allocated in order of completion from the generated lists by the blinded investigator."
Quote: "… randomized into intervention (n = 1718) and control (n = 1714) groups by an based on simple randomization. " "The subjects were informed by letter to which group they were randomized. The letter contained information concerning the trial and the prescription for the intervention." Unclear whether person sending letter was blind to allocation.
Quote: "Participants in both group were provided with a log for recording falls and details surrounding them." "All participants received monthly follow-up phone calls from the blinded outcome assessor."
Quote: "Computer-generated block randomization stratified for age performed by an independent statistician." "The allocation sequence and group assignment were performed by the Institute of Biometry and Epidemiology. Participants were enrolled by the Institute of Medical Physics."
Quote: "The study was blinded for the outcome assessors and participants ..." "To blind the participants, the control group performed a program that focused on well-being and was designed not to cause physical adaptations"  "The effectiveness of the blinding in the control group was proven in structured interviews conducted by the primary investigators at the end of the 18 months"
Quote: "Ascertainment of falls ... documented on monthly calendars that were returned in prepaid preaddressed envelopes at the end of each month." "A research assistant who was not blinded to treatment group but was unaware of the study hypotheses made three attempts by telephone to contact participants at the end of each month. The purpose of each phone call was to inquire about falls (both groups) ... for all participants regardless of whether the calendar was returned."
Quote: "When a fall or fracture was indicated, a standardized questionnaire recording details was administered by telephone." "The participants and study staff were blinded to intervention group"
Quote: "Nonresponders were contacted over the telephone so that the fall history for the missing calendar weeks and underlying reasons for their lack of response could be assessed" "The assessors (M.F.R. and M.C.F.) were blinded."
Quote: "Participants in all three groups were assessed in their homes for outcomes (see below) at baseline, 2 months, and 5 months by an RA blinded to their group allocation." "Number of falls was recorded using the Falls Record Checklist (Huang & Acton 2004), which has a calendar for participants to circle dates when a fall occurred." Unclear whether falls were recorded concurrently or retrospectively at 2-month and 5-month assessments. No regular telephone follow-up described.
Quote: "During the 12-week period, participants sent in postcards weekly with information pertaining to insole comfort, hours of wear, and falls." "There was 100% compliance in completing the weekly reports."
Quote: "Falls, fall-related injuries, and hospital readmissions were assessed by monthly telephone calls and a patient diary." "All admission records were reviewed by 3 blinded coinvestigators (H.A.B.-F., A.E., and N.J.M.) to determine the main cause of readmission." Hip fracture data used in this review and these participants would have been hospitalised and therefore confirmed by blinded coinvestigators.
Quote: "Occurrence of falls and mortality were assessed by self report of patients and family caregivers." "All subjects were assessed at 1, 3, 6, 12, 18, and 24 months after discharge". No mention of concurrent collection of data and recall appears to be over periods longer than 1 month.
Quote: "For 1 year, the participants recorded each week whether they had fallen." 
 "Once per 3 months the participants return a calendar sheet by mail. When no sheet is received, or when the sheet is completed incorrectly, we inquire by telephone whether, and if yes, when the participant has fallen in the past 3 months."
Quote: "Calendars that were not returned within 2 weeks of the end of the month prompted a telephone contact from an independent, blinded interviewer to ascertain whether the participant had fallen." "All reported falls were followed up with a blinded, structured telephone interview to investigate the circumstances and consequences." "Staff of the York Trials Unit inputted questionnaire data which was checked twice for accuracy."
[                    ] 0% - Calculating time
Quote: "double-blind"; ''the local project director, field workers, obstetrics, laboratory technologists and pregnant women were all blind to the group assignment"; and "placebo tablets were similar in appearance, smell and taste to the mebendazole tablets". 
 Comment: blinding of participants, study personnel and outcome assessors probably done.
[                    ] 1% - Around 5 minutes remaining
Quote: "We used telephone allocation ... During the trial follow-up, however, we were informed that on at least one occasion the allocation concealment was not maintained by the telephone service because both the leg ulcer clinic and the telephone service were extremely busy; therefore the allocations for two patients were supplied "in order to save time"". 
 Comment: It is unclear how often this occurred.
[                    ] 4% - Around 3 minutes remaining
Quote: "Subjects were randomized via computer-generated lists," "Within each institution … in blocks of eight."
Quote: "The subjects … were randomly assigned to an exercise group or a control group ... Since the study was carried out in two separate places, the randomization was done in blocks." "Randomisation was carried out by drawing lots."
Quote: "We included all patients in study wards during each three month study period." "Randomisation of each matched pair of wards was usually done during the week before the study started for that pair of wards. Randomisation involved sealed, opaque envelopes and was supervised by a study investigator ... unaware of ward characteristics."
Quote: "All PCO demographic data were forwarded to the Department of Health Science at the University of York for randomisation and allocation." "The allocation was undertaken by an independent researcher."
Quote: "The assessment staff was blinded to participant randomization assignment. Participants were... reminded not to discuss their randomization assignment with assessment staff." 
 "An independent researcher was in charge of auditing all nursing and medical records to record the number of falls in each participant over the study period"
[=                   ] 5% - Around 3 minutes remaining
Quote: "The solutions were made visibly identical by adding methylene blue to the TLE solution so that it matched the intrinsic blue color of TAC" 
 "The vials, with the specific contents unknown to the emergency physician, were forwarded to the ED as requested" 
 Comment: Probably done
Quote: "Using a prospective, randomized, double-blind design..." 
 "Anesthetic agents were sealed in envelopes labelled with a study identification number and stored in a locked cabinet in the emergency department". 
 Comment: Probably done, assuming topical solutions visually identical.
Quote: "TAC and cocaine solutions were randomly distributed with only a number from 1-150 appearing on each vial" 
 "The investigator was blinded as to the identity of the agent. The code was kept in the pharmacy and was available to the investigators only in case of emergency" 
 Comment: Unlear. There may be allocation concealment if there was a pharmacy controlled randomization process. However, this is not explicitly reported, so we decided upon unclear
Quote: "Both TAC and LET solutions are aqueous and have the same blue tint and viscosity" 
 "labelled to ensure appropriate blindness of suture personnel" 
 "A double blind topical application using 3ml of the test solutions was performed in study entry" 
 Comment: Probably done
Quote: "Using a prospective, randomized, double-blind design..." 
 "Anesthetics were sealed in envelopes labelled with a study identification number and stored in a locked cabinet in the ED" 
 Comment: Probably done, assuming solutions visually identical
Quote: "Randomization of anaesthetic treatment was determined by the final digit of the patients medical record number" 
 "unblinded study" 
 Comment: Probably not done
Quote: "hospital pharmacy personnel to label standard amber vials from 1 to 200" 
 "it was required that the study medication be applied by a nurse not involved in the suturing" 
 Comment: Probably done
Quote: "consecutively assigned to receive either conventional intradermal lignocaine or topical AC preparation" 
 "groups could not be blinded" 
 Comment: Probably not done
Quote: "The solutions were prepared by a pharmacist and were available in coded sterile, capped 3ml syringes" 
 "Both TAC and LAT were clear solutions..." 
 "Patients and physicians performing wound closure were blinded" 
 Comment: Probably done.
Quote: "The solutions were prepared by a pharmacist and were available in coded sterile, capped 3ml syringes with a cotton ball for application"  
 "Both TAC and LAT were clear solutions mixed from powders" 
 "Patients and physicians performing wound closure were blinded" 
 Comment: Probably done
[=                   ] 8% - Around 2 minutes remaining
Quote: ''Randomisation to one of the four arms (1:1:1:1) was stratified by age (<32 or ≥ 32 years) and by centre using a fixed block size of four and a minimization algorithm combined with randomly permutated blocks.''
Quote: ''Randomisation by using a central remote allocation procedure''
Quote: ''Randomisation to one of the two arms (1:1 ratio) was done per centre and stratified by age (<32 or ≥ 32 years) by using randomly permutated blocks with a 'undisclosed' fixed block size of four.''
Quote: ''central remote allocation''
Quote: ''Randomisation to one of the two treatment groups in a 2:1 ratio (investigational:reference group) was performed at each centre and stratified by age (<32 or ≥ 32 years) and planned fertilisation procedure (IVF or ICSI) by central remote allocation using randomly permutated blocks with an 'undisclosed' fixed block size of three.''
Quote: ''Randomisation by central remote allocation''
[==                  ] 13% - Around 2 minutes remaining
Quote:  "using a computer software routine supplied by the sponsor ..." " Within strata subjects were randomly assigned into two groups." 
 
 Comment:  Probably done.
[===                 ] 18% - Around 2 minutes remaining
Quote: "Subjects were unaware of the group to which they had been assigned" "An independent evaluator, who was unaware of the group to which the subject had been assigned, recorded subject performance on various measures before the assigned treatment was initiated, after 4 weeks of treatment, and after 8 weeks of treatment" Comment: blinding of key study personnel and participants was recorded by the study authors, who stated it was double blind Although the authors stated that participants were unaware of the group to which they were allocated (and later states the study was double blind), it could have been possible for participants to determine which group they were in due to the nature of the interventions Overall the judgement is that key assessment personnel were blind but not necessarily the participants
[===                 ] 19% - Around 2 minutes remaining     
---------------------------------------------------------------------------
KeyboardInterrupt                         Traceback (most recent call last)
<ipython-input-5-10cee94c9752> in <module>()
      8 domain_names = []
      9 
---> 10 for i, study in enumerate(bv):
     11     p.tap()
     12     quality_data = study.cochrane["QUALITY"]

/Users/iain/Code/py/cochrane-nlp/cochrane-nlp/biviewer.pyc in __getitem__(self, key)
     97             else:
     98                 # else load from file, save to end of cache, and delete oldest cached review
---> 99                 cr = RM5(COCHRANE_REVIEWS_PATH + study['cdsr_filename']).refs(full_parse=True, return_dict=True)
    100                 self.cdsr_cache_data[study['cdsr_filename']] = cr # save to cache
    101                 self.cdsr_cache_index.append(study['cdsr_filename']) # and add to index deque

/Users/iain/Code/py/cochrane-nlp/cochrane-nlp/rm5reader.pyc in __init__(self, filename)
     13 
     14     def __init__(self, filename):
---> 15         XMLReader.__init__(self, filename)
     16         self.section_map["title"] = "COVER_SHEET/TITLE"
     17 

/Users/iain/Code/py/cochrane-nlp/cochrane-nlp/xmlbase.pyc in __init__(self, filename)
     16     def __init__(self, filename=None):
     17         self.filename = filename
---> 18         self.data = self.load(filename)
     19         self.section_map = {}
     20 

/Users/iain/Code/py/cochrane-nlp/cochrane-nlp/xmlbase.pyc in load(self, filename)
     20 
     21     def load(self, filename):
---> 22         return ET.parse(filename)
     23 
     24     def _ET2unicode(self, ET_instance):

<string> in parse(source, parser)

<string> in parse(self, source, parser)

KeyboardInterrupt: 
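
The quote-matching step can be tried on its own without loading the Cochrane data. The following is a minimal sketch, not the notebook's own code: `QUOTE_RE` and `records` are hypothetical names, the PMIDs are made up, and the first description is copied from the output above while the other two are shortened illustrations. It assumes that only the PubMed IDs of matching descriptions are wanted.

```python
import re

# Same pattern as in the loop above, written as a raw triple-quoted string.
# It needs "Quote:", an opening quote mark, and then a closing quote mark
# that is followed (ignoring whitespace) by another quote mark.
QUOTE_RE = re.compile(r"""Quote:\s*['"](.*?)['"]\s*['"]""")

# Hypothetical (pmid, description) pairs; the PMIDs are invented.
records = [
    ("10000001",
     'Quote: "Participants in both group were provided with a log for '
     'recording falls and details surrounding them." "All participants '
     'received monthly follow-up phone calls from the blinded outcome assessor."'),
    ("10000002", 'Quote: "double-blind"'),      # a single quoted passage
    ("10000003", "Comment: Probably done"),     # no quote at all
]

# Keep only the PMIDs whose description matches the pattern.
matched_pmids = [pmid for pmid, description in records
                 if QUOTE_RE.match(description)]
print(matched_pmids)

# The first captured group is the text of the first quoted passage.
first_quote = QUOTE_RE.match(records[0][1]).group(1)
print(first_quote)
```

Because `re.match` anchors at the start of the string, descriptions that begin with anything other than `Quote:` are skipped, and a description with a single quoted passage does not match either.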

In [5]:
# load the domain names already listed in the file, skipping blank lines
done = set()
with open("data/domain_names.txt", "rb") as f:
    for l in f:
        l = l.strip()
        if l:
            done.add(l)

In [6]:
# keep only the domain names that are not already in the file
domain_names = list(set(domain_names) - set(done))

In [7]:
with open("domain_names2.txt", "wb") as f:
    f.write('\n'.join(domain_names))
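
The last three cells amount to a small "subtract what is already on file" step. Below is a self-contained sketch of that idea, written against a temporary directory so it can be run anywhere. The helper names `load_done` and `write_remaining` are hypothetical, the domain names are made-up examples, and sorting the output is an added assumption to make the result deterministic.

```python
import os
import tempfile


def load_done(path):
    """Return the set of non-empty, stripped lines in `path`."""
    if not os.path.exists(path):
        return set()
    with open(path) as f:
        return set(line.strip() for line in f if line.strip())


def write_remaining(names, done, path):
    """Write the names not in `done` to `path`, one per line, sorted."""
    remaining = sorted(set(names) - done)
    with open(path, "w") as f:
        f.write("\n".join(remaining))
    return remaining


# Temporary stand-ins for data/domain_names.txt and domain_names2.txt.
workdir = tempfile.mkdtemp()
done_path = os.path.join(workdir, "domain_names.txt")
todo_path = os.path.join(workdir, "domain_names2.txt")

# An existing file with a blank line in the middle (made-up contents).
with open(done_path, "w") as f:
    f.write("Allocation concealment?\n\nBlinding?\n")

# Made-up domain names, including a duplicate.
names = ["Blinding?", "Random sequence generation",
         "Allocation concealment?", "Random sequence generation"]

done = load_done(done_path)
remaining = write_remaining(names, done, todo_path)
print(remaining)
```

Stripping each line before testing it keeps blank lines out of the `done` set, which a plain `if l:` check on the raw line would not do, since every raw line still carries its newline.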
