In [101]:
!ls


biologydirect-1-1-19		   jbiol-8-6-54.html
biologydirect-1-1-19.html	   LICENSE
biomedcentral1471-2164-7-151	   parasitesandvectors-3-1-115
biomedcentral1471-2164-7-151.html  parasitesandvectors-3-1-115.html
create_subfolders.sh		   README.md
genomebiology2013-14-6-122	   trying-beautiful-soup.ipynb
genomebiology2013-14-6-122.html    virologyj-11-1-26
get-figures.py			   virologyj-11-1-26.html
jbiol-8-6-54			   whats-wrong-jbiol.ipynb

In [102]:
from bs4 import BeautifulSoup

In [103]:
import os
import shutil

Let's have a look at one of the smaller files, virologyj-11-1-26.html:


In [104]:
with open("virologyj-11-1-26.html") as f:  # close the file handle once parsed
    soup = BeautifulSoup(f, "html.parser")  # name the parser explicitly

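Once a page is parsed, individual elements can be queried directly, which is often easier to read than the whole tree. The following is a minimal sketch, not part of the original notebook: it parses a short inline sample (a few `<meta>` tags copied from the head of virologyj-11-1-26.html) so that it runs without the file on disk. The names `sample_html`, `sample_soup` and `citation_metadata` are made up for illustration, and `html.parser` is assumed as the parser.

```python
from bs4 import BeautifulSoup

# Inline sample copied from the <head> of virologyj-11-1-26.html, so the
# sketch does not depend on the file being present.
sample_html = """
<html><head>
<meta content="Virology Journal" name="citation_journal_title"/>
<meta content="Rafal Tokarz" name="citation_author"/>
<meta content="Stephen Sameroff" name="citation_author"/>
<meta content="10.1186/1743-422X-11-26" name="citation_doi"/>
</head><body></body></html>
"""


def citation_metadata(soup):
    """Collect citation_* <meta> tags into a dict of name -> list of values."""
    found = {}
    for tag in soup.find_all("meta"):
        name = tag.get("name", "")
        if name.startswith("citation_"):
            # A name can repeat (one tag per author), so keep a list per name.
            found.setdefault(name, []).append(tag.get("content", ""))
    return found


sample_soup = BeautifulSoup(sample_html, "html.parser")
metadata = citation_metadata(sample_soup)
print(metadata["citation_doi"])
print(metadata["citation_author"])
```

The same hypothetical helper could be called on the `soup` object built from the full file, assuming the page carries the same `citation_*` tags.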
In [105]:
print(soup.prettify())


<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html id="nojs" lang="en-GB" xml:lang="en-GB" xmlns="http://www.w3.org/1999/xhtml" xmlns:m="http://www.w3.org/1998/Math/MathML" xmlns:og="http://ogp.me/ns#" xmlns:wb="“http://open.weibo.com/wb”">
 <head>
  <script>
   window.bmcIsMobile = "classic";
  </script>
  <title>
   Virology Journal | Full text | Genome characterization of Long Island tick rhabdovirus, a new virus identified in Amblyomma americanum ticks
  </title>
  <link href="/images/icon/10067/favicon.ico" rel="shortcut icon" type="image/x-icon"/>
  <link href="/sites/10067/images/70.gif" rel="apple-touch-icon" sizes="72x72"/>
  <meta content="IE=7" http-equiv="X-UA-Compatible"/>
  <!-- <?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:cc="http://web.resource.org/cc/" xmlns="http://purl.org/rss/1.0/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:prism="http://prismstandard.org/namespaces/1.2/basic/">
<cc:Work rdf:about="http://www.virologyj.com/content/11/1/26">
<cc:license rdf:resource="http://creativecommons.org/licenses/by/2.0/"/>
</cc:Work>
<cc:License rdf:about="http://creativecommons.org/licenses/by/2.0/">
<cc:permits rdf:resource="http://web.resource.org/cc/Reproduction"/>
<cc:permits rdf:resource="http://web.resource.org/cc/Distribution"/>
<cc:requires rdf:resource="http://web.resource.org/cc/Notice"/>
<cc:requires rdf:resource="http://web.resource.org/cc/Attribution"/>
<cc:permits rdf:resource="http://web.resource.org/cc/DerivativeWorks"/>
</cc:License>
<item rdf:about="http://www.virologyj.com/content/11/1/26">
<title>Genome characterization of Long Island tick rhabdovirus, a new virus identified in Amblyomma americanum ticks</title>
<dc:title>Genome characterization of Long Island tick rhabdovirus, a new virus identified in Amblyomma americanum ticks</dc:title>
<dc:creator>Tokarz, Rafal</dc:creator>
<dc:creator>Sameroff, Stephen</dc:creator>
<dc:creator>Leon, Maria S</dc:creator>
<dc:creator>Jain, Komal</dc:creator>
<dc:creator>Lipkin, W I</dc:creator>
<dc:identifier>info:doi/10.1186/1743-422X-11-26</dc:identifier>
<dc:identifier>info:pmid/24517260</dc:identifier>
<dc:source>Virology Journal 2014, 11:26</dc:source>
<dc:date>2014-02-11</dc:date>

<prism:publicationName>Virology Journal</prism:publicationName>
<prism:publicationDate>2014-02-11</prism:publicationDate>
<prism:volume>11</prism:volume>
<prism:number>1</prism:number>
<prism:section>Research</prism:section>
<prism:startingPage>26</prism:startingPage>
<prism:copyright>2014 Tokarz et al.; licensee BioMed Central Ltd.</prism:copyright>
</item>
</rdf:RDF>
 -->
  <meta content="Virology Journal" name="citation_journal_title"/>
  <meta content="BioMed Central Ltd" name="citation_publisher"/>
  <meta content="Rafal Tokarz" name="citation_author"/>
  <meta content="Center for Infection and Immunity, Mailman School of Public Health, Columbia University, 722 West 168th Street, Room 1701, New York, NY 10032, USA" name="citation_author_institution"/>
  <meta content="Stephen Sameroff" name="citation_author"/>
  <meta content="Center for Infection and Immunity, Mailman School of Public Health, Columbia University, 722 West 168th Street, Room 1701, New York, NY 10032, USA" name="citation_author_institution"/>
  <meta content="Maria S Leon" name="citation_author"/>
  <meta content="Center for Infection and Immunity, Mailman School of Public Health, Columbia University, 722 West 168th Street, Room 1701, New York, NY 10032, USA" name="citation_author_institution"/>
  <meta content="Komal Jain" name="citation_author"/>
  <meta content="Center for Infection and Immunity, Mailman School of Public Health, Columbia University, 722 West 168th Street, Room 1701, New York, NY 10032, USA" name="citation_author_institution"/>
  <meta content="W I Lipkin" name="citation_author"/>
  <meta content="Center for Infection and Immunity, Mailman School of Public Health, Columbia University, 722 West 168th Street, Room 1701, New York, NY 10032, USA" name="citation_author_institution"/>
  <meta content="1743-422X" name="citation_issn"/>
  <meta content="Genome characterization of Long Island tick rhabdovirus, a new virus identified in Amblyomma americanum ticks" name="citation_title"/>
  <meta content="11" name="citation_volume"/>
  <meta content="1" name="citation_issue"/>
  <meta content="2014-02-11" name="citation_date"/>
  <meta content="26" name="citation_firstpage"/>
  <meta content="10.1186/1743-422X-11-26" name="citation_doi"/>
  <meta content="http://www.virologyj.com/content/pdf/1743-422X-11-26.pdf" name="citation_pdf_url"/>
  <meta content="24517260" name="citation_pmid"/>
  <link href="http://www.virologyj.com/content/pdf/1743-422X-11-26.pdf" rel="alternate" title="PDF" type="application/pdf"/>
  <meta content="Genome characterization of Long Island tick rhabdovirus, a new virus identified in Amblyomma americanum ticks" name="title"/>
  <meta content="Ticks are implicated as hosts to a wide range of animal and human pathogens. The full range of microbes harbored by ticks has not yet been fully explored." name="description"/>
  <meta content="Genome characterization of Long Island tick rhabdovirus, a new virus identified in Amblyomma americanum ticks" name="dc.title"/>
  <meta content="Tokarz, Rafal" name="dc.creator"/>
  <meta content="Sameroff, Stephen" name="dc.creator"/>
  <meta content="Leon, Maria S" name="dc.creator"/>
  <meta content="Jain, Komal" name="dc.creator"/>
  <meta content="Lipkin, W I" name="dc.creator"/>
  <meta content="Ticks are implicated as hosts to a wide range of animal and human pathogens. The full range of microbes harbored by ticks has not yet been fully explored." name="dc.description"/>
  <meta content="Virology Journal 2014 11:26" name="dc.source"/>
  <meta content="text/html" name="dc.format"/>
  <meta content="BioMed Central Ltd" name="dc.publisher"/>
  <meta content="2014-02-11" name="dc.date"/>
  <meta content="Research" name="dc.type"/>
  <meta content="10.1186/1743-422X-11-26" name="dc.identifier"/>
  <meta content="info:pmid/24517260" name="dc.identifier"/>
  <meta content="en" name="dc.language"/>
  <meta content="2014 Tokarz et al.; licensee BioMed Central Ltd." name="dc.copyright"/>
  <meta content="http://creativecommons.org/licenses/by/2.0/" name="dc.rights"/>
  <meta content="reprints@biomedcentral.com" name="dc.rightsAgent"/>
  <meta content="1743-422X" name="prism.issn"/>
  <meta content="Virology Journal" name="prism.publicationName"/>
  <meta content="2014-02-11" name="prism.publicationDate"/>
  <meta content="11" name="prism.volume"/>
  <meta content="1" name="prism.number"/>
  <meta content="Research" name="prism.section"/>
  <meta content="26" name="prism.startingPage"/>
  <meta content="2014 Tokarz et al.; licensee BioMed Central Ltd." name="prism.copyright"/>
  <meta content="reprints@biomedcentral.com" name="prism.rightsAgent"/>
  <meta content="http://www.virologyj.com/content/11/1/26/abstract" name="citation_abstract_html_url"/>
  <meta content="http://www.virologyj.com/content/11/1/26" name="citation_fulltext_html_url"/>
  <link href="http://www.virologyj.com/content/download/xml/1743-422X-11-26.xml" rel="alternate" title="XML version" type="text/xml"/>
  <link href="http://www.virologyj.com/content/figures/1743-422X-11-26-toc.gif" rel="image_src"/>
  <script type="text/javascript">
   // must run before css
document.documentElement.id = "js";
  </script>
  <script src="http://ajax.googleapis.com/ajax/libs/jquery/1.8.2/jquery.min.js">
  </script>
  <script>
   window.bmcIsMobile = "classic";
  </script>
  <link href="/css/main-0.css" media="screen, print" rel="stylesheet" type="text/css"/>
  <link href="/css/plugins-0.css" media="screen, print" rel="stylesheet" type="text/css"/>
  <link href="http://www.bmcimg.com/css/articles-0.css" media="screen, print" rel="stylesheet" type="text/css"/>
  <link href="/css/themes/10067.css" media="screen, print" rel="stylesheet" type="text/css"/>
  <link href="/css/media/print-0.css" media="print" rel="stylesheet" type="text/css"/>
  <!--[if IE]>
                    <link rel="stylesheet" type="text/css" href="/css/hacks/ie-0.css"/>
    <![endif]-->
  <!--[if IE 7]>
                    <link rel="stylesheet" type="text/css" href="/css/hacks/ie7-0.css"/>
    <![endif]-->
  <script type="text/javascript">
   //configuration
    OAS_url = '//oas.biomedcentral.com/RealMedia/ads/';

	// control of advert's position on page [AMC]
    
    	
	
    OAS_sitepage = "virologyj.com/article/10.1186/1743/422x/11/26";
    	
		OAS_listpos = 'Top,x96,Bottom,Right3';
	
	
			
		OAS_query = '';
		
	

    OAS_target = '_top';
    //end of configuration
    OAS_version = 10;
    OAS_rn = '001234567890'; OAS_rns = '1234567890';
    OAS_rn = new String (Math.random()); OAS_rns = OAS_rn.substring (2, 11);
    function OAS_NORMAL(pos) {
		document.write('<a href="' + OAS_url + 'click_nx.ads/' + OAS_sitepage + '/1' + OAS_rns + '@' + OAS_listpos + '!' + pos + '?' + OAS_query + '" target="' + OAS_target + '">');
		document.write('<img src="' + OAS_url + 'adstream_nx.ads/' + OAS_sitepage + '/1' + OAS_rns + '@' + OAS_listpos + '!' + pos + '?' + OAS_query + '" border="0"/></a>');
    }
    
	function refererTerms(referer) {

	if(referer.match("q=") != null) {
		referer = referer.split("q=");
		referer = referer[1].split("&");
		referer = referer[0];
	} else {
		referer = referer;
	}

		if(referer.match("%20") != null && referer.match("%20").length >= 1) {
			referer =  referer.replace(/%20/gi,"+");
			return referer;
		} else { 
			return referer; 
		}
	}
  </script>
 </head>
 <body>
  <div id="oas-campaign" style="display:none">
   virologyj.com/article/10.1186/1743/422x/11/26
  </div>
  <div id="oas-positions" style="display:none">
   Bottom,Top
  </div>
  <script type="text/javascript">
   // Set the version of JavaScript to 10 for a Browser 'Mozilla/3' on webTV 
    OAS_version = 11;
    if ((navigator.userAgent.indexOf('Mozilla/3') != -1) || (navigator.userAgent.indexOf('Mozilla/4.0 WebTV') != -1)) {
     OAS_version = 10; }
    if (OAS_version >= 11) {
    document.write('<scr' + 'ipt type="text/javascript" src="' + OAS_url + 'adstream_mjx.ads/' + OAS_sitepage + '/1' + OAS_rns + '@' + OAS_listpos + '?' + OAS_query + '"><\/script>'); }
  </script>
  <script type="text/javascript">
   document.write('');
	
    function OAS_AD(pos) {
        if (OAS_version >= 11)
         OAS_RICH(pos);
        else
         OAS_NORMAL(pos);
    }
  </script>
  <script type="text/javascript">
   jsonLoaders = function(){
        var _loaders = [];
        return {
            register: function(config){
                config.prefix = config.prefix || '/webapi/1.0/blogs/';
                config.feedLength = config.feedLength || 3;

                _loaders.push(config);
            },
            list: function(){return _loaders}
        };
    }();

    function totext() {
	    document.write('<a href="javascript:returnToText();">Return to text</a>');
    }

    function retrieve_doi(){
        var metas = document.getElementsByTagName('meta');
        var i;
        for (i = 0; i < metas.length; i++)
            if (metas[i].getAttribute('name') == "citation_doi")
                break;
        var citation_doi = metas[i].getAttribute('content');
        return citation_doi;
    }
  </script>
  <script charset="utf-8" src="http://tjs.sjs.sinajs.cn/open/api/js/wb.js" type="text/javascript">
  </script>
  <link exposetoheader="true" href="http://www.virologyj.com/latest/rss" rel="alternate" title="Latest articles" type="application/rss+xml"/>
  <link exposetoheader="true" href="http://www.virologyj.com/mostviewed/rss/" rel="alternate" title="Most viewed" type="application/rss+xml"/>
  <link exposetoheader="true" href="http://www.virologyj.com/latestcomments/rss" rel="alternate" title="Latest comments" type="application/rss+xml"/>
  <script type="text/javascript">
   var _gaq = _gaq || [];
	_gaq.push(['_setAccount', 'UA-9618403-2']);
	_gaq.push(['_gat._anonymizeIp']);
	_gaq.push(['_setLocalRemoteServerMode']);
	
	/*for cross domain tracking*/
	//_gaq.push(['_setAllowLinker', true]);
    //_gaq.push(['_setDomainName', 'virologyj.com']); //this journal's domain
	/**/
	
		_gaq.push(['_trackPageview']); 

	(function() {
		var ga = document.createElement('script'); ga.type = 'text/javascript'; ga.async = true;
		ga.src = ('https:' == document.location.protocol ? 'https://ssl' : 'http://www') + '.google-analytics.com/ga.js';
		var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(ga, s);

		//var iframe = document.getElementById('casesdb-search-widget');
		//if(iframe !== null) {
		//	var pageTracker = _gat._getTrackerByName();
		//	iframe.src = pageTracker._getLinkerUrl('http://www.casesdatabase.com/');
		//}
	})();
  </script>
  <script>
   window.bmcIsMobile = "classic";
  </script>
  <div id="branding" role="banner">
   <dl class="google-ad wide ">
    <dt class="hide">
     <a class="banner-ad" href="http://www.biomedcentral.com/advertisers/digital_advertising">
     </a>
    </dt>
    <dd>
     <!-- OAS AD 'Top' begin -->
     <script type="text/javascript">
      OAS_AD('Top');
     </script>
     <!-- OAS AD 'Top' end -->
    </dd>
   </dl>
   <noscript>
    <dl class="google-ad wide noscript">
     <dt class="hide">
     </dt>
     <dd>
      <a href="http://oas.biomedcentral.com/RealMedia/ads/click_nx.ads/virologyj.com/article/10.1186/1743/422x/11/26/16812017896@Top?">
       <img alt="advert" src="http://oas.biomedcentral.com/RealMedia/ads/adstream_nx.ads/virologyj.com/article/10.1186/1743/422x/11/26/15094599029@Top?"/>
      </a>
     </dd>
    </dl>
   </noscript>
   <div class="sup-panel-outer">
    <div class="sup-panel-inner">
     <style>
      #nojs li.greeting, #nojs li#loginLink {display:none;}
     </style>
     <ul class="login" id="login">
      <noscript>
       <li>
        <a href="/log">
         <img class="greeting" src="/images/log.gif"/>
        </a>
       </li>
      </noscript>
      <li id="welcome">
       <span id="username">
       </span>
      </li>
      <li id="loginLink">
       <a href="">
       </a>
      </li>
     </ul>
     <script>
      function login() {

/* cookie getter/setter */
var allCookies = {
  getItem: function (sKey) {
    if (!sKey || !this.hasItem(sKey)) { return null; }
    return unescape(document.cookie.replace(new RegExp("(?:^|.*;\\s*)" + escape(sKey).replace(/[\-\.\+\*]/g, "\\$&") + "\\s*\\=\\s*((?:[^;](?!;))*[^;]?).*"), "$1"));
  },
  setItem: function (sKey, sValue, vEnd, sPath, sDomain, bSecure) {
    if (!sKey || /^(?:expires|max\-age|path|domain|secure)$/i.test(sKey)) { return; }
    var sExpires = "";
    if (vEnd) {
      switch (vEnd.constructor) {
        case Number:
          sExpires = vEnd === Infinity ? "; expires=Tue, 19 Jan 2038 03:14:07 GMT" : "; max-age=" + vEnd;
          break;
        case String:
          sExpires = "; expires=" + vEnd;
          break;
        case Date:
          sExpires = "; expires=" + vEnd.toGMTString();
          break;
      }
    }
    document.cookie = escape(sKey) + "=" + escape(sValue) + sExpires + (sDomain ? "; domain=" + sDomain : "") + (sPath ? "; path=" + sPath : "") + (bSecure ? "; secure" : "");
  },
  removeItem: function (sKey, sPath) {
    if (!sKey || !this.hasItem(sKey)) { return; }
    document.cookie = escape(sKey) + "=; expires=Thu, 01 Jan 1970 00:00:00 GMT" + (sPath ? "; path=" + sPath : "");
  },
  hasItem: function (sKey) {
    return (new RegExp("(?:^|;\\s*)" + escape(sKey).replace(/[\-\.\+\*]/g, "\\$&") + "\\s*\\=")).test(document.cookie);
  },
  keys: /* optional method: you can safely remove it! */ function () {
    var aKeys = document.cookie.replace(/((?:^|\s*;)[^\=]+)(?=;|$)|^\s*|\s*(?:\=[^;]*)?(?:\1|$)/g, "").split(/\s*(?:\=[^;]*)?;\s*/);
    for (var nIdx = 0; nIdx < aKeys.length; nIdx++) { aKeys[nIdx] = unescape(aKeys[nIdx]); }
    return aKeys;
  }
};

// converts cookie into js object
function parseCookie(cookie) {

    cookie = cookie.split("&");
    var obj = {};

    for(var i = 0; i < cookie.length; i++) {

        var c = cookie[i].split("=");

        c[0] = c[0].replace('"','');
        
        obj[c[0]] = c[1];

    }
    return obj;
}

    var cookie = allCookies.getItem("bmccookie");

    var obj = (cookie != null) ? parseCookie(cookie) : {};
    

        var welcome = "";

        if(obj.hasOwnProperty("fname")) {

            // insert user welcome
            welcome = obj.fname.replace(/\+/g, " ") + " " + obj['lname'].replace(/\+/g," ");

            if(jQuery("#welcome").length > 0) {
              jQuery("#welcome").prepend("<strong>Welcome</strong> ");
              jQuery("#username").prepend('<strong>' + welcome + '</strong><span class="divider"></span>');
            } 

            // update link url
            jQuery("#loginLink a").text("Log off");
            jQuery("#loginLink a").attr("href", window.location.protocol + "//" + window.location.host + "/logoff");

            window.bmcloggedon = true;


        } else if(obj.hasOwnProperty("institution_name")) {

            // insert institution name
            welcome = obj["institution_name"].replace(/\+/g," ");
            
            if(jQuery("#welcome").length > 0) { 
                jQuery("#welcome").prepend("<strong>Welcome</strong> ");
                jQuery("#username").prepend('<strong>' + welcome + '</strong><span class="divider"></span>');
            } 

            // update link url
            jQuery("#loginLink a").text("Log on");
            jQuery("#loginLink a").attr("href", window.location.protocol + "//" + window.location.host + "/logon");

            window.bmcloggedon = false;


        } else {
            jQuery("#welcome").innerHTML = "";
            jQuery("#loginLink a").text("Log on");
            jQuery("#loginLink a").attr("href", window.location.protocol + "//" + window.location.host + "/logon");

            window.bmcloggedon = false;

        }


        // if athens login cookie doesn't exist and the athens login link is visible then remove it.
        if(allCookies.hasItem("athens") != true && document.getElementById("loginAthensLink") != null) {
          jQuery("#loginAthensLink").hide();
        }

}

login();
     </script>
     <ul class="nav-sup">
      <li class="current" id="BMC">
       <a href="http://www.biomedcentral.com/">
        <span class="img-title">
        </span>
        <span class="text-title">
         BioMed Central
        </span>
       </a>
      </li>
      <li>
       <a href="http://www.biomedcentral.com/journals">
        <span>
         Journals
        </span>
       </a>
      </li>
      <li>
       <a href="http://www.biomedcentral.com/gateways">
        <span>
         Gateways
        </span>
       </a>
      </li>
     </ul>
    </div>
   </div>
   <div class="branding-inner">
    <style>
     .survey {
	margin: 10px 0;
}
.survey, .survey .survey-sentence {
	font-size: 13px;
}
.survey-button a, .survey-button a:active, .survey-button a:visited, .survey-button a:hover  {
    color: #FFFFFF;
    text-decoration: none;
}
.survey-button {
	background-color: #FFA500;
    border-radius: 5px;
    color: #FFFFFF;
    font-weight: bold;
    font-size: 13px;
    font-weight: bold;
    padding: 5px 13px;
    background: #ef8142; /* Old browsers */
	background: -moz-linear-gradient(top,  #ef8142 0%, #ef8142 50%, #a64510 100%, #a64510 100%); /* FF3.6+ */
	background: -webkit-gradient(linear, left top, left bottom, color-stop(0%,#ef8142), color-stop(50%,#ef8142), color-stop(100%,#a64510), color-stop(100%,#a64510)); /* Chrome,Safari4+ */
	background: -webkit-linear-gradient(top,  #ef8142 0%,#ef8142 50%,#a64510 100%,#a64510 100%); /* Chrome10+,Safari5.1+ */
	background: -o-linear-gradient(top,  #ef8142 0%,#ef8142 50%,#a64510 100%,#a64510 100%); /* Opera 11.10+ */
	background: -ms-linear-gradient(top,  #ef8142 0%,#ef8142 50%,#a64510 100%,#a64510 100%); /* IE10+ */
	background: linear-gradient(to bottom,  #ef8142 0%,#ef8142 50%,#a64510 100%,#a64510 100%); /* W3C */
	filter: progid:DXImageTransform.Microsoft.gradient( startColorstr='#ef8142', endColorstr='#a64510',GradientType=0 ); /* IE6-9 */

}
    </style>
    <div class="logo">
     <a href="/">
      <img alt="Virology Journal" src="/sites/10067/images/logo.gif" title="Virology Journal Logo"/>
     </a>
    </div>
    <div class="official" id="impact-factor">
     <img alt="official impact factor" src="/images/branding/official.gif" title="2.09"/>
     <span id="impact-factor-value">
      2.09
     </span>
    </div>
    <div class="module search gray small " role="search">
     <div class="module-inner padded">
      <form action=" /quickSearchRedirectURL " class="searchForm" method="get" onsubmit="window.location.href=this['url'].value + '?terms=' + encodeURIComponent(this['terms'].value); return false;">
       <fieldset class="search">
        <span>
         Search
         <select name="url">
          <option selected="" value="http://www.virologyj.com/search/results">
           Virology Journal
          </option>
          <option value="http://www.biomedcentral.com/search/results">
           BioMed Central
          </option>
         </select>
         for
        </span>
        <input alt="Search terms" class="text " id="searchTerms" name="terms" type="text"/>
        <button class="w37" type="submit" value="Go">
         Go
        </button>
       </fieldset>
      </form>
     </div>
     <a class="advanced-search" href="/search">
      Advanced search
     </a>
    </div>
    <div id="mobile-nav-list">
     <ul class="primary-nav" role="navigation">
      <li>
       <a href="/">
        <span id="home-tab">
         Home
        </span>
       </a>
      </li>
      <li class="current">
       <a href="/content">
        <span id="articles-tab">
         Articles
        </span>
       </a>
      </li>
      <li>
       <a href="/authors/instructions">
        <span id="authors-tab">
         Authors
        </span>
       </a>
      </li>
      <li>
       <a href="/about/reviewers">
        <span id="reviewers-tab">
         Reviewers
        </span>
       </a>
      </li>
      <li>
       <a href="/about">
        <span id="about-tab">
         About this journal
        </span>
       </a>
      </li>
      <li>
       <a href="/my">
        <span id="my-tab">
         My Virology Journal
        </span>
       </a>
      </li>
     </ul>
    </div>
   </div>
  </div>
  <hr class="hide"/>
  <div class="container">
   <div class="overflow-visible" id="content" role="main">
    <!--div class="branding-mobile">
		<h1 class="logo">
	    <a href="/">
		    <img alt="Virology Journal" title="Virology Journal Logo" src="/sites/10067/images/logo.gif" />
		</a>
		</h1>
	</div><br/-->
    <div class="outline-wrapper block mobile-hidden" id="left-article-box">
     <ul class="box" id="box-outline">
      <ul class="outline" id="outline">
       <li>
        <a href="#">
         Top
        </a>
       </li>
       <li>
        <a href="#abs">
         Abstract
        </a>
       </li>
       <li>
        <a href="#sec1">
         Background
        </a>
       </li>
       <li>
        <a href="#sec2">
         Results
        </a>
       </li>
       <li>
        <a href="#sec3">
         Discussion
        </a>
       </li>
       <li>
        <a href="#sec4">
         Conclusions
        </a>
       </li>
       <li>
        <a href="#sec5">
         Materials and methods
        </a>
       </li>
       <li>
        <a href="#sec6">
         Competing interests
        </a>
       </li>
       <li>
        <a href="#sec7">
         Authors’ contributions
        </a>
       </li>
       <li>
        <a href="#ack">
         Acknowledgments
        </a>
       </li>
       <li class="lowest">
        <a href="#refs">
         References
        </a>
       </li>
      </ul>
     </ul>
     <dl id="box-outline1">
     </dl>
    </div>
    <div class="mobile-hidden" id="right-panel">
     <div id="mobile-sidebar">
      <div id="mobile-sidebar-tab">
       <span>
        <a href="#" id="mobile-sidebar-tab-link">
         <img src="/images/sidebar-hidden.png"/>
         <img src="/images/sidebar-shown.png" style="display:none;"/>
        </a>
       </span>
      </div>
      <div id="mobile-sidebar-list">
       <div id="article-navigation-bar">
        <div class="issue-information block">
         <h5>
          <a href="http://www.virologyj.com">
           <strong>
            Virology Journal
           </strong>
          </a>
         </h5>
         <ul class="square normal" id="section-name">
          <li>
           <a href="http://www.virologyj.com/sections/negativessrna">
            Negative-strand RNA viruses
           </a>
          </li>
         </ul>
         <ul class="square normal">
          <li>
           <a href="http://www.virologyj.com/content/11">
            Volume 11
           </a>
          </li>
         </ul>
        </div>
        <div class="article-information block" id="article-info">
         <div id="viewing-options-links">
          <h5>
           Viewing options
          </h5>
          <ul class="square normal">
           <li>
            <a href="/content/11/1/26/abstract">
             Abstract
            </a>
           </li>
           <li>
            <strong>
             Full text
            </strong>
           </li>
           <li class="pdfFileSize">
            <a href="/content/pdf/1743-422X-11-26.pdf" onclick="_gaq.push(['_trackEvent', 'PDF download', 'Article Sidebar', '/content/11/1/26', 1, true]);">
             PDF
            </a>
            <span>
             (680KB)
            </span>
           </li>
           <li>
            <a href="/content/epub/1743-422X-11-26.epub">
             ePUB
            </a>
            (112KB)
           </li>
          </ul>
         </div>
         <div id="associated-material-links">
          <h5>
           Associated material
          </h5>
          <ul class="square normal">
           <li>
            <a href="/pubmed/24517260">
             PubMed record
            </a>
           </li>
           <li>
            <a href="/content/11/1/26/about">
             Article metrics
            </a>
           </li>
           <li>
            <a href="/content/11/1/26/comments">
             Readers' comments
            </a>
           </li>
           <li>
           </li>
          </ul>
         </div>
         <div id="related-literature-links">
          <h5>
           Related literature
          </h5>
          <ul class="normal">
           <li>
            <a href="http://www.virologyj.com/content/11/1/26/about#citations">
             Cited by
            </a>
           </li>
           <li id="google-blog-search">
            <a href="http://www.google.com/search?tbm=blg&amp;hl=en&amp;source=hp&amp;biw=1280&amp;bih=899&amp;q=%22Genome+characterization+of+Long+Island+tick+rhabdovirus%2C+a+new+virus+identified+in+Amblyomma+americanum+ticks%22&amp;btnG=Search">
             Google blog search
            </a>
           </li>
           <h6>
            Other articles by authors
           </h6>
           <li>
            <a class="collapser">
             <i>
             </i>
             on Google Scholar
            </a>
            <ul class="hidebeforeload" id="authg">
             <li>
              <a href="http://scholar.google.com/scholar?q=author%3A%22R+Tokarz%22">
               Tokarz R
              </a>
             </li>
             <li>
              <a href="http://scholar.google.com/scholar?q=author%3A%22S+Sameroff%22">
               Sameroff S
              </a>
             </li>
             <li>
              <a href="http://scholar.google.com/scholar?q=author%3A%22MS+Leon%22">
               Leon MS
              </a>
             </li>
             <li>
              <a href="http://scholar.google.com/scholar?q=author%3A%22K+Jain%22">
               Jain K
              </a>
             </li>
             <li>
              <a href="http://scholar.google.com/scholar?q=author%3A%22WI+Lipkin%22">
               Lipkin WI
              </a>
             </li>
            </ul>
           </li>
           <li>
            <a class="collapser">
             <i>
             </i>
             on PubMed
            </a>
            <ul class="hidebeforeload" id="authpm">
             <li>
              <a href="http://www.ncbi.nlm.nih.gov/pubmed?term=Tokarz_R%20[Author]">
               Tokarz R
              </a>
             </li>
             <li>
              <a href="http://www.ncbi.nlm.nih.gov/pubmed?term=Sameroff_S%20[Author]">
               Sameroff S
              </a>
             </li>
             <li>
              <a href="http://www.ncbi.nlm.nih.gov/pubmed?term=Leon_MS%20[Author]">
               Leon MS
              </a>
             </li>
             <li>
              <a href="http://www.ncbi.nlm.nih.gov/pubmed?term=Jain_K%20[Author]">
               Jain K
              </a>
             </li>
             <li>
              <a href="http://www.ncbi.nlm.nih.gov/pubmed?term=Lipkin_WI%20[Author]">
               Lipkin WI
              </a>
             </li>
            </ul>
           </li>
          </ul>
          <h6>
           Related articles/pages
          </h6>
          <ul class="square normal">
           <li>
            <a href="http://www.google.co.uk/search?hl=en&amp;q=related:http://www.virologyj.com/content/11/1/26">
             on Google
            </a>
           </li>
           <li>
            <a href="http://scholar.google.com/scholar?q=related:http://www.virologyj.com/content/11/1/26">
             on Google Scholar
            </a>
           </li>
           <li>
            <a href="/pubmed/related/24517260">
             on PubMed
            </a>
           </li>
          </ul>
         </div>
         <h5>
          Tools
         </h5>
         <ul class="square normal">
          <li>
           <a href="/content/11/1/26/citation">
            Download references
           </a>
          </li>
          <li>
           <a href="/content/download/xml/1743-422X-11-26.xml">
            Download XML
           </a>
          </li>
          <li>
           <a href="/content/11/1/26/email?from=standard">
            Email to a friend
           </a>
          </li>
          <li>
           <a href="https://www.odysseypress.com/onlinehost/reprint_order.php?type=A&amp;page=0&amp;journal=738&amp;doi=10.1186/1743-422X-11-26&amp;volume=11&amp;issue=1&amp;title=Genome+characterization+of+Long+Island+tick+rhabdovirus%2C+a+new+virus+identified+in+Amblyomma+americanum+ticks&amp;author_name=Rafal Tokarz&amp;start_page=1&amp;end_page=5">
            Order reprints
           </a>
          </li>
          <li>
           <a href="/content/11/1/26/postcomment">
            Post a comment
           </a>
          </li>
          <ul id="download-to-links">
           <a class="downloadto_title btn" href="#">
            <span class="icon">
            </span>
            Download to ...
           </a>
           <ul class="iconlist hidebeforeload" id="downloadto" style="display: none;">
            <li>
             <a href="http://redirect.papersapp.com/redirect?url=http%3A%2F%2Fwww.virologyj.com%2Fcontent%2F11%2F1%2F26" onclick="largepopup(this.href,'papers',800,600);return false">
              <span class="share-icons papers">
              </span>
              Papers
             </a>
            </li>
            <li>
             <a href="http://www.mendeley.com/import/?url=http://www.virologyj.com/content/11/1/26" onclick="largepopup(this.href,'mendeley',800,600);return false">
              <span class="share-icons mendeley">
              </span>
              Mendeley
             </a>
            </li>
           </ul>
          </ul>
          <ul class="iconlist" id="mobile-downloadto">
           <h6>
            Download to ...
           </h6>
           <li>
            <a href="http://redirect.papersapp.com/redirect?url=http%3A%2F%2Fwww.virologyj.com%2Fcontent%2F11%2F1%2F26" onclick="largepopup(this.href,'papers',800,600);return false">
             <span class="share-icons papers">
             </span>
             Papers
            </a>
           </li>
           <li>
            <a href="http://www.mendeley.com/import/?url=http://www.virologyj.com/content/11/1/26" onclick="largepopup(this.href,'mendeley',800,600);return false">
             <span class="share-icons mendeley">
             </span>
             Mendeley
            </a>
           </li>
          </ul>
         </ul>
         <h5>
          Share this article
         </h5>
         <ul id="social-networking-links">
          <li>
           <div class="fb-like" data-action="recommend" data-href="http://www.virologyj.com/content/11/1/26" data-layout="button_count" data-send="false" data-show-faces="false" data-width="100">
           </div>
          </li>
          <li>
           <a class="twitter-share-button" data-counturl="http://www.virologyj.com/content/11/1/26" data-hashtags="virologyjournal
" data-text="Genome characterization of Long Island tick rhabdovirus, a new virus identified in Amblyomma americanum ticks" data-url="http://www.virologyj.com/content/11/1/26" href="http://twitter.com/share">
            Tweet
           </a>
          </li>
          <li>
           <g:plusone size="medium" width="100">
           </g:plusone>
          </li>
          <li>
           <wb:share-button addition="number" ralateuid="2216240737" title="Genome characterization of Long Island tick rhabdovirus, a new virus identified in Amblyomma americanum ticks http://www.virologyj.com/content/11/1/26" type="button">
           </wb:share-button>
          </li>
          <a class="sharethis_title btn" href="#">
           <span class="icon">
           </span>
           More options...
          </a>
          <ul class="iconlist hidebeforeload" id="sharethis">
           <li>
            <a href="http://www.citeulike.org/posturl?url=http://www.virologyj.com/content/11/1/26" onclick="largepopup(this.href,'citeulike_popup_post',800,600);return false">
             <span class="share-icons citeulike">
             </span>
             Citeulike
            </a>
           </li>
           <li>
            <a href="http://www.linkedin.com/shareArticle?mini=true&amp;url=%26%2335%3Bvirologyjournal%0D%0AGenome%20characterization%20of%20Long%20Island%20tick%20rhabdovirus%2C%20a%20new%20virus%20identified%20in%20Amblyomma%20americanum%20ticks%20http%3A%2F%2Fwww.virologyj.com%2Fcontent%2F11%2F1%2F26&amp;title=Genome characterization of Long Island tick rhabdovirus, a new virus identified in Amblyomma americanum ticks" onclick="largepopup(this.href,'linkedin',800,600);return false">
             <span class="share-icons linkedin">
             </span>
             LinkedIn
            </a>
           </li>
           <li>
            <a href="http://del.icio.us/post?url=http://www.virologyj.com/content/11/1/26&amp;title=Genome characterization of Long Island tick rhabdovirus, a new virus identified in Amblyomma americanum ticks" onclick="largepopup(this.href,'Del.icio.us',800,600);return false">
             <span class="share-icons delicious">
             </span>
             Del.icio.us
            </a>
           </li>
           <li>
            <a href="/content/11/1/26/email">
             <span class="share-icons email">
             </span>
             Email
            </a>
           </li>
           <li>
            <a href="http://www.facebook.com/sharer.php?u=http://www.virologyj.com/content/11/1/26" onclick="largepopup(this.href,'facebook',800,600);return false">
             <span class="share-icons facebook">
             </span>
             Facebook
            </a>
           </li>
           <li class="gp">
            <div class="googlehider mobile-hide">
             <div class="g-plusone" data-annotation="none">
             </div>
            </div>
            <span class="share-icons googleplus mobile-hide">
            </span>
            <div class="googlehider2 mobile-hide">
             <div class="g-plusone mobile-hide" data-annotation="none">
             </div>
            </div>
            <span class="share-icons googleplus">
            </span>
            <span class="gp-link">
             <a href="#">
              Google+
             </a>
            </span>
           </li>
           <li>
            <a href="http://www.mendeley.com/import/?url=http://www.virologyj.com/content/11/1/26" onclick="largepopup(this.href,'mendeley',800,600);return false">
             <span class="share-icons mendeley">
             </span>
             Mendeley
            </a>
           </li>
           <li>
            <a href="http://twitter.com/?status=Genome%20characterization%20of%20Long%20Island%20tick%20rhabdovirus%2C%20a%20new%20virus%20identified%20in%20Amblyomma%20americanum%20ticks%20http%3A%2F%2Fwww.virologyj.com%2Fcontent%2F11%2F1%2F26+%23virologyjournal
" onclick="largepopup(this.href,'twitter',800,600);return false">
             <span class="share-icons twitter">
             </span>
             Twitter
            </a>
           </li>
          </ul>
         </ul>
        </div>
        <script>
         window.bmcIsMobile = "classic";
        </script>
        <div class="button-collection-margins block mobile-border-bottom" id="signup-to-etoc">
         <h3>
          Email updates
         </h3>
         <p id="receive_issue_alerts_journal_updates_text">
          Keep up to date with the latest news and content from Virology Journal and BioMed Central.
         </p>
         <form action="/signuptoupdates.html" id="articleSignupForUpdates" method="post">
          <div id="message-box">
          </div>
          <fieldset class="block-form">
           <input class="text " id="email" name="emailAddress" onblur="javascript: if (this.value == '') { this.value = 'email address'; }" onfocus="javascript: if (this.value == 'email address') { this.value = ''; }" type="text" value="email address"/>
           <input id="returnUrl" name="returnUrl" type="hidden" value=""/>
           <input id="listName" name="listName" type="hidden" value="xniche-virjou"/>
           <input id="journalName" name="journalName" type="hidden" value="Virology Journal"/>
           <button class="w74 right" journal="" onclick="_gaq.push('_trackEvent', 'Signup', 'Signup for Email updates', " onmouseout="this.className='w74 right'" onmouseover="this.className='w74 hover right'" type="submit" value="Sign up" virology="">
            Sign up
           </button>
          </fieldset>
         </form>
        </div>
        <div id="biome-badge" style="width: 100%">
         <script>
          window.onmessage = function(e) {
                    if(e.data == "biome-failed") {
                        console.log(e);
                        document.getElementById("biome-badge").style.display = "none";
                    }
                };
         </script>
         <iframe height="75px" src="http://www.biomedcentral.com/sites/9001/biome-widget.html?doi=10.1186/1743-422X-11-26&amp;size=large" width="100%">
          Your browser does not support iframes
         </iframe>
        </div>
        <div id="advert-offset">
         <dl class="google-ad ">
          <dt class="hide">
           <a class="skyscraper-ad" href="http://www.biomedcentral.com/advertisers/digital_advertising">
            Advertisement
           </a>
          </dt>
          <dd>
           <!-- OAS AD 'Right3' begin -->
           <script type="text/javascript">
            OAS_AD('Right3');
           </script>
           <!-- OAS AD 'Right3' end -->
          </dd>
         </dl>
         <noscript>
          <dl class="google-ad noscript">
           <dt class="hide">
            <a href="http://www.biomedcentral.com/advertisers/digital_advertising">
             Advertisement
            </a>
           </dt>
           <dd>
            <a href="http://oas.biomedcentral.com/RealMedia/ads/click_nx.ads/virologyj.com/article/10.1186/1743/422X/11/26/16320158491@Right3?">
             <img alt="advert" src="http://oas.biomedcentral.com/RealMedia/ads/adstream_nx.ads/virologyj.com/article/10.1186/1743/422X/11/26/11797354811@Right3?"/>
            </a>
           </dd>
          </dl>
         </noscript>
        </div>
       </div>
       <div class="mobile-sidebar-gradient top">
       </div>
       <div class="mobile-sidebar-gradient bottom">
       </div>
      </div>
     </div>
    </div>
    <div class="rounded custom white article full-text">
     <div class="wrap-inner content">
      <div class="padded-inner">
       <script>
        window.bmcIsMobile = "classic";
       </script>
       <div id="topmatter">
        <a href="/about/access">
         <img alt="Open Access" class="access mr15" src="/images/articles/openaccess-large.png"/>
        </a>
        <a href="http://www.biomedcentral.com/about/mostviewed/">
         <img alt="Highly Accessed" class="access mr15" src="/images/articles/highlyaccessed-large.png"/>
        </a>
        <span class="articletype">
         Research
        </span>
        <h1>
         Genome characterization of Long Island tick rhabdovirus, a new virus identified in
         <em>
          Amblyomma americanum
         </em>
         ticks
        </h1>
        <div class="singleins">
         <p class="authors">
          <strong>
           Rafal Tokarz
          </strong>
          <sup>
           *
          </sup>
          ,
          <strong>
           Stephen Sameroff
          </strong>
          ,
          <strong>
           Maria Sanchez Leon
          </strong>
          ,
          <strong>
           Komal Jain
          </strong>
          and
          <strong>
           W Ian Lipkin
          </strong>
         </p>
         <div id="affiliations">
          <div class="module gray inner">
           <div class="module-inner padded-inner">
            <ul>
             <li>
              <p class="authors">
               <span>
                *
               </span>
               Corresponding author: Rafal Tokarz
               <a href="mailto:rt2249@cumc.columbia.edu">
                rt2249@cumc.columbia.edu
               </a>
              </p>
             </li>
            </ul>
            <p class="options">
             <a class="affiliations-toggle" href="#">
              <i class="arrow">
              </i>
              Author Affiliations
             </a>
            </p>
            <section>
             <div class="collapsible-content">
              <div id="ins_container" style="display: block;">
               <p class="singleInstitute">
                Center for Infection and Immunity, Mailman School of Public Health, Columbia University, 722 West 168th Street, Room 1701, New York, NY 10032, USA
               </p>
              </div>
              <p id="authoremails">
               For all author emails, please
               <a href="/logon">
                log on
               </a>
               .
              </p>
             </div>
            </section>
           </div>
          </div>
         </div>
        </div>
        <section class="cit">
         <div class="collapsible-content">
          <p>
           <em>
            Virology Journal
           </em>
           2014,
           <strong>
            11
           </strong>
           :26
           <span class="pseudotab">
            doi:10.1186/1743-422X-11-26
           </span>
          </p>
          <br/>
          <p>
           The electronic version of this article is the complete one and can be found online at:
           <a href="http://www.virologyj.com/content/11/1/26">
            http://www.virologyj.com/content/11/1/26
           </a>
          </p>
          <br/>
          <table cellpadding="0" cellspacing="0">
           <tbody>
            <tr>
             <td>
              Received:
             </td>
             <td>
              15 January 2014
             </td>
            </tr>
            <tr>
             <td>
              Accepted:
             </td>
             <td>
              10 February 2014
             </td>
            </tr>
            <tr>
             <td>
              Published:
             </td>
             <td>
              11 February 2014
             </td>
            </tr>
           </tbody>
          </table>
          <!--<br/>-->
          <div style="line-height:140%">
           <p>
            © 2014 Tokarz et al.; licensee BioMed Central Ltd.
            <br/>
           </p>
           <p>
            This is an Open Access article distributed under the terms of the Creative Commons Attribution License (
            <a href="http://creativecommons.org/licenses/by/2.0">
             http://creativecommons.org/licenses/by/2.0
            </a>
            ), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
           </p>
          </div>
         </div>
         <div id="article-body">
          <section>
           <a name="abs">
           </a>
           <h3>
            Abstract
           </h3>
           <div class="collapsible-content">
            <h4>
             Background
            </h4>
            <p style="line-height:160%">
             Ticks are implicated as hosts to a wide range of animal and human pathogens. The full
         range of microbes harbored by ticks has not yet been fully explored.
            </p>
            <h4>
             Methods
            </h4>
            <p style="line-height:160%">
             As part of a viral surveillance and discovery project in arthropods, we used unbiased
         high-throughput sequencing to examine viromes of ticks collected on Long Island, New
         York in 2013.
            </p>
            <h4>
             Results
            </h4>
            <p style="line-height:160%">
             We detected and sequenced the complete genome of a novel rhabdovirus originating from
         a pool of
             <em>
              Amblyomma americanum
             </em>
             ticks. This virus, which we provisionally name Long Island tick rhabdovirus, is distantly
         related to Moussa virus from Africa.
            </p>
            <h4>
             Conclusions
            </h4>
            <p style="line-height:160%">
             The Long Island tick rhabdovirus may represent a novel species within family
             <em>
              Rhabdoviridae
             </em>
             .
            </p>
           </div>
          </section>
          <span id="keywords">
           <h5 class="inline">
            Keywords:
           </h5>
           Ticks; Rhabdovirus; High-throughput sequencing;
           <em>
            Amblyomma americanum
           </em>
          </span>
          <section>
           <a name="sec1">
           </a>
           <h3>
            Background
           </h3>
           <div class="collapsible-content">
            <p style="line-height:160%">
             The family
             <em>
              Rhabdoviridae
             </em>
             consists of a large group of enveloped, single-stranded, negative sense RNA viruses
         that infect a wide range of vertebrates, invertebrates, and plants
             <a name="d43115e184">
             </a>
             [
             <a href="#B1" onclick="LoadInParent('#B1'); return false;">
              1
             </a>
             ]. Their genome typically consists of at least five open reading frames (ORFs) organized
         in a linear order 3′-N-P-M-G-L-5′ that encode the viral nucleocapsid (N), phosphoprotein
         (P), matrix protein (M), glycoprotein (G) and RNA polymerase (L). In addition to these
         genes, many rhabdoviruses contain smaller ORFs that encode additional accessory proteins,
         most without known function. Currently,
             <em>
              Rhabdoviridae
             </em>
             consists of nine named genera (
             <em>
              Cytorhabdovirus, Ephemerovirus, Lyssavirus, Novirhabdovirus, Nucleorhabdovirus, Perhabdovirus,
            Sigmavirus, Tibrovirus, Vesiculovirus
             </em>
             ), although many tentative rhabdoviruses still await taxonomic classification
             <a name="d43115e194">
             </a>
             [
             <a href="#B2" onclick="LoadInParent('#B2'); return false;">
              2
             </a>
             ].
            </p>
            <p style="line-height:160%">
             Arthropods are essential in transmission of many pathogenic rhabdoviruses. In the
         context of a program in viral surveillance and discovery in arthropods, we examined
         viromes of ticks collected in New York State and identified a novel rhabdovirus associated
         with the lone star tick,
             <em>
              Amblyomma americanum
             </em>
             . We provisionally name this virus Long Island tick rhabdovirus.
            </p>
           </div>
          </section>
          <section>
           <a name="sec2">
           </a>
           <h3>
            Results
           </h3>
           <div class="collapsible-content">
            <p style="line-height:160%">
             Analysis of high-throughput sequencing (HTS) data by BLASTx revealed sequences with
         homology to all five prototypical rhabdovirus proteins. Homology searches indicated
         these sequences were most similar to Moussa virus (MOUV) and therefore all were assembled
         to MOUV as a reference genome
             <a name="d43115e209">
             </a>
             [
             <a href="#B3" onclick="LoadInParent('#B3'); return false;">
              3
             </a>
             ]. Amino acid analysis of the coding sequence suggested this virus likely represented
         a novel rhabdovirus; hence, we tentatively named it Long Island tick rhabdovirus (LITRV),
         after its geographical location and host.
            </p>
            <h4>
             Genome
            </h4>
            <p style="line-height:160%">
              The complete genome of LITRV comprises 11,176 nucleotides (nt) and contains non-coding
         3′ and 5′ sequences and five main ORFs encoded in linear order (Figure
             <a name="d43115e218">
             </a>
             <a href="http://www.virologyj.com/content/11/1/26/figure/F1" onclick="popup('http://www.virologyj.com/content/11/1/26/figure/F1','',800,470); return false;">
              1
             </a>
             A). Characteristic of rhabdoviruses, the coding regions are flanked by conserved sequences
         that likely serve as transcription initiation and transcription termination/polyadenylation
         signals. The polyadenylation signal consists of 3′-GAACUUUUUUU, which is followed
         by a 2 nt intergenic sequence and a putative transcription initiation sequence 3′-UUGUU(U/G)N(G/A/U)U
         (Figure
             <a name="d43115e221">
             </a>
             <a href="http://www.virologyj.com/content/11/1/26/figure/F1" onclick="popup('http://www.virologyj.com/content/11/1/26/figure/F1','',800,470); return false;">
              1
             </a>
             B). Homology search of all five coding regions revealed that each ORF is most similar
         to the corresponding ORF of MOUV (Table
             <a name="d43115e224">
             </a>
             <a href="http://www.virologyj.com/content/11/1/26/table/T1" onclick="popup('http://www.virologyj.com/content/11/1/26/table/T1','',800,470); return false;">
              1
             </a>
             ). Both MOUV and LITRV cluster together and form a distinct phylogenetic clade within
             <em>
              Rhabdoviridae
             </em>
             (Figure
             <a name="d43115e230">
             </a>
             <a href="http://www.virologyj.com/content/11/1/26/figure/F2" onclick="popup('http://www.virologyj.com/content/11/1/26/figure/F2','',800,470); return false;">
              2
             </a>
              ). Similar to MOUV, only ORFs 1, 4 and 5, encoding the putative N, G and L proteins,
         display homology to corresponding
             <em>
              Rhabdoviridae
             </em>
              proteins. ORFs 2 and 3, encoding the putative P and M proteins, display no homology
         to any rhabdovirus proteins outside of MOUV. The LITRV genome also contains four alternative
         ORFs located within the P, M and G ORFs. These ORFs were designated P’, M’, G1’
         and G2’, and would encode proteins of 81, 98, 106 and 105 amino acids (aa), respectively,
         with no significant sequence identity with any other
             <em>
              Rhabdoviridae
             </em>
             protein by BLASTp analysis. Of the four ORFs, only P’ and G2’ have an initiation
         codon in suitable context for translation.
            </p>
            <div class="figs">
             <div class="fig">
              <p>
               <a href="http://www.virologyj.com/content/11/1/26/figure/F1" onclick="popup('http://www.virologyj.com/content/11/1/26/figure/F1','F1',800,470); return false;">
                <img align="top" alt="thumbnail" class="thumbnail" src="/content/figures/1743-422X-11-26-1.gif"/>
                <strong>
                 Figure 1.
                </strong>
               </a>
               <strong>
                Organization of the LITRV genome. A)
               </strong>
               Schematic representation of LITRV ORFs.
               <strong>
                B)
               </strong>
               LITRV coding regions and corresponding transcription regulatory sequences.
              </p>
             </div>
             <div class="table">
              <p>
               <a href="http://www.virologyj.com/content/11/1/26/table/T1" onclick="popup('http://www.virologyj.com/content/11/1/26/table/T1','T1',800,470); return false;">
                <strong>
                 Table 1.
                </strong>
               </a>
               <strong>
                Comparison of LITRV and MOUV ORFs
               </strong>
              </p>
             </div>
             <div class="fig">
              <p>
               <a href="http://www.virologyj.com/content/11/1/26/figure/F2" onclick="popup('http://www.virologyj.com/content/11/1/26/figure/F2','F2',800,470); return false;">
                <img align="top" alt="thumbnail" class="thumbnail" src="/content/figures/1743-422X-11-26-2.gif"/>
                <strong>
                 Figure 2.
                </strong>
               </a>
               <strong>
                Phylogeny of LITRV.
               </strong>
               Maximum likelihood phylogenetic tree based on full length L protein sequences of
         currently recognized species of
               <em>
                Rhabdoviridae
               </em>
               . Gray boxes represent ICTV-accepted species within the genus. LITRV is indicated
         by *. Accession numbers are provided next to the viral names. Due to excessive divergence,
         only the type species for
               <em>
                Cytorhabdovirus
               </em>
               ,
               <em>
                Novirhabdovirus
               </em>
               and
               <em>
                Nucleorhabdovirus
               </em>
               are included. The
               <em>
                Vesiculovirus
               </em>
               piry virus is not included due to limited available sequence.
              </p>
             </div>
            </div>
            <p style="line-height:160%">
             As in all rhabdoviruses, the 3′ and 5′ terminal non-coding regions of LITRV contain
         partially complementary regions, particularly at the terminal ends. The 3′ leader
         sequence is 49 nt long while the length of the 5′ trailer sequence is 115 nt. The
         terminal portions of the non-coding sequences are highly conserved in both LITRV and
         MOUV with 13 out of 19 nt identical in both viruses.
            </p>
            <h4>
             ORF analysis
            </h4>
            <p style="line-height:160%">
             The 1413 nt ORF1 is predicted to encode a nucleoprotein of 470 aa that is 38% similar
         to MOUV N and contains many conserved rhabdovirus N domains including the RNA binding
         motif SPYS (Table
             <a name="d43115e409">
             </a>
             <a href="http://www.virologyj.com/content/11/1/26/table/T1" onclick="popup('http://www.virologyj.com/content/11/1/26/table/T1','',800,470); return false;">
              1
             </a>
              ). ORF2 comprises 831 nt and, in accordance with its position within the rhabdoviral
         genome, is predicted to encode a 276 aa phosphoprotein. This protein shares 22.3%
         identity with ORF2 of Moussa virus and is predicted to contain 17 serine, 6 threonine
         and 1 tyrosine potential phosphorylation sites (
             <a href="http://www.cbs.dtu.dk/services/NetPhos/">
              http://www.cbs.dtu.dk/services/NetPhos/
             </a>
             <a alt="" class="xpushbutton" href="http://www.webcitation.org/query.php?url=http://www.cbs.dtu.dk/services/NetPhos/&amp;refdoi=10.1186/1743-422x-11-26" title="Archive copy of webpage">
              webcite
             </a>
             ). The 618 nt ORF3 is predicted to encode a 206 aa protein. The protein shares 29.4%
         identity with MOUV ORF3. No other known motifs or domains were recognized, although
         a polyproline region, consisting of 8 consecutive proline residues, was identified
         at aa 43–50. ORF 4, comprising 1521 nt, is predicted to encode a 506 aa class I transmembrane
         glycoprotein with 29.4% aa identity to MOUV G protein. The protein sequence contains
         a predicted N terminal 18 aa signal peptidase cleavage site, a hydrophobic trans-membrane
         domain at positions aa 458–480 followed by a C terminal 26 aa tail. It also contains
         5 potential N-glycosylation sites at aa positions 58, 345, 359, 387, 424 and contains
         all 12 cysteine residues conserved in other rhabdoviruses. The 6396 nt L ORF encodes
         the RNA-dependent RNA polymerase, which in LITRV is predicted to be a 2131 aa protein.
         The LITRV L is 51.5% identical to MOUV and contains all the highly conserved residues
         of negative strand RNA polymerases
             <a name="d43115e416">
             </a>
             [
             <a href="#B4" onclick="LoadInParent('#B4'); return false;">
              4
             </a>
             ].
            </p>
            <p style="line-height:160%">
             To examine the presence of LITRV in ticks, we screened archived cDNA generated from
         50 individual adult
             <em>
              A. americanum
             </em>
             by polymerase chain reaction. The ticks were collected in April of 2008 in the same
         location as the ticks from the current study. LITRV sequence was successfully amplified
         from one tick.
            </p>
           </div>
          </section>
          <section>
           <a name="sec3">
           </a>
           <h3>
            Discussion
           </h3>
           <div class="collapsible-content">
            <p style="line-height:160%">
             <em>
              Rhabdoviridae
             </em>
             is a large family that includes over 100 viruses classified together on the basis
         of genetic, serological or morphological analysis. The advent of high-throughput sequencing
         has resulted in genome characterization of many novel and archived rhabdoviruses, revealing
         the vast diversity of this family
             <a name="d43115e433">
             </a>
             <a name="d43115e435">
             </a>
             <a name="d43115e437">
             </a>
             <a name="d43115e439">
             </a>
             <a name="d43115e441">
             </a>
             <a name="d43115e443">
             </a>
             <a name="d43115e445">
             </a>
             [
             <a href="#B3" onclick="LoadInParent('#B3'); return false;">
              3
             </a>
             ,
             <a href="#B5" onclick="LoadInParent('#B5'); return false;">
              5
             </a>
             -
             <a href="#B10" onclick="LoadInParent('#B10'); return false;">
              10
             </a>
             ]. In this study we present the first complete genomic sequence of a tick-associated
         rhabdovirus from the Western hemisphere. Although arthropods are frequently implicated
         as hosts of rhabdoviruses, thus far relatively few of these viruses have been associated
         with ticks
             <a name="d43115e449">
             </a>
             <a name="d43115e451">
             </a>
             <a name="d43115e453">
             </a>
             [
             <a href="#B1" onclick="LoadInParent('#B1'); return false;">
              1
             </a>
             ,
             <a href="#B5" onclick="LoadInParent('#B5'); return false;">
              5
             </a>
             ,
             <a href="#B11" onclick="LoadInParent('#B11'); return false;">
              11
             </a>
             ]. We detected LITRV in
             <em>
              A. americanum,
             </em>
             a common tick species with a broad range spanning eastern and south-central states
         in the US. Along with
             <em>
              Ixodes scapularis
             </em>
             and
             <em>
               Dermacentor variabilis, A. americanum
             </em>
             is the primary human-biting hard tick in the eastern part of the US and is implicated
         in transmission of
             <em>
              Ehrlichia
             </em>
             species,
             <em>
              Borrelia lonestari
             </em>
             ,
             <em>
              Francisella tularensis
             </em>
             , and the Heartland virus
             <a name="d43115e476">
             </a>
             <a name="d43115e478">
             </a>
             [
             <a href="#B12" onclick="LoadInParent('#B12'); return false;">
              12
             </a>
             ,
             <a href="#B13" onclick="LoadInParent('#B13'); return false;">
              13
             </a>
             ].
            </p>
            <p style="line-height:160%">
             Additional studies are needed to determine if
             <em>
              A. americanum
             </em>
             is the primary host for LITRV or if other arthropods, particularly other ticks,
         have a role in its enzootic transmission. While sequences were obtained from
             <em>
              A. americanum
             </em>
             , this does not definitively implicate
             <em>
              A. americanum
             </em>
             as a biological host for LITRV. Detection of microbial nucleic acids in organisms
         other than their presumed biological hosts has been reported, and in a hematophagous
         organism these can represent nucleic acid remains of microbes acquired as part of
         a blood meal
             <a name="d43115e493">
             </a>
             <a name="d43115e495">
             </a>
             [
             <a href="#B14" onclick="LoadInParent('#B14'); return false;">
              14
             </a>
             ,
             <a href="#B15" onclick="LoadInParent('#B15'); return false;">
              15
             </a>
             ]. However, our detection of LITRV in
             <em>
              A. americanum
             </em>
             collected at the same site five years apart is consistent with a role for this tick
         species in the life cycle of this virus. Furthermore, deep sequencing analysis of
         multiple pools of
             <em>
              I. scapularis
             </em>
             and
             <em>
              D. variabilis
             </em>
             collected from the same location did not identify any LITRV-like sequences in
         these two tick species. Our initial survey of LITRV in individual
             <em>
              A. americanum
             </em>
             suggests that the prevalence of this virus may be low in tick populations within
         the examined area. Molecular surveys analyzing different life stages of
             <em>
              A. americanum
             </em>
             and other tick species in broader geographical areas are necessary to establish the
         range and prevalence of LITRV.
            </p>
            <p style="line-height:160%">
             Phylogenetically, LITRV forms a distinct clade with MOUV. MOUV was isolated from multiple
         mosquito pools in Côte d’Ivoire, suggesting that arthropods are the likely hosts of
         this clade of rhabdoviruses. LITRV and MOUV share many genetic similarities, including
         identical polyadenylation and terminal regions, as well as similar conserved 3′ and
         5′ termini sequences. The high amino acid divergence of LITRV and MOUV relative to
         other rhabdoviruses suggests that both viruses are unlikely to be associated with
         any of the currently ascribed genera and likely represent a unique taxonomic group within
             <em>
              Rhabdoviridae
             </em>
             . We anticipate that continued surveillance of arthropod vectors may uncover other
         members of this clade.
            </p>
           </div>
          </section>
          <section>
           <a name="sec4">
           </a>
           <h3>
            Conclusions
           </h3>
           <div class="collapsible-content">
            <p style="line-height:160%">
             Using high-throughput sequencing analysis of tick viromes, we discovered a novel tick-associated
         rhabdovirus. This virus, which we provisionally name Long Island tick rhabdovirus,
         may represent a novel species within the family
             <em>
              Rhabdoviridae
             </em>
             .
            </p>
           </div>
          </section>
          <section>
           <a name="sec5">
           </a>
           <h3>
            Materials and methods
           </h3>
           <div class="collapsible-content">
            <h4>
             Nucleic acid extraction
            </h4>
            <p style="line-height:160%">
             Adult ticks were collected in Heckscher State Park (Suffolk County, NY) in April,
         2013. Ticks were pooled and homogenized in 500 μl of phosphate buffered saline. Five
         tick pools were analyzed: one pool of
             <em>
              A. americanum
             </em>
             (N = 25), and two pools each (N = 30/pool) of
             <em>
              I. scapularis
             </em>
             and
             <em>
              D. variabilis
             </em>
             . The homogenate was purified through a 0.22 μm filter and treated with RNaseA and TurboDNase
         exonucleases. A 250 μl aliquot of the filtrate was added to 750 μl of NucliSens buffer and total
         nucleic acid (TNA) was extracted with the EasyMag extraction platform (Biomerieux).
         TNA was eluted in a 35 μl volume followed by DNase treatment.
            </p>
            <h4>
             Unbiased high throughput sequencing
            </h4>
            <p style="line-height:160%">
             Total nucleic acid was subjected to first- and second-strand cDNA synthesis with SuperScript
         III reverse transcriptase (Invitrogen) and Klenow Fragment (New England Biolabs),
         respectively. Ion Shear™ Plus Reagents Kit (Life Technologies) was used for double-stranded
         cDNA fragmentation. Ion Xpress™ Adapters and unique Ion Xpress™ Barcodes
         (Life Technologies) were ligated to fragmented material by using the Ion Plus Fragment
         Library kit, which also contained reagents for amplification of barcoded libraries.
         Ion OneTouch™ 200 Template Kit v2 (Life Technologies) was used to bind barcoded libraries
         to Ion Sphere™ particles (ISPS). Emulsion PCR of DNA linked ISPS was performed on
         the Ion OneTouch™ 2 instrument (Life Technologies). Ion OneTouch™ ES instrument was
         used to isolate template-positive ISPS. Ion PGM™ Sequencing 200 Kit v2 (Life Technologies)
         was used for sequencing of templated ISPS which were loaded on the Ion 316™ Chip for
         further processing on the Ion Personal Genome Machine® (PGM™) System (Life Technologies).
            </p>
            <p style="line-height:160%">
             The de-multiplexed reads were preprocessed by trimming primers and adaptors, length
         filtering, and masking of low complexity regions (WU-BLAST 2.0). The remaining reads
         were subjected to homology search using BLASTn against a host genome database. The
         host-subtracted reads were assembled using the Newbler assembler (454, v2.6). Contigs
         and singletons were then subjected to a homology search against the entire GenBank
         database using BLASTn and the viral GenBank database using BLASTx. Contigs and singletons
         with similarity to viral sequences from the BLASTx analysis were again subjected to
         a homology search against the entire GenBank database to correct for biased e-values.
         For potential viral candidates, close relatives were used to identify low homology
         regions in the genome from BLASTx. Overall, out of approximately 190,000 sequence
         reads obtained by HTS, 96 reads with a mean length of 182 nt were unique to LITRV.
         Gaps were filled in by PCR using primers specific to the assembled sequence. The final
         sequence was verified by classical dideoxy sequencing using primers designed to generate
         overlapping PCR products. Genomic termini were obtained using 5′ and 3′ RACE kits (Clontech).
         Genome assembly was performed with Geneious v 6.1. All phylogenetic trees were constructed
         with Mega 5.2 software.
            </p>
            <h4>
             Tick screening
            </h4>
            <p style="line-height:160%">
             To assess the presence of Long Island tick rhabdovirus in ticks, we used cDNA generated
         from adult
             <em>
              A. americanum
             </em>
             ticks collected in April, 2008 at the same location. Tick cDNA was screened by PCR
         with primers 5′-GGGACGATGCTCTAGTCACG-3′ (fwd), and 5′-TTTGTCTGTGAGGTCGGACG-3′ (rev)
         targeting a 299 bp fragment of the N gene. PCR products were assessed by gel electrophoresis
         and sequenced to confirm that they represent Long Island tick rhabdovirus.
            </p>
            <p style="line-height:160%">
             The complete genome sequence of LITRV was deposited in GenBank under accession number
         KJ396935.
            </p>
           </div>
          </section>
          <section>
           <a name="sec6">
           </a>
           <h3>
            Competing interests
           </h3>
           <div class="collapsible-content">
            <p style="line-height:160%">
             The authors declare that they have no competing interests.
            </p>
           </div>
          </section>
          <section>
           <a name="sec7">
           </a>
           <h3>
            Authors’ contributions
           </h3>
           <div class="collapsible-content">
            <p style="line-height:160%">
             RT and WIL conceived the study, analyzed data and wrote the manuscript. SS and MSL
         performed all assays. KJ performed all bioinformatics analysis. All authors read and
         approved the final manuscript.
            </p>
           </div>
          </section>
          <section>
           <a name="ack">
           </a>
           <h3>
            Acknowledgments
           </h3>
           <div class="collapsible-content">
            <p style="line-height:160%">
             This work was supported by grants from the National Institutes of Health AI057158
         (Northeast Biodefense Center-Lipkin), USAID PREDICT and the Defense Threat Reduction
         Agency.
            </p>
           </div>
          </section>
          <section>
           <a name="refs">
           </a>
           <h3>
            References
           </h3>
           <div class="collapsible-content" id="article-references">
            <ol id="references">
             <li id="B1">
              <p>
               <a name="B1">
               </a>
               Kuzmin IV,  Novella IS,  Dietzgen RG,  Padhi A,  Rupprecht CE:
               <strong>
                The rhabdoviruses: biodiversity, phylogenetics, and evolution.
               </strong>
              </p>
              <p>
               <em>
                Infect Genet Evol
               </em>
               2009,
               <strong>
                9
               </strong>
               (4)
               <strong>
                :
               </strong>
               541-553.
               <a href="/pubmed/19460320" target="_blank">
                PubMed Abstract
               </a>
               |
               <a href="http://www.ncbi.nlm.nih.gov/entrez/eutils/elink.fcgi?dbfrom=pubmed&amp;cmd=prlinks&amp;retmode=ref&amp;id=19460320" target="_blank">
                Publisher Full Text
               </a>
               <a href="/sfx_links?ui=1743-422X-11-26&amp;bibl=B1" onclick="popup('/sfx_links?ui=1743-422X-11-26&amp;bibl=B1','SFXMenu','460','420'); return false;">
                <img align="absmiddle" alt="OpenURL" src="/sfx_links?getImage"/>
               </a>
              </p>
              <p class="totext">
               <script type="text/javascript">
                totext()
               </script>
              </p>
             </li>
             <li id="B2">
              <p>
               <a name="B2">
               </a>
               King AMQ,  Adams MJ,  Carstens EB,  Lefkowitz EJ:
               <em>
                Virus Taxonomy; Ninth Report of the International Committee on Taxonomy of Viruses.
               </em>
               Amsterdam, The Netherlands: Elsevier Academic Press;  2012.
              </p>
              <p class="totext">
               <script type="text/javascript">
                totext()
               </script>
              </p>
             </li>
             <li id="B3">
              <p>
               <a name="B3">
               </a>
               Quan PL,  Junglen S,  Tashmukhamedova A,  Conlan S,  Hutchinson SK,  Kurth A,  Ellerbrok H,  Egholm M,  Briese T,  Leendertz FH,  Lipkin WI:
               <strong>
                Moussa virus: a new member of the Rhabdoviridae family isolated from Culex decens
                  mosquitoes in Cote d’Ivoire.
               </strong>
              </p>
              <p>
               <em>
                Virus Res
               </em>
               2010,
               <strong>
                147
               </strong>
               (1)
               <strong>
                :
               </strong>
               17-24.
               <a href="/pubmed/19804801" target="_blank">
                PubMed Abstract
               </a>
               |
               <a href="http://www.ncbi.nlm.nih.gov/entrez/eutils/elink.fcgi?dbfrom=pubmed&amp;cmd=prlinks&amp;retmode=ref&amp;id=19804801" target="_blank">
                Publisher Full Text
               </a>
               |
               <a href="http://www.pubmedcentral.nih.gov/articlerender.fcgi?tool=pubmed&amp;pubmedid=19804801" target="_blank">
                PubMed Central Full Text
               </a>
               <a href="/sfx_links?ui=1743-422X-11-26&amp;bibl=B3" onclick="popup('/sfx_links?ui=1743-422X-11-26&amp;bibl=B3','SFXMenu','460','420'); return false;">
                <img align="absmiddle" alt="OpenURL" src="/sfx_links?getImage"/>
               </a>
              </p>
              <p class="totext">
               <script type="text/javascript">
                totext()
               </script>
              </p>
             </li>
             <li id="B4">
              <p>
               <a name="B4">
               </a>
               Poch O,  Blumberg BM,  Bougueleret L,  Tordo N:
               <strong>
                Sequence comparison of five polymerases (L proteins) of unsegmented negative-strand
                  RNA viruses: theoretical assignment of functional domains.
               </strong>
              </p>
              <p>
               <em>
                J Gen Virol
               </em>
               1990,
               <strong>
                71
               </strong>
               (Pt 5)
               <strong>
                :
               </strong>
               1153-1162.
               <a href="/pubmed/2161049" target="_blank">
                PubMed Abstract
               </a>
               |
               <a href="http://www.ncbi.nlm.nih.gov/entrez/eutils/elink.fcgi?dbfrom=pubmed&amp;cmd=prlinks&amp;retmode=ref&amp;id=2161049" target="_blank">
                Publisher Full Text
               </a>
               <a href="/sfx_links?ui=1743-422X-11-26&amp;bibl=B4" onclick="popup('/sfx_links?ui=1743-422X-11-26&amp;bibl=B4','SFXMenu','460','420'); return false;">
                <img align="absmiddle" alt="OpenURL" src="/sfx_links?getImage"/>
               </a>
              </p>
              <p class="totext">
               <script type="text/javascript">
                totext()
               </script>
              </p>
             </li>
             <li id="B5">
              <p>
               <a name="B5">
               </a>
               Ghedin E,  Rogers MB,  Widen SG,  Guzman H,  Travassos da Rosa AP,  Wood TG,  Fitch A,  Popov V,  Holmes EC,  Walker PJ,  Vasilakis N,  Tesh RB:
               <strong>
                Kolente virus, a rhabdovirus species isolated from ticks and bats in the Republic
                  of Guinea.
               </strong>
              </p>
              <p>
               <em>
                J Gen Virol
               </em>
               2013,
               <strong>
                94
               </strong>
               (Pt 12)
               <strong>
                :
               </strong>
               2609-2615.
               <a href="/pubmed/24062532" target="_blank">
                PubMed Abstract
               </a>
               |
               <a href="http://www.ncbi.nlm.nih.gov/entrez/eutils/elink.fcgi?dbfrom=pubmed&amp;cmd=prlinks&amp;retmode=ref&amp;id=24062532" target="_blank">
                Publisher Full Text
               </a>
               <a href="/sfx_links?ui=1743-422X-11-26&amp;bibl=B5" onclick="popup('/sfx_links?ui=1743-422X-11-26&amp;bibl=B5','SFXMenu','460','420'); return false;">
                <img align="absmiddle" alt="OpenURL" src="/sfx_links?getImage"/>
               </a>
              </p>
              <p class="totext">
               <script type="text/javascript">
                totext()
               </script>
              </p>
             </li>
             <li id="B6">
              <p>
               <a name="B6">
               </a>
               Kading RC,  Gilbert AT,  Mossel EC,  Crabtree MB,  Kuzmin IV,  Niezgoda M,  Agwanda B,  Markotter W,  Weil MR,  Montgomery JM,  Rupprecht CE,  Miller BR:
               <strong>
                Isolation and molecular characterization of Fikirini rhabdovirus, a novel virus from
                  a Kenyan bat.
               </strong>
              </p>
              <p>
               <em>
                J Gen Virol
               </em>
               2013,
               <strong>
                94
               </strong>
               (Pt 11)
               <strong>
                :
               </strong>
               2393-2398.
               <a href="/pubmed/23939976" target="_blank">
                PubMed Abstract
               </a>
               |
               <a href="http://www.ncbi.nlm.nih.gov/entrez/eutils/elink.fcgi?dbfrom=pubmed&amp;cmd=prlinks&amp;retmode=ref&amp;id=23939976" target="_blank">
                Publisher Full Text
               </a>
               <a href="/sfx_links?ui=1743-422X-11-26&amp;bibl=B6" onclick="popup('/sfx_links?ui=1743-422X-11-26&amp;bibl=B6','SFXMenu','460','420'); return false;">
                <img align="absmiddle" alt="OpenURL" src="/sfx_links?getImage"/>
               </a>
              </p>
              <p class="totext">
               <script type="text/javascript">
                totext()
               </script>
              </p>
             </li>
             <li id="B7">
              <p>
               <a name="B7">
               </a>
               Palacios G,  Forrester NL,  Savji N,  Travassos da Rosa ,  Guzman H,  Detoy K,  Popov VL,  Walker PJ,  Lipkin WI,  Vasilakis N,  Tesh RB:
               <strong>
                Characterization of Farmington virus, a novel virus from birds that is distantly related
                  to members of the family Rhabdoviridae.
               </strong>
              </p>
              <p>
               <em>
                Virol J
               </em>
               2013,
               <strong>
                10
               </strong>
               <strong>
                :
               </strong>
               219.
               <a href="/pubmed/23816310" target="_blank">
                PubMed Abstract
               </a>
               |
               <a href="http://dx.doi.org/10.1186/1743-422X-10-219" target="_blank">
                BioMed Central Full Text
               </a>
               |
               <a href="http://www.pubmedcentral.nih.gov/articlerender.fcgi?tool=pubmed&amp;pubmedid=23816310" target="_blank">
                PubMed Central Full Text
               </a>
               <a href="/sfx_links?ui=1743-422X-11-26&amp;bibl=B7" onclick="popup('/sfx_links?ui=1743-422X-11-26&amp;bibl=B7','SFXMenu','460','420'); return false;">
                <img align="absmiddle" alt="OpenURL" src="/sfx_links?getImage"/>
               </a>
              </p>
              <p class="totext">
               <script type="text/javascript">
                totext()
               </script>
              </p>
             </li>
             <li id="B8">
              <p>
               <a name="B8">
               </a>
               Vasilakis N,  Widen S,  Mayer SV,  Seymour R,  Wood TG,  Popov V,  Guzman H,  Travassos da Rosa ,  Ghedin E,  Holmes EC,  Walker PJ,  Tesh RB:
               <strong>
                Niakha virus: a novel member of the family Rhabdoviridae isolated from phlebotomine
                  sandflies in Senegal.
               </strong>
              </p>
              <p>
               <em>
                Virology
               </em>
               2013,
               <strong>
                444
               </strong>
               (1–2)
               <strong>
                :
               </strong>
               80-89.
               <a href="/pubmed/23773405" target="_blank">
                PubMed Abstract
               </a>
               |
               <a href="http://www.ncbi.nlm.nih.gov/entrez/eutils/elink.fcgi?dbfrom=pubmed&amp;cmd=prlinks&amp;retmode=ref&amp;id=23773405" target="_blank">
                Publisher Full Text
               </a>
               <a href="/sfx_links?ui=1743-422X-11-26&amp;bibl=B8" onclick="popup('/sfx_links?ui=1743-422X-11-26&amp;bibl=B8','SFXMenu','460','420'); return false;">
                <img align="absmiddle" alt="OpenURL" src="/sfx_links?getImage"/>
               </a>
              </p>
              <p class="totext">
               <script type="text/javascript">
                totext()
               </script>
              </p>
             </li>
             <li id="B9">
              <p>
               <a name="B9">
               </a>
               Quan PL,  Williams DT,  Johansen CA,  Jain K,  Petrosov A,  Diviney SM,  Tashmukhamedova A,  Hutchinson SK,  Tesh RB,  Mackenzie JS,  Briese T,  Lipkin WI:
               <strong>
                Genetic characterization of K13965, a strain of Oak Vale virus from Western Australia.
               </strong>
              </p>
              <p>
               <em>
                Virus Res
               </em>
               2011,
               <strong>
                160
               </strong>
               (1–2)
               <strong>
                :
               </strong>
               206-213.
               <a href="/pubmed/21740935" target="_blank">
                PubMed Abstract
               </a>
               |
               <a href="http://www.ncbi.nlm.nih.gov/entrez/eutils/elink.fcgi?dbfrom=pubmed&amp;cmd=prlinks&amp;retmode=ref&amp;id=21740935" target="_blank">
                Publisher Full Text
               </a>
               |
               <a href="http://www.pubmedcentral.nih.gov/articlerender.fcgi?tool=pubmed&amp;pubmedid=21740935" target="_blank">
                PubMed Central Full Text
               </a>
               <a href="/sfx_links?ui=1743-422X-11-26&amp;bibl=B9" onclick="popup('/sfx_links?ui=1743-422X-11-26&amp;bibl=B9','SFXMenu','460','420'); return false;">
                <img align="absmiddle" alt="OpenURL" src="/sfx_links?getImage"/>
               </a>
              </p>
              <p class="totext">
               <script type="text/javascript">
                totext()
               </script>
              </p>
             </li>
             <li id="B10">
              <p>
               <a name="B10">
               </a>
               Stone DM,  Kerr RC,  Hughes M,  Radford AD,  Darby AC:
               <strong>
                Characterisation of the genomes of four putative vesiculoviruses: tench rhabdovirus,
                  grass carp rhabdovirus, perch rhabdovirus and eel rhabdovirus European X.
               </strong>
              </p>
              <p>
               <em>
                Arch Virol
               </em>
               2013,
               <strong>
                158
               </strong>
               (11)
               <strong>
                :
               </strong>
               2371-2377.
               <a href="/pubmed/23719670" target="_blank">
                PubMed Abstract
               </a>
               |
               <a href="http://www.ncbi.nlm.nih.gov/entrez/eutils/elink.fcgi?dbfrom=pubmed&amp;cmd=prlinks&amp;retmode=ref&amp;id=23719670" target="_blank">
                Publisher Full Text
               </a>
               <a href="/sfx_links?ui=1743-422X-11-26&amp;bibl=B10" onclick="popup('/sfx_links?ui=1743-422X-11-26&amp;bibl=B10','SFXMenu','460','420'); return false;">
                <img align="absmiddle" alt="OpenURL" src="/sfx_links?getImage"/>
               </a>
              </p>
              <p class="totext">
               <script type="text/javascript">
                totext()
               </script>
              </p>
             </li>
             <li id="B11">
              <p>
               <a name="B11">
               </a>
               Labuda M,  Nuttall PA:
               <strong>
                Tick-borne viruses.
               </strong>
              </p>
              <p>
               <em>
                Parasitology
               </em>
               2004,
               <strong>
                129
               </strong>
               (Suppl)
               <strong>
                :
               </strong>
               S221-S245.
               <a href="/pubmed/15938513" target="_blank">
                PubMed Abstract
               </a>
               <a href="/sfx_links?ui=1743-422X-11-26&amp;bibl=B11" onclick="popup('/sfx_links?ui=1743-422X-11-26&amp;bibl=B11','SFXMenu','460','420'); return false;">
                <img align="absmiddle" alt="OpenURL" src="/sfx_links?getImage"/>
               </a>
              </p>
              <p class="totext">
               <script type="text/javascript">
                totext()
               </script>
              </p>
             </li>
             <li id="B12">
              <p>
               <a name="B12">
               </a>
               Goddard J,  Varela-Stokes AS:
               <strong>
                Role of the lone star tick, Amblyomma americanum (L.), in human and animal diseases.
               </strong>
              </p>
              <p>
               <em>
                Vet Parasitol
               </em>
               2009,
               <strong>
                160
               </strong>
               (1–2)
               <strong>
                :
               </strong>
               1-12.
               <a href="/pubmed/19054615" target="_blank">
                PubMed Abstract
               </a>
               <a href="/sfx_links?ui=1743-422X-11-26&amp;bibl=B12" onclick="popup('/sfx_links?ui=1743-422X-11-26&amp;bibl=B12','SFXMenu','460','420'); return false;">
                <img align="absmiddle" alt="OpenURL" src="/sfx_links?getImage"/>
               </a>
              </p>
              <p class="totext">
               <script type="text/javascript">
                totext()
               </script>
              </p>
             </li>
             <li id="B13">
              <p>
               <a name="B13">
               </a>
               Savage HM,  Godsey MS,  Lambert A,  Panella NA,  Burkhalter KL,  Harmon JR,  Lash RR,  Ashley DC,  Nicholson WL:
               <strong>
                First detection of heartland virus (Bunyaviridae: Phlebovirus) from field collected
                  arthropods.
               </strong>
              </p>
              <p>
               <em>
                Am J Trop Med Hyg
               </em>
               2013,
               <strong>
                89
               </strong>
               (3)
               <strong>
                :
               </strong>
               445-452.
               <a href="/pubmed/23878186" target="_blank">
                PubMed Abstract
               </a>
               |
               <a href="http://www.ncbi.nlm.nih.gov/entrez/eutils/elink.fcgi?dbfrom=pubmed&amp;cmd=prlinks&amp;retmode=ref&amp;id=23878186" target="_blank">
                Publisher Full Text
               </a>
               |
               <a href="http://www.pubmedcentral.nih.gov/articlerender.fcgi?tool=pubmed&amp;pubmedid=23878186" target="_blank">
                PubMed Central Full Text
               </a>
               <a href="/sfx_links?ui=1743-422X-11-26&amp;bibl=B13" onclick="popup('/sfx_links?ui=1743-422X-11-26&amp;bibl=B13','SFXMenu','460','420'); return false;">
                <img align="absmiddle" alt="OpenURL" src="/sfx_links?getImage"/>
               </a>
              </p>
              <p class="totext">
               <script type="text/javascript">
                totext()
               </script>
              </p>
             </li>
             <li id="B14">
              <p>
               <a name="B14">
               </a>
               Telford SR,  Wormser GP:
               <strong>
                Bartonella spp. transmission by ticks not established.
               </strong>
              </p>
              <p>
               <em>
                Emerg Infect Dis
               </em>
               2010,
               <strong>
                16
               </strong>
               (3)
               <strong>
                :
               </strong>
               379-384.
               <a href="/pubmed/20202410" target="_blank">
                PubMed Abstract
               </a>
               |
               <a href="http://www.ncbi.nlm.nih.gov/entrez/eutils/elink.fcgi?dbfrom=pubmed&amp;cmd=prlinks&amp;retmode=ref&amp;id=20202410" target="_blank">
                Publisher Full Text
               </a>
               |
               <a href="http://www.pubmedcentral.nih.gov/articlerender.fcgi?tool=pubmed&amp;pubmedid=20202410" target="_blank">
                PubMed Central Full Text
               </a>
               <a href="/sfx_links?ui=1743-422X-11-26&amp;bibl=B14" onclick="popup('/sfx_links?ui=1743-422X-11-26&amp;bibl=B14','SFXMenu','460','420'); return false;">
                <img align="absmiddle" alt="OpenURL" src="/sfx_links?getImage"/>
               </a>
              </p>
              <p class="totext">
               <script type="text/javascript">
                totext()
               </script>
              </p>
             </li>
             <li id="B15">
              <p>
               <a name="B15">
               </a>
               Whitehouse CA:
               <strong>
                Crimean-Congo hemorrhagic fever.
               </strong>
              </p>
              <p>
               <em>
                Antiviral Res
               </em>
               2004,
               <strong>
                64
               </strong>
               (3)
               <strong>
                :
               </strong>
               145-160.
               <a href="/pubmed/15550268" target="_blank">
                PubMed Abstract
               </a>
               |
               <a href="http://www.ncbi.nlm.nih.gov/entrez/eutils/elink.fcgi?dbfrom=pubmed&amp;cmd=prlinks&amp;retmode=ref&amp;id=15550268" target="_blank">
                Publisher Full Text
               </a>
               <a href="/sfx_links?ui=1743-422X-11-26&amp;bibl=B15" onclick="popup('/sfx_links?ui=1743-422X-11-26&amp;bibl=B15','SFXMenu','460','420'); return false;">
                <img align="absmiddle" alt="OpenURL" src="/sfx_links?getImage"/>
               </a>
              </p>
              <p class="totext">
               <script type="text/javascript">
                totext()
               </script>
              </p>
             </li>
            </ol>
           </div>
          </section>
          <br class="clearall"/>
         </div>
        </section>
       </div>
      </div>
     </div>
    </div>
   </div>
   <dl class="google-ad wide ">
    <dt class="hide">
     <a class="banner-ad" href="http://www.biomedcentral.com/advertisers/digital_advertising">
     </a>
    </dt>
    <dd>
     <!-- OAS AD 'Bottom' begin -->
     <script type="text/javascript">
      OAS_AD('Bottom');
     </script>
     <!-- OAS AD 'Bottom' end -->
    </dd>
   </dl>
   <noscript>
    <dl class="google-ad noscript">
     <dt class="hide">
      <a href="http://www.biomedcentral.com/advertisers/digital_advertising">
       Advertisement
      </a>
     </dt>
     <dd>
      <a href="http://oas.biomedcentral.com/RealMedia/ads/click_nx.ads/virologyj.com/article/10.1186/1743/422x/11/26/13887356334@Bottom?">
       <img alt="advert" src="http://oas.biomedcentral.com/RealMedia/ads/adstream_nx.ads/virologyj.com/article/10.1186/1743/422x/11/26/15543041548@Bottom?"/>
      </a>
     </dd>
    </dl>
   </noscript>
   <hr class="hide"/>
   <script>
    window.bmcIsMobile = "classic";
   </script>
   <div id="footer">
    <span class="views">
     <a href="?fmt_view=mobile" id="mobile-view">
      Mobile view
     </a>
     |
     <strong>
      Desktop view
     </strong>
    </span>
    <div class="content">
     <div class="rounded ">
      <div class="wrap-inner ">
       <ul class="desktop">
        <li class="noborder">
         <a href="http://www.biomedcentral.com/about/tandc">
          Terms and Conditions
         </a>
        </li>
        <li>
         <a href="http://www.biomedcentral.com/about/privacy">
          Privacy statement
         </a>
        </li>
        <li>
         <a href="http://www.biomedcentral.com/presscenter">
          Press
         </a>
        </li>
        <li>
         <a href="http://www.biomedcentral.com/advertisers">
          Information for advertisers
         </a>
        </li>
        <li>
         <a href="http://www.biomedcentral.com/about/bmcjobs">
          Jobs at BMC
         </a>
        </li>
        <li>
         <a href="/support">
          Support
         </a>
        </li>
        <li>
         <a href="/about/contact">
          Contact us
         </a>
        </li>
       </ul>
       <p id="copyright">
        © 2014 
	BioMed Central Ltd unless otherwise stated. Part of Springer Science+Business Media.
       </p>
      </div>
     </div>
    </div>
   </div>
   <div class="springer">
    <a href="http://www.springer.com/" target="blank">
     <img src="http://www.biomedcentral.com/images/Springer_Branding_footer_logo.png" style="float:right; margin:0;"/>
    </a>
   </div>
   <script>
    // getElementsByClassName Polyfill
var getElementsByClassName=function(e,t,n){if(document.getElementsByClassName){getElementsByClassName=function(e,t,n){n=n||document;var r=n.getElementsByClassName(e),i=t?new RegExp("\\b"+t+"\\b","i"):null,s=[],o;for(var u=0,a=r.length;u<a;u+=1){o=r[u];if(!i||i.test(o.nodeName)){s.push(o)}}return s}}else if(document.evaluate){getElementsByClassName=function(e,t,n){t=t||"*";n=n||document;var r=e.split(" "),i="",s="http://www.w3.org/1999/xhtml",o=document.documentElement.namespaceURI===s?s:null,u=[],a,f;for(var l=0,c=r.length;l<c;l+=1){i+="[contains(concat(' ', @class, ' '), ' "+r[l]+" ')]"}try{a=document.evaluate(".//"+t+i,n,o,0,null)}catch(h){a=document.evaluate(".//"+t+i,n,null,0,null)}while(f=a.iterateNext()){u.push(f)}return u}}else{getElementsByClassName=function(e,t,n){t=t||"*";n=n||document;var r=e.split(" "),i=[],s=t==="*"&&n.all?n.all:n.getElementsByTagName(t),o,u=[],a;for(var f=0,l=r.length;f<l;f+=1){i.push(new RegExp("(^|\\s)"+r[f]+"(\\s|$)"))}for(var c=0,h=s.length;c<h;c+=1){o=s[c];a=false;for(var p=0,d=i.length;p<d;p+=1){a=i[p].test(o.className);if(!a){break}}if(a){u.push(o)}}return u}}return getElementsByClassName(e,t,n)}
   </script>
   <script type="text/javascript">
    if (typeof jQuery == 'undefined') { document.write(decodeURIComponent("%3Cscript src='/javascript/plugins/jquery.min.js' type='text/javascript'%3E%3C/script%3E")); }


document.documentElement.id = "js";

var site = {
			abbreviation: "",
	        type: "BMC_NICHE",
        portalId: "9001",
                section: "browse",
                page: "fulltext",
    		
									homePortal: "biomedcentral.com",
		
			siteShowAds: "true",
				showAds: "true",
				url: "virologyj.com",
			portal: false,
	        adpage_url: "virologyj.com/article/10.1186/1743/422x/11/26",
        id: "10067",
	name: "Virology Journal",
	plugins: {
		scrollable: (getElementsByClassName("scrollable").length > 0),
				flowplayer: ((document.getElementById("flowplayer") != null) || (getElementsByClassName("myPlayer").length > 0) || (document.getElementById("videoarticle") != null) || (getElementsByClassName("flowplayer").length > 0) || (document.getElementById("audio") != null) || (document.getElementById("elifeplayer") != null)),
		lightbox: ((document.getElementById("image-highlight") != null) || (getElementsByClassName("lightbox").length > 0)),
		thickbox: (getElementsByClassName("thickbox").length > 0),
		altmetric: ((document.getElementById("altmetric") != null) || (getElementsByClassName("altmetric").length > 0))
	}
	
};



if(typeof quoteSearchUri == "undefined") { quoteSearchUri=null; }
   </script>
   <script src="http://www.bmcimg.com/javascript/journals/behaviours-0.js" type="text/javascript">
   </script>
   <script>
    if (site.plugins.scrollable == true) { 
		document.write(decodeURIComponent("%3Cscript src='/javascript/plugins/jquery.scrollable-1.0.1.min.js' %3E%3C/script%3E")); 
		document.write(decodeURIComponent("%3Cscript src='/javascript/plugins/jquery.tools.min.js'%3E%3C/script%3E"));
	}

	if (site.plugins.flowplayer == true) { 
		document.write(decodeURIComponent("%3Cscript src='/javascript/plugins/flowplayer-3.2.6.min.js'%3E%3C/script%3E"));
	}

	if (site.plugins.lightbox == true) { 
		document.write(decodeURIComponent("%3Cscript src='/javascript/plugins/browse_imagehighlight.js'%3E%3C/script%3E"));
		document.write(decodeURIComponent("%3Cscript src='/javascript/plugins/jquery.colorbox.js' %3E%3C/script%3E"));
	}

	if (site.plugins.thickbox == true) { 
		document.write(decodeURIComponent("%3Cscript src='/javascript/plugins/thickbox-compressed.js' %3E%3C/script%3E"));
		document.write(decodeURIComponent("%3Cscript src='/javascript/plugins/thickbox-flashhidden-patch.js' %3E%3C/script%3E"));
	}
   </script>
   <script src="http://www.bmcimg.com/javascript/plugins/jquery.fancybox-1.3.1.pack-0.js" type="text/javascript">
   </script>
   <script src="http://www.bmcimg.com/javascript/plugins/jquery.debug-0.js" type="text/javascript">
   </script>
   <script src="http://www.bmcimg.com/javascript/email_preferences/email_preferences-0.js" type="text/javascript">
   </script>
   <script src="http://www.bmcimg.com/javascript/articles/articles-0.js" type="text/javascript">
   </script>
   <div id="fb-root">
   </div>
   <script src="/javascript/plugins/jquery.tagcloud.min-0.js" type="text/javascript">
   </script>
   <!--[if lte IE 6]>
            <script type="text/javascript" src="/javascript/plugins/DD_belatedPNG_0.0.8a-0.js"></script>
    <script type="text/javascript">
        DD_belatedPNG.fix('.png_bg, ul.primary-nav li a, ul.primary-nav li a span, ul.secondary-nav li a, ul.secondary-nav li a span, ul li i, .plus-button, .facebook, .twitter, .fancy-bg, #fancybox-close');
    </script>
	                <link rel="stylesheet" type="text/css" href="http://www.bmcimg.com/css/hacks/ie6-0.css"/>
    <![endif]-->
   <!-- Claudia Whitcombe wanted this comment here -->
   <div class="hide">
    <dl class="google-ad" id="x96banner">
     <dt class="hide" style="display: block;">
      <a class="skyscraper-ad" href="http://www.biomedcentral.com/advertisers/digital_advertising">
       Advertisement
      </a>
     </dt>
     <dd>
      <!-- OAS AD 'x96' begin -->
      <script type="text/javascript">
       OAS_AD('x96');
      </script>
     </dd>
    </dl>
   </div>
  </div>
 </body>
</html>

I want the figure caption text and, perhaps separately, the URLs of the figure images:

------------ Rendered sample HTML below -----------------

Figure 1. Organization of the LITRV genome. A) Schematic representation of LITRV ORFs. B) LITRV coding regions and corresponding transcription regulatory sequences.
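
A minimal sketch of that goal, assuming the captions live in `div class="fig"` blocks (which is how this page turns out to mark them up). The markup is a trimmed copy of the page's first figure block, `figure_records` is a hypothetical helper name, and the code is written for Python 3 with the built-in `html.parser`, not the Python 2 used in the cells here.

```python
from bs4 import BeautifulSoup

# Trimmed copy of one figure block from the article page.
SAMPLE = '''<div class="figs">
<div class="fig"><p><a href="http://www.virologyj.com/content/11/1/26/figure/F1"><img alt="thumbnail" class="thumbnail" src="/content/figures/1743-422X-11-26-1.gif"/><strong>Figure 1.</strong></a> <strong>Organization of the LITRV genome. A)</strong> Schematic representation of LITRV ORFs.
      </p></div>
</div>'''


def figure_records(html):
    """Return (caption, figure_page_url, thumbnail_src) for every div.fig."""
    soup = BeautifulSoup(html, "html.parser")
    records = []
    for fig in soup.find_all("div", class_="fig"):
        # Collapse newlines and indentation inside the caption to single spaces.
        caption = " ".join(fig.get_text().split())
        link = fig.find("a", href=True)
        img = fig.find("img", src=True)
        records.append((caption,
                        link["href"] if link else None,
                        img["src"] if img else None))
    return records


records = figure_records(SAMPLE)
for record in records:
    print(record)
```

Keeping the three values together per figure means the caption and its image cannot drift out of step, which can happen when they are written to separate files.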


In [106]:
strong = soup.find_all('strong')

In [107]:
for tag in strong:
    print(tag)


<strong>Virology Journal</strong>
<strong>Full text</strong>
<strong>Rafal Tokarz</strong>
<strong>Stephen Sameroff</strong>
<strong>Maria Sanchez Leon</strong>
<strong>Komal Jain</strong>
<strong>W Ian Lipkin</strong>
<strong>11</strong>
<strong>Figure 1.</strong>
<strong>Organization of the LITRV genome. A)</strong>
<strong>B)</strong>
<strong>Table 1.</strong>
<strong>Comparison of LITRV and MOUV ORFs</strong>
<strong>Figure 2.</strong>
<strong>Phylogeny of LITRV.</strong>
<strong> The rhabdoviruses: biodiversity, phylogenetics, and evolution. </strong>
<strong>9</strong>
<strong>:</strong>
<strong> Moussa virus: a new member of the Rhabdoviridae family isolated from Culex decens
                  mosquitoes in Cote d’Ivoire. </strong>
<strong>147</strong>
<strong>:</strong>
<strong> Sequence comparison of five polymerases (L proteins) of unsegmented negative-strand
                  RNA viruses: theoretical assignment of functional domains. </strong>
<strong>71</strong>
<strong>:</strong>
<strong> Kolente virus, a rhabdovirus species isolated from ticks and bats in the Republic
                  of Guinea. </strong>
<strong>94</strong>
<strong>:</strong>
<strong> Isolation and molecular characterization of Fikirini rhabdovirus, a novel virus from
                  a Kenyan bat. </strong>
<strong>94</strong>
<strong>:</strong>
<strong> Characterization of Farmington virus, a novel virus from birds that is distantly related
                  to members of the family Rhabdoviridae. </strong>
<strong>10</strong>
<strong>:</strong>
<strong> Niakha virus: a novel member of the family Rhabdoviridae isolated from phlebotomine
                  sandflies in Senegal. </strong>
<strong>444</strong>
<strong>:</strong>
<strong> Genetic characterization of K13965, a strain of Oak Vale virus from Western Australia. </strong>
<strong>160</strong>
<strong>:</strong>
<strong> Characterisation of the genomes of four putative vesiculoviruses: tench rhabdovirus,
                  grass carp rhabdovirus, perch rhabdovirus and eel rhabdovirus European X. </strong>
<strong>158</strong>
<strong>:</strong>
<strong> Tick-borne viruses. </strong>
<strong>129</strong>
<strong>:</strong>
<strong> Role of the lone star tick, Amblyomma americanum (L.), in human and animal diseases. </strong>
<strong>160</strong>
<strong>:</strong>
<strong> First detection of heartland virus (Bunyaviridae: Phlebovirus) from field collected
                  arthropods. </strong>
<strong>89</strong>
<strong>:</strong>
<strong> Bartonella spp. transmission by ticks not established. </strong>
<strong>16</strong>
<strong>:</strong>
<strong> Crimean-Congo hemorrhagic fever. </strong>
<strong>64</strong>
<strong>:</strong>
<strong>Desktop view</strong>

In [108]:
em = soup.find_all('em')

In [109]:
for tag in em:
    print(tag)


<em>Amblyomma americanum</em>
<em>Virology Journal</em>
<em>Amblyomma americanum</em>
<em>Rhabdoviridae</em>
<em>Amblyomma americanum</em>
<em>Rhabdoviridae</em>
<em>Rhabdoviridae</em>
<em>Cytorhabdovirus, Ephemerovirus, Lyssavirus, Novirhabdovirus, Nucleorhabdovirus, Perhabdovirus,
            Sigmavirus, Tibrovirus, Vesiculovirus</em>
<em>Amblyomma americanum</em>
<em>Rhabdoviridae</em>
<em>Rhabdoviridae</em>
<em>Rhabdoviridae</em>
<em>Rhabdoviridae</em>
<em>Cytorhabdovirus</em>
<em>Novirhabdovirus</em>
<em>Nucleorhabdovirus</em>
<em>Vesiculovirus</em>
<em>A. americanum</em>
<em>Rhabdoviridae</em>
<em>A. americanum,</em>
<em>Ixodes scapularis</em>
<em>Dermacentor variablis, A. americanum</em>
<em>Ehrlichia</em>
<em>Borrelia lonestari</em>
<em>Francisella tularensis</em>
<em>A. americanum</em>
<em>A. americanum</em>
<em>A. americanum</em>
<em>A. americanum</em>
<em>I. scapularis</em>
<em>D. variabilis</em>
<em>A. americanum</em>
<em>A. americanum</em>
<em>Rhabdoviridae</em>
<em>Rhabdoviridae</em>
<em>A. americanum</em>
<em>I. scapularis</em>
<em>D. variablis</em>
<em>A. americanum</em>
<em>Infect Genet Evol</em>
<em>Virus Taxonomy; Ninth Report of the International Committee on Taxonomy of Viruses. </em>
<em>Virus Res</em>
<em>J Gen Virol</em>
<em>J Gen Virol</em>
<em>J Gen Virol</em>
<em>Virol J</em>
<em>Virology</em>
<em>Virus Res</em>
<em>Arch Virol</em>
<em>Parasitology</em>
<em>Vet Parasitol</em>
<em>Am J Trop Med Hyg</em>
<em>Emerg Infect Dis</em>
<em>Antiviral Res</em>
<em>Virology Journal</em>

In [110]:
soup.find_all('div', class_="figs")


Out[110]:
[<div class="figs">
<div class="fig"><p><a href="http://www.virologyj.com/content/11/1/26/figure/F1" onclick="popup('http://www.virologyj.com/content/11/1/26/figure/F1','F1',800,470); return false;"><img align="top" alt="thumbnail" class="thumbnail" src="/content/figures/1743-422X-11-26-1.gif"/><strong>Figure 1.</strong></a> <strong>Organization of the LITRV genome. A)</strong> Schematic representation of LITRV ORFs. <strong>B)</strong> LITRV coding regions and corresponding transcription regulatory sequences.
      </p></div>
<div class="table"><p><a href="http://www.virologyj.com/content/11/1/26/table/T1" onclick="popup('http://www.virologyj.com/content/11/1/26/table/T1','T1',800,470); return false;"><strong>Table 1.</strong></a> <strong>Comparison of LITRV and MOUV ORFs</strong></p></div>
<div class="fig"><p><a href="http://www.virologyj.com/content/11/1/26/figure/F2" onclick="popup('http://www.virologyj.com/content/11/1/26/figure/F2','F2',800,470); return false;"><img align="top" alt="thumbnail" class="thumbnail" src="/content/figures/1743-422X-11-26-2.gif"/><strong>Figure 2.</strong></a> <strong>Phylogeny of LITRV.</strong> Maximum likelihood phylogenetic tree based on full length L protein sequences of
         currently recognized species of <em>Rhabdoviridae</em>. Gray boxes represent ICTV-accepted species within the genus. LITRV is indicated
         by *. Accession numbers are provided next to the viral names. Due to excessive divergence,
         only the type species for <em>Cytorhabdovirus</em>, <em>Novirhabdovirus</em> and <em>Nucleorhabdovirus</em> are included. The <em>Vesiculovirus</em> piry virus is not included due to limited available sequence.
      </p></div></div>]

In [111]:
divfig = soup.find_all('div', class_="fig")
print(divfig)


[<div class="fig"><p><a href="http://www.virologyj.com/content/11/1/26/figure/F1" onclick="popup('http://www.virologyj.com/content/11/1/26/figure/F1','F1',800,470); return false;"><img align="top" alt="thumbnail" class="thumbnail" src="/content/figures/1743-422X-11-26-1.gif"/><strong>Figure 1.</strong></a> <strong>Organization of the LITRV genome. A)</strong> Schematic representation of LITRV ORFs. <strong>B)</strong> LITRV coding regions and corresponding transcription regulatory sequences.
      </p></div>, <div class="fig"><p><a href="http://www.virologyj.com/content/11/1/26/figure/F2" onclick="popup('http://www.virologyj.com/content/11/1/26/figure/F2','F2',800,470); return false;"><img align="top" alt="thumbnail" class="thumbnail" src="/content/figures/1743-422X-11-26-2.gif"/><strong>Figure 2.</strong></a> <strong>Phylogeny of LITRV.</strong> Maximum likelihood phylogenetic tree based on full length L protein sequences of
         currently recognized species of <em>Rhabdoviridae</em>. Gray boxes represent ICTV-accepted species within the genus. LITRV is indicated
         by *. Accession numbers are provided next to the viral names. Due to excessive divergence,
         only the type species for <em>Cytorhabdovirus</em>, <em>Novirhabdovirus</em> and <em>Nucleorhabdovirus</em> are included. The <em>Vesiculovirus</em> piry virus is not included due to limited available sequence.
      </p></div>]

Useful for later: print(soup.get_text())


In [112]:
dirname = os.path.splitext("virologyj-11-1-26.html")[0]

In [113]:
dirname = ("virologyj-11-1-26.html")
chomped = dirname[:-5]
print dirname
print chomped


virologyj-11-1-26.html
virologyj-11-1-26

In [114]:
file = open("./"+chomped+"/virology-fig-image-links.txt", "wb")

In [115]:
for fig in divfig:
    for link in fig.find_all('a', href=True):
        fullink = link.get('href').encode("utf8")
        # print fullink  # uncomment to verify the results
        file.write(fullink+'\n')

In [116]:
file.flush()
file.close()
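
The cells above open the output file, write to it, then flush and close it by hand. A possible alternative, sketched here for Python 3, is a `with` block, which closes the file even if a write fails. `write_lines` is a hypothetical helper name, and the temporary directory only stands in for the `./virologyj-11-1-26` folder so the sketch runs anywhere.

```python
import os
import tempfile


def write_lines(path, lines):
    """Write one item per line; the with-block closes the file on exit."""
    with open(path, "w", encoding="utf-8") as handle:
        for line in lines:
            handle.write(line + "\n")


# The two figure-page URLs found in divfig.
links = [
    "http://www.virologyj.com/content/11/1/26/figure/F1",
    "http://www.virologyj.com/content/11/1/26/figure/F2",
]

out_dir = tempfile.mkdtemp()  # throwaway directory for the demonstration
out_path = os.path.join(out_dir, "virology-fig-image-links.txt")
write_lines(out_path, links)

# Read the file back to check what was written.
with open(out_path, encoding="utf-8") as handle:
    written = handle.read()
print(written)
```

Using a name such as `handle` also avoids shadowing Python 2's built-in `file`.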

So we can get the figure-page URLs from divfig (above) and the plain-text captions (below).


In [117]:
dirname = ("virologyj-11-1-26.html")
chomped = dirname[:-5]
print dirname
print chomped


virologyj-11-1-26.html
virologyj-11-1-26

In [118]:
file = open("./"+chomped+"/virologycaptions.txt", "wb")

In [119]:
for tr in divfig:
    for link in tr.find_all('p'):
        plaintext = link.get_text ()
        oneline = plaintext.replace('\n',' ').replace('          ',' ').encode("utf8")
        print oneline
        file.write(oneline+'\n')


Figure 1. Organization of the LITRV genome. A) Schematic representation of LITRV ORFs. B) LITRV coding regions and corresponding transcription regulatory sequences.       
Figure 2. Phylogeny of LITRV. Maximum likelihood phylogenetic tree based on full length L protein sequences of currently recognized species of Rhabdoviridae. Gray boxes represent ICTV-accepted species within the genus. LITRV is indicated by *. Accession numbers are provided next to the viral names. Due to excessive divergence, only the type species for Cytorhabdovirus, Novirhabdovirus and Nucleorhabdovirus are included. The Vesiculovirus piry virus is not included due to limited available sequence.       
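
The chained `replace` calls depend on the exact indentation width in the page source, and they leave trailing spaces at the end of each caption. A more forgiving approach, sketched here with a hypothetical helper `one_line`, is to split on any whitespace and rejoin with single spaces. The `raw` string is a shortened piece of the Figure 2 caption with its original line break and indentation.

```python
def one_line(text):
    """Collapse every run of whitespace (newlines, tabs, spaces) into one space."""
    return " ".join(text.split())


raw = ("Figure 2. Phylogeny of LITRV. Maximum likelihood phylogenetic tree "
       "based on full length L protein sequences of\n"
       "         currently recognized species of Rhabdoviridae.\n      ")

print(one_line(raw))
```

`str.split()` with no arguments also drops leading and trailing whitespace, so no separate `strip()` is needed.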

In [120]:
file.flush()
file.close()

In [121]:
dirname = ("virologyj-11-1-26.html")
chomped = dirname[:-5]
print dirname
print chomped


virologyj-11-1-26.html
virologyj-11-1-26

In [129]:
file = open("./"+chomped+"/"+chomped+"-gif.txt", "wb")

In [130]:
# the first <h5> holds a link whose href is the site root; keep it as the base URL
for link in soup.find('h5'):
    baseurl = link.get('href')
    print(baseurl)


http://www.virologyj.com

In [131]:
for fig in divfig:
    for img in fig.find_all('img', src=True):
        fullink = img.get('src').encode("utf8")
        print(fullink)  # print to verify the results
        file.write(baseurl+fullink+'\n')


/content/figures/1743-422X-11-26-1.gif
/content/figures/1743-422X-11-26-2.gif
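
Gluing `baseurl` and the `src` value together works here because the base has no trailing slash and every `src` starts with `/`. As a sketch of a more general option, the standard library's `urljoin` resolves a relative path against a base URL the way a browser would. This is Python 3 (`urllib.parse`); in Python 2 the same function lives in the `urlparse` module.

```python
from urllib.parse import urljoin

baseurl = "http://www.virologyj.com"
srcs = [
    "/content/figures/1743-422X-11-26-1.gif",
    "/content/figures/1743-422X-11-26-2.gif",
]

# Resolve each relative src against the site root.
absolute = [urljoin(baseurl, src) for src in srcs]
for url in absolute:
    print(url)
```

`urljoin` also copes with a base that carries a path (for example the article URL itself) and leaves an already absolute `src` unchanged.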

In [132]:
file.flush()
file.close()

In [125]:
for link in soup.find('h5'):
    print(link.get('href'))


http://www.virologyj.com

In [135]:
jazzy = "/content/figures/1743-422X-11-26-1.gif"
print jazzy


/content/figures/1743-422X-11-26-1.gif
