Comparison between `CAI` and `Biopython` Performance

To compare the performance of Biopython and CAI, we're going to benchmark both on the same task. First, let's install the latest version of CAI from the local checkout.



In [1]:

    
! pip install -e ../

Obtaining file:///Users/BenjaminLee/Desktop/Python/Research/cai/CodonAdaptionIndex
Requirement already satisfied: scipy in /Users/BenjaminLee/Desktop/Python/Research/cai/env/lib/python3.6/site-packages (from CAI==0.1.8) (1.1.0)
Requirement already satisfied: biopython in /Users/BenjaminLee/Desktop/Python/Research/cai/env/lib/python3.6/site-packages (from CAI==0.1.8) (1.71)
Requirement already satisfied: click in /Users/BenjaminLee/Desktop/Python/Research/cai/env/lib/python3.6/site-packages (from CAI==0.1.8) (6.7)
Requirement already satisfied: numpy>=1.8.2 in /Users/BenjaminLee/Desktop/Python/Research/cai/env/lib/python3.6/site-packages (from scipy->CAI==0.1.8) (1.14.3)
Installing collected packages: CAI
  Found existing installation: CAI 0.1.8
    Uninstalling CAI-0.1.8:
      Successfully uninstalled CAI-0.1.8
  Running setup.py develop for CAI
Successfully installed CAI

Now, we'll import the two libraries.



In [2]:

    
from Bio import SeqIO

from CAI import CAI, relative_adaptiveness
from Bio.SeqUtils import CodonUsage

We're going to use the highly expressed genes of E. coli as our reference set, along with a test set of 100 CDSs, each 3,000 bp long, generated with the Sequence Manipulation Suite.



In [3]:

    
reference = [str(seq.seq) for seq in SeqIO.parse("ecoli.heg.fasta", "fasta")]
sequence = [str(seq.seq) for seq in SeqIO.parse("test.fasta", "fasta")]
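
For context, here's a rough pure-Python sketch of the calculation both libraries are being timed on: each codon gets a weight relative to the most used synonymous codon in the reference set, and a gene's CAI is the geometric mean of the weights of its codons. This is not the code of either library. The names `sketch_weights` and `sketch_cai`, the toy sequences, and the 0.5 pseudocount for unseen codons are all assumptions made for illustration.

```python
from collections import Counter
from math import exp, log

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Met and Trp have a single codon each and stop codons are not scored,
# so none of them carry usage-bias information.
SKIPPED = {"M", "W", "*"}


def codons(seq):
    """Split a coding sequence into complete, unambiguous codons."""
    seq = seq.upper()
    end = len(seq) - len(seq) % 3
    triplets = (seq[i:i + 3] for i in range(0, end, 3))
    return [c for c in triplets if c in CODON_TABLE]


def sketch_weights(reference_seqs, pseudocount=0.5):
    """Weight each codon by its count relative to the top synonymous codon."""
    counts = Counter(c for seq in reference_seqs for c in codons(seq))
    weights = {}
    for aa in set(CODON_TABLE.values()) - SKIPPED:
        family = [c for c, a in CODON_TABLE.items() if a == aa]
        # Assumption: unseen codons get a pseudocount so log() stays defined.
        family_counts = {c: counts[c] or pseudocount for c in family}
        top = max(family_counts.values())
        for c in family:
            weights[c] = family_counts[c] / top
    return weights


def sketch_cai(seq, weights):
    """Geometric mean of the weights of the scorable codons in seq."""
    logs = [log(weights[c]) for c in codons(seq) if c in weights]
    if not logs:
        raise ValueError("sequence has no scorable codons")
    return exp(sum(logs) / len(logs))


# Toy reference set (hypothetical): GCT and AAA are the preferred codons.
toy_reference = ["GCTGCTGCTGCC", "AAAAAAAAG"]
toy_weights = sketch_weights(toy_reference)
print(sketch_cai("GCTAAA", toy_weights))  # → 1.0
print(sketch_cai("GCCAAG", toy_weights))  # lower, both codons are less preferred
```

A gene made only of preferred codons scores 1.0, and every less preferred codon pulls the geometric mean down.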

`Biopython`



In [4]:

    
bp = CodonUsage.CodonAdaptationIndex()
bp.generate_index("ecoli.heg.fasta")
%timeit [bp.cai_for_gene(seq) for seq in sequence]

777 ms ± 36.1 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
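
The `%timeit` magic only works inside IPython or Jupyter. If you'd like to repeat this kind of measurement in a plain script, here's a minimal sketch using the standard library's `timeit` module. The `workload` function is a hypothetical stand-in for the list comprehension being timed, and `pstdev` is used on the assumption that `%timeit` reports a population standard deviation.

```python
import statistics
import timeit


def workload():
    # Hypothetical stand-in for the list comprehension being benchmarked.
    return [sum(range(1000)) for _ in range(100)]


# repeat=7, number=1 mirrors the "7 runs, 1 loop each" line in the output.
runs = timeit.repeat(workload, repeat=7, number=1)

# timeit reports seconds, so convert to milliseconds for display.
mean_ms = statistics.mean(runs) * 1000
std_ms = statistics.pstdev(runs) * 1000
print(f"{mean_ms:.3g} ms ± {std_ms:.3g} ms per loop")
```

Swapping `workload` for a function that wraps the real call should give numbers comparable to the notebook output, though they will vary from machine to machine.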

`CAI`



In [5]:

    
# Build the codon weights from the highly expressed reference set, not the test set
weights = relative_adaptiveness(sequences=reference)
%timeit [CAI(seq, weights=weights) for seq in sequence]

469 ms ± 18.5 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
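
To put the two timings side by side, here's a small sketch that works out the ratio from the mean values reported above. The only inputs are the two means and the test set size of 100 sequences; the variable names are ours.

```python
# Mean time per loop reported by %timeit above, in milliseconds.
biopython_ms = 777
cai_ms = 469

# Each loop scores the whole test set of 100 sequences.
n_sequences = 100

speedup = biopython_ms / cai_ms
print(f"Biopython: {biopython_ms / n_sequences:.2f} ms per sequence")
print(f"CAI:       {cai_ms / n_sequences:.2f} ms per sequence")
print(f"CAI took about {speedup:.2f}x less time on this run")
```

On these numbers, CAI works out to roughly 1.66 times faster than Biopython for this test set.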
