Description:

  • This notebook walks through assessing the presence/absence of CRISPR loci, by subtype, for each taxon in the database

Before running this notebook:

User-defined variables


In [11]:
# directory where you want the pres-abs analysis to be done
## CHANGE THIS!
workDir = "/home/nyoungb2/t/CLdb_Ecoli/CRISPR_pres-abs/"

Init


In [12]:
import os
from IPython.display import FileLinks
%load_ext rpy2.ipython


The rpy2.ipython extension is already loaded. To reload it, use:
  %reload_ext rpy2.ipython

In [13]:
if not os.path.isdir(workDir):
    os.makedirs(workDir)

In [14]:
# checking that CLdb is in $PATH & ~/.CLdb config file is set up
!CLdb --config-params


#-- Config params --#
DATABASE = /home/nyoungb2/t/CLdb_Ecoli/CLdb.sqlite

Pres-abs

Subtype pres-abs table

  • in ITOL format for easy plotting on a tree

In [15]:
!cd $workDir; \
    CLdb -- subtypePA_ITOL -h


Usage:
    subtypePA_ITOL.pl [flags] > subtype_PA.meta

  Required flags:
    -database <char>
        CLdb database.

  Optional flags:
    -colors <char>
        For providing user-defined hexidecimal colors (>= 1 argument).

    -abundance <bool>
        Provide counts of subtypes per taxon instead of binary pres-abs?
        [FALSE]

    -subtype <char>
        Refine query to specific a subtype(s) (>1 argument allowed).

    -taxon_id <char>
        Refine query to specific a taxon_id(s) (>1 argument allowed).

    -taxon_name <char>
        Refine query to specific a taxon_name(s) (>1 argument allowed).

    -group <bool>
        Get array elements de-replicated by group (ie. all uniqe sequences).
        [FALSE]

    -verbose <bool>
        Verbose output. [FALSE]

    -help <bool>
        This help message

  For more information:
    CLdb_perldoc subtypePA_ITOL.pl


In [16]:
!cd $workDir; \
    CLdb -- subtypePA_ITOL


LABELS	I-E
COLORS	#FF0000
Escherichia_coli_K-12_MG1655	1
Escherichia_coli_K-12_W3110	1
Escherichia_coli_BL21_DE3	1
Escherichia_coli_O157_H7	1
Escherichia_coli_K-12_DH10B	1

Notes

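Not a very exciting table: all five genomes carry the single subtype I-E, so every value is 1.

We can try including loci abundances with the -abundance flag (given as -a in the next cell)...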


In [17]:
!cd $workDir; \
    CLdb -- subtypePA_ITOL -a


LABELS	I-E
COLORS	#FF0000
Escherichia_coli_BL21_DE3	2
Escherichia_coli_O157_H7	2
Escherichia_coli_K-12_W3110	2
Escherichia_coli_K-12_DH10B	2
Escherichia_coli_K-12_MG1655	2

Notes

Again, not a very exciting table: every genome has a count of 2 for subtype I-E. But this was just an exercise in running the script.

You could then load this table into ITOL along with a tree to plot pres-abs.
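
Before uploading, it can help to check whether the table varies between taxa at all. The following is a minimal sketch, not part of CLdb: `parse_itol_table` is a hypothetical helper, and it assumes the output is tab-delimited, with a `LABELS` row, a `COLORS` row, and then one row per taxon holding one value per label (as in the output above). The example text is copied from the abundance table.

```python
def parse_itol_table(text):
    """Parse subtypePA_ITOL-style output into (labels, colors, counts).

    Assumes tab-delimited rows: LABELS, COLORS, then one row per taxon.
    """
    labels, colors, counts = [], [], {}
    for line in text.splitlines():
        fields = line.split("\t")
        if not fields[0].strip():
            continue  # skip blank lines
        key, values = fields[0], fields[1:]
        if key == "LABELS":
            labels = values
        elif key == "COLORS":
            colors = values
        else:
            counts[key] = [int(v) for v in values]
    return labels, colors, counts


# Abundance table from the cell above (tabs written explicitly)
example = (
    "LABELS\tI-E\n"
    "COLORS\t#FF0000\n"
    "Escherichia_coli_BL21_DE3\t2\n"
    "Escherichia_coli_O157_H7\t2\n"
    "Escherichia_coli_K-12_W3110\t2\n"
    "Escherichia_coli_K-12_DH10B\t2\n"
    "Escherichia_coli_K-12_MG1655\t2\n"
)

labels, colors, counts = parse_itol_table(example)

# For each subtype column, list the distinct values seen across taxa;
# a single distinct value means the column is uniform.
for i, label in enumerate(labels):
    distinct = sorted({row[i] for row in counts.values()})
    print(label, distinct)
```

For this table the loop reports a single distinct value for I-E, which confirms that the column would look identical for every leaf on the tree.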