Description:

  • This notebook walks through assessing CRISPR loci presence/absence (by subtype) in the CLdb database

Before running this notebook:

User-defined variables


In [1]:
# directory where you want the presence/absence analysis to be done
## CHANGE THIS!
workDir = "/home/nyoungb2/t/CLdb_Ecoli/spacer_pres-abs/"

Init


In [2]:
import os
from IPython.display import FileLinks
%load_ext rpy2.ipython

In [3]:
if not os.path.isdir(workDir):
    os.makedirs(workDir)

In [6]:
# checking that CLdb is in $PATH & ~/.CLdb config file is set up
!CLdb --config-params


#-- Config params --#
DATABASE = /home/nyoungb2/t/CLdb_Ecoli/CLdb.sqlite
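
The same check can be done from Python rather than by reading the shell output. This is a minimal sketch, assuming the executable is named `CLdb` and the config file lives at `~/.CLdb` as described in the cell comment above; the helper name `cldb_ready` is hypothetical and not part of CLdb.

```python
import os
import shutil

def cldb_ready(exe="CLdb", config="~/.CLdb"):
    """Return (on_path, has_config) for a CLdb install.

    Hypothetical helper: the executable name and config location
    are taken from the comment in the cell above.
    """
    on_path = shutil.which(exe) is not None
    has_config = os.path.isfile(os.path.expanduser(config))
    return on_path, has_config

on_path, has_config = cldb_ready()
print("CLdb in $PATH:", on_path)
print("~/.CLdb found:", has_config)
```

If either value is False, the `!CLdb` cells below will likely fail.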

Pres-abs

Subtype pres-abs table

  • in ITOL format for easy plotting on a tree

In [5]:
!cd $workDir; \
    CLdb -- subtypePA_ITOL -h


Usage:
    subtypePA_ITOL.pl [flags] > subtype_PA.meta

  Required flags:
    -database <char>
        CLdb database.

  Optional flags:
    -colors <char>
        For providing user-defined hexidecimal colors (>= 1 argument).

    -abundance <bool>
        Provide counts of subtypes per taxon instead of binary pres-abs?
        [FALSE]

    -subtype <char>
        Refine query to specific a subtype(s) (>1 argument allowed).

    -taxon_id <char>
        Refine query to specific a taxon_id(s) (>1 argument allowed).

    -taxon_name <char>
        Refine query to specific a taxon_name(s) (>1 argument allowed).

    -group <bool>
        Get array elements de-replicated by group (ie. all uniqe sequences).
        [FALSE]

    -verbose <bool>
        Verbose output. [FALSE]

    -help <bool>
        This help message

  For more information:
    CLdb_perldoc subtypePA_ITOL.pl


In [8]:
!cd $workDir; \
    CLdb -- subtypePA_ITOL


LABELS	I-E
COLORS	#FF0000
Escherichia_coli_K-12_W3110	1
Escherichia_coli_K-12_DH10B	1
Escherichia_coli_BL21_DE3	1
Escherichia_coli_O157_H7	1
Escherichia_coli_K-12_MG1655	1
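
To inspect a table like this in Python, it can be parsed into plain data structures. This is a rough sketch, assuming the layout is exactly what is printed above: tab-delimited, a `LABELS` row, a `COLORS` row, then one row per taxon. The function name `parse_itol_table` is made up for illustration.

```python
def parse_itol_table(text):
    """Parse tab-delimited subtypePA_ITOL-style output.

    Returns (labels, colors, rows), where rows maps each taxon
    name to a list of integer values, one per label.
    Assumes the layout shown in the cell output above.
    """
    labels, colors, rows = [], [], {}
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        fields = line.split("\t")
        key, values = fields[0], fields[1:]
        if key == "LABELS":
            labels = values
        elif key == "COLORS":
            colors = values
        else:
            rows[key] = [int(v) for v in values]
    return labels, colors, rows

# Two rows copied from the output above
example = (
    "LABELS\tI-E\n"
    "COLORS\t#FF0000\n"
    "Escherichia_coli_K-12_W3110\t1\n"
    "Escherichia_coli_O157_H7\t1\n"
)
labels, colors, rows = parse_itol_table(example)
for taxon, values in rows.items():
    print(taxon, dict(zip(labels, values)))
```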

Notes

Not a very exciting table! Every taxon has the same single subtype (I-E).

We can try including loci abundances (the -a flag below)...


In [9]:
!cd $workDir; \
    CLdb -- subtypePA_ITOL -a


LABELS	I-E
COLORS	#FF0000
Escherichia_coli_BL21_DE3	2
Escherichia_coli_K-12_MG1655	2
Escherichia_coli_O157_H7	2
Escherichia_coli_K-12_DH10B	2
Escherichia_coli_K-12_W3110	2
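
The abundance table relates to the binary one in a simple way: any count above zero is a presence. This is a small sketch of that conversion, assuming this is how the binary table is derived (the help text only says that `-abundance` gives counts "instead of binary pres-abs"). The function name `to_presence_absence` is hypothetical.

```python
def to_presence_absence(counts):
    """Collapse per-taxon counts to 1 (present) or 0 (absent).

    Assumption: presence means a count greater than zero.
    """
    return {
        taxon: [1 if c > 0 else 0 for c in values]
        for taxon, values in counts.items()
    }

# Counts copied from the output above (I-E column only)
abundance = {
    "Escherichia_coli_BL21_DE3": [2],
    "Escherichia_coli_K-12_MG1655": [2],
    "Escherichia_coli_O157_H7": [2],
}
binary = to_presence_absence(abundance)
print(binary)
```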

Notes

Again, not a very exciting table (every taxon has a count of 2 for I-E)! But this was just an exercise in running this script.

You could then load this table in ITOL with a tree to plot pres-abs.
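
To get a file to load, the script output has to be saved rather than just printed. The usage message suggests redirecting to `subtype_PA.meta`; the sketch below does the same thing from Python with `subprocess`. The helper `run_to_file` is hypothetical, and the real call is only attempted if `CLdb` is found in `$PATH`.

```python
import os
import shutil
import subprocess

def run_to_file(cmd, out_path, cwd=None):
    """Run cmd (a list of arguments) and write its stdout to out_path."""
    with open(out_path, "w") as out_fh:
        subprocess.run(cmd, stdout=out_fh, cwd=cwd, check=True)
    return out_path

# Only attempt the real call if CLdb is available on this machine
if shutil.which("CLdb") is not None:
    # workDir is defined at the top of the notebook;
    # fall back to the current directory if it is missing
    work_dir = globals().get("workDir", ".")
    out_file = os.path.join(work_dir, "subtype_PA.meta")
    run_to_file(["CLdb", "--", "subtypePA_ITOL", "-a"], out_file, cwd=work_dir)
    print("Wrote", out_file)
```

Passing the command as a list avoids shell quoting issues, and `check=True` raises an error if the script exits with a non-zero status.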

