Copy correction is based on CopyRighter. The tool provides a gene copy-number database that covers every Greengenes taxon at each taxonomic level. We will use that database to correct both the Greengenes taxon abundances and the OTU abundances.
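To make the idea concrete, here is a minimal sketch of copy-number correction. It is not the code in copyrighter-otutable.py: the table `COPY_DB`, its values, the OTU names, and the fallback to the deepest matching rank are all assumptions made for illustration.

```python
from __future__ import division

# Hypothetical copy-number table: lineage prefix -> mean SSU rRNA gene copies.
# The values are invented for illustration; real values come from Copy_db.
COPY_DB = {
    'k__Bacteria': 2.0,
    'k__Bacteria;p__Firmicutes': 5.0,
    'k__Bacteria;p__Proteobacteria': 4.0,
}

def lookup_copy_number(lineage, copy_db, default=1.0):
    """Return the copy number of the deepest rank of `lineage` found in copy_db."""
    ranks = [r for r in lineage.strip(';').split(';') if r]
    # Walk from the most specific rank back toward the root.
    for depth in range(len(ranks), 0, -1):
        key = ';'.join(ranks[:depth])
        if key in copy_db:
            return copy_db[key]
    return default  # assumption: treat unknown lineages as single-copy

def correct_counts(counts, lineages, copy_db):
    """Divide each OTU count by the copy number assigned to its lineage."""
    return dict((otu, n / lookup_copy_number(lineages[otu], copy_db))
                for otu, n in counts.items())

counts = {'Otu01': 10, 'Otu02': 8, 'Otu03': 6}
lineages = {
    'Otu01': 'k__Bacteria;p__Firmicutes;c__Bacilli',   # falls back to phylum
    'Otu02': 'k__Bacteria;p__Proteobacteria',          # exact match
    'Otu03': 'k__Bacteria;p__Acidobacteria',           # falls back to kingdom
}
corrected = correct_counts(counts, lineages, COPY_DB)
print(sorted(corrected.items()))
```

Dividing by copy number turns gene counts into an estimate of organism counts, so taxa with many SSU rRNA gene copies no longer look more abundant than they are.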


In [27]:
cd ~/Desktop/SSUsearch/


/home/gjr/Desktop/SSUsearch

In [28]:
### set up directory
!mkdir -p ./workdir/copy_correction

In [29]:
Prefix='SS'    # name for the analysis run
Script_dir='./scripts'
Wkdir='./workdir'
Design='./data/test/SS.design'
Otu_dist_cutoff='0.05'
Copy_db='./data/SSUsearch_db/Copy_db.copyrighter.txt'

In [30]:
import os
Script_dir=os.path.abspath(Script_dir)
Wkdir=os.path.abspath(Wkdir)
Design=os.path.abspath(Design)
Copy_db=os.path.abspath(Copy_db)
# '~' is not expanded inside PATH entries by every program, so expand it here
New_path = '{}:{}'.format(os.path.expanduser('~/Desktop/SSUsearch/external_tools/bin/'), os.environ['PATH'])

print(New_path)

os.environ.update(
    {'PATH':New_path,
     'Prefix':Prefix,
     'Script_dir': Script_dir, 
     'Wkdir': Wkdir, 
     'Otu_dist_cutoff':Otu_dist_cutoff,
     'Design': Design, 
     'Copy_db': Copy_db})


~/Desktop/SSUsearch/external_tools/bin/:~/Desktop/SSUsearch/external_tools/bin/:/home/gjr/anaconda/bin:/usr/lib/lightdm/lightdm:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games

In [31]:
cd ./workdir/copy_correction


/home/gjr/Desktop/SSUsearch/workdir/copy_correction

In [32]:
# link input files from the clustering step ($Wkdir/clust)
!ln -sf $Wkdir/clust/$Prefix.biom
!ln -sf $Wkdir/clust/$Prefix.list

In [40]:
%%bash

# get Greengene taxonomy
cat $Wkdir/*.ssu.out/*.gg.taxonomy > $Prefix.taxonomy
mothur "#classify.otu(list=$Prefix.list, taxonomy=$Prefix.taxonomy, label=$Otu_dist_cutoff)"
mv $Prefix.$Otu_dist_cutoff.cons.taxonomy $Prefix.cons.taxonomy
mv $Prefix.$Otu_dist_cutoff.cons.tax.summary $Prefix.cons.tax.summary







mothur v.1.33.3
Last updated: 4/4/2014

by
Patrick D. Schloss

Department of Microbiology & Immunology
University of Michigan
pschloss@umich.edu
http://www.mothur.org

When using, please cite:
Schloss, P.D., et al., Introducing mothur: Open-source, platform-independent, community-supported software for describing and comparing microbial communities. Appl Environ Microbiol, 2009. 75(23):7537-41.

Distributed under the GNU General Public License

Type 'help()' for information on the commands that are available

Type 'quit()' to exit program



mothur > classify.otu(list=SS.list, taxonomy=SS.taxonomy, label=0.05)
reftaxonomy is not required, but if given will keep the rankIDs in the summary file static.
0.05	78

Output File Names: 
SS.0.05.cons.taxonomy
SS.0.05.cons.tax.summary

[WARNING]: your sequence names contained ':'.  I changed them to '_' to avoid problems in your downstream analysis.

mothur > quit()

In [36]:
%%bash

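# Label must match the label mothur gives to data read from a BIOM file;
# the mothur v1.33.3 output shown below uses 'dummy'
Label=dummy
#Label=userLabel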

mothur "#make.shared(biom=$Prefix.biom)"

# do copy correction and even sampling
python $Script_dir/copyrighter-otutable.py $Copy_db \
    $Prefix.cons.taxonomy \
    $Prefix.shared $Prefix.cc.shared
    
mv $Prefix.cc.shared $Prefix.shared
mothur "#make.biom(shared=$Prefix.shared, constaxonomy=$Prefix.cons.taxonomy);"
mv $Prefix.$Label.biom $Prefix.biom
rm -f mothur.*.logfile







mothur > make.shared(biom=SS.biom)

dummy

Output File Names: 
SS.shared
SS.1c.rabund
SS.1d.rabund
SS.2c.rabund
SS.2d.rabund


mothur > quit()






mothur > make.biom(shared=SS.shared, constaxonomy=SS.cons.taxonomy)
dummy

Output File Names: 
SS.dummy.biom


mothur > quit()
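The .shared file written by make.shared is the OTU table used by the later steps. The sketch below reads one into a nested dict. It assumes the usual tab-delimited layout (columns label, Group, numOtus, then one column per OTU); the function name `read_shared` and the example rows are hypothetical, not taken from SS.shared.

```python
def read_shared(lines):
    """Parse mothur .shared text into {group: {otu: abundance}}."""
    rows = [line.rstrip('\n').split('\t') for line in lines if line.strip()]
    header, body = rows[0], rows[1:]
    otu_names = header[3:]          # skip label, Group, numOtus
    table = {}
    for fields in body:
        group = fields[1]
        # float() accepts both integer and fractional abundances
        table[group] = dict(zip(otu_names, [float(x) for x in fields[3:]]))
    return table

# Made-up rows in the assumed layout, not real SS.shared content
example = [
    'label\tGroup\tnumOtus\tOtu1\tOtu2\tOtu3\n',
    'dummy\t1c\t3\t4\t0\t7\n',
    'dummy\t1d\t3\t2\t5\t1\n',
]
table = read_shared(example)
print(table['1c']['Otu3'])
```

With a real file, `read_shared(open('SS.shared'))` would give the same structure, provided the file follows the assumed layout.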

SS.biom can be used for further diversity analysis, which is important but not the focus of this tutorial (see the mothur wiki for details).
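Diversity comparisons usually start by subsampling every sample to the same depth, which is the job of sub.sample in the next cell. The sketch below shows the idea, assuming integer counts and a depth equal to the smallest sample; the function `subsample` and the counts are hypothetical, and this is not mothur's implementation.

```python
import random

def subsample(counts, depth, seed=0):
    """Draw `depth` reads without replacement from a dict of OTU -> integer count."""
    # Expand the counts into one entry per read, in a reproducible order.
    pool = [otu for otu, n in sorted(counts.items()) for _ in range(n)]
    if depth > len(pool):
        raise ValueError('depth exceeds sample size')
    rng = random.Random(seed)
    out = {}
    for otu in rng.sample(pool, depth):
        out[otu] = out.get(otu, 0) + 1
    return out

# Made-up counts for two samples
samples = {
    '1c': {'Otu1': 9, 'Otu2': 3, 'Otu3': 2},
    '1d': {'Otu1': 4, 'Otu2': 7},
}
depth = min(sum(c.values()) for c in samples.values())
even = dict((name, subsample(c, depth)) for name, c in samples.items())
print(depth)
```

After this step every sample has the same total, so richness and distance measures are not driven by differences in sequencing depth.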


In [39]:
%%bash

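# Label must match the label mothur gives to data read from a BIOM file;
# the mothur v1.33.3 output shown below uses 'dummy'
Label=dummy
#Label=userLabel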
mothur "#make.shared(biom=$Prefix.biom); sub.sample(shared=$Prefix.shared); summary.single(calc=nseqs-coverage-sobs-chao-shannon-invsimpson); dist.shared(calc=braycurtis); pcoa(phylip=$Prefix.$Label.subsample.braycurtis.$Label.lt.dist); nmds(phylip=$Prefix.$Label.subsample.braycurtis.$Label.lt.dist); amova(phylip=$Prefix.$Label.subsample.braycurtis.$Label.lt.dist, design=$Design); tree.shared(calc=braycurtis); unifrac.weighted(tree=$Prefix.$Label.subsample.braycurtis.$Label.tre, group=$Design, random=T)"
rm -f mothur.*.logfile
rm -f *.rabund







mothur > make.shared(biom=SS.biom)

dummy

Output File Names: 
SS.shared
SS.1c.rabund
SS.1d.rabund
SS.2c.rabund
SS.2d.rabund


mothur > sub.sample(shared=SS.shared)
Sampling 11 from each group.
dummy

Output File Names: 
SS.dummy.subsample.shared


mothur > summary.single(calc=nseqs-coverage-sobs-chao-shannon-invsimpson)
Using SS.dummy.subsample.shared as input file for the shared parameter.

Processing group 1c

dummy

Processing group 1d

dummy

Processing group 2c

dummy

Processing group 2d

dummy

Output File Names: 
SS.dummy.subsample.groups.summary


mothur > dist.shared(calc=braycurtis)
Using SS.dummy.subsample.shared as input file for the shared parameter.

Using 1 processors.
dummy

Output File Names: 
SS.dummy.subsample.braycurtis.dummy.lt.dist


mothur > pcoa(phylip=SS.dummy.subsample.braycurtis.dummy.lt.dist)

Processing...
Rsq 1 axis: 0.970937
Rsq 2 axis: 0.852719
Rsq 3 axis: 1

Output File Names: 
SS.dummy.subsample.braycurtis.dummy.lt.pcoa.axes
SS.dummy.subsample.braycurtis.dummy.lt.pcoa.loadings


mothur > nmds(phylip=SS.dummy.subsample.braycurtis.dummy.lt.dist)
Processing Dimension: 2
1
2
3
4
5
6
7
8
9
10

Number of dimensions:	2
Lowest stress :	0.126906
R-squared for configuration:	0.708227

Output File Names: 
SS.dummy.subsample.braycurtis.dummy.lt.nmds.iters
SS.dummy.subsample.braycurtis.dummy.lt.nmds.stress
SS.dummy.subsample.braycurtis.dummy.lt.nmds.axes


mothur > amova(phylip=SS.dummy.subsample.braycurtis.dummy.lt.dist, design=/home/gjr/Desktop/SSUsearch/data/test/SS.design)
c-d	Among	Within	Total
SS	0.526859	0.615703	1.14256
df	1	2	3
MS	0.526859	0.307851

Fs:	1.71141
p-value: 0.328

Experiment-wise error rate: 0.05
If you have borderline P-values, you should try increasing the number of iterations

Output File Names: 
SS.dummy.subsample.braycurtis.dummy.lt.amova


mothur > tree.shared(calc=braycurtis)
Using SS.dummy.subsample.shared as input file for the shared parameter.

Using 1 processors.
dummy

Output File Names: 
SS.dummy.subsample.braycurtis.dummy.tre


mothur > unifrac.weighted(tree=SS.dummy.subsample.braycurtis.dummy.tre, group=/home/gjr/Desktop/SSUsearch/data/test/SS.design, random=T)

Using 1 processors.
Tree#	Groups	WScore	WSig
1	c-d	0.928571	<0.0010
It took 0 secs to run unifrac.weighted.

Output File Names: 
SS.dummy.subsample.braycurtis.dummy.trewsummary
SS.dummy.subsample.braycurtis.dummy.tre1.weighted


mothur > quit()
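For readers who want to see what the calculators named above measure, here is a minimal sketch of the textbook forms of the Shannon index, the inverse Simpson index, and Bray-Curtis dissimilarity. The function names and counts are hypothetical, and mothur's own estimators may apply corrections, so its values need not match these exactly.

```python
from __future__ import division
import math

def shannon(counts):
    """Shannon index H = -sum(p * ln p) over OTUs with nonzero counts."""
    total = sum(counts)
    return -sum((n / total) * math.log(n / total) for n in counts if n > 0)

def inverse_simpson(counts):
    """Inverse Simpson index 1 / sum(p^2)."""
    total = sum(counts)
    return 1.0 / sum((n / total) ** 2 for n in counts)

def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two count vectors of equal length."""
    shared = sum(min(x, y) for x, y in zip(a, b))
    return 1.0 - 2.0 * shared / (sum(a) + sum(b))

# Made-up abundance vectors over the same three OTUs
sample_a = [5, 3, 3]
sample_b = [0, 6, 5]
print(shannon(sample_a), inverse_simpson(sample_a), bray_curtis(sample_a, sample_b))
```

Shannon and inverse Simpson describe diversity within one sample, while Bray-Curtis compares two samples and is the distance behind the pcoa, nmds, amova, and tree.shared steps above.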
