Description

  • Time to make a simple SIP data simulation with the dataset that you already created

Setting variables

  • "workDir" is the path to the working directory for this analysis (where the files will be downloaded to)
    • NOTE: MAKE SURE to modify this path to the directory where YOU want to run the example.
  • "nprocs" is the number of processors to use (3 by default, since only 3 genomes). Change this if needed.

In [1]:
workDir = '/home/nick/t/SIPSim/'
nprocs = 3

Init


In [2]:
import os
import glob

In [3]:
%load_ext rpy2.ipython

In [4]:
%%R
library(ggplot2)
library(dplyr)
library(tidyr)


/opt/anaconda/lib/python2.7/site-packages/rpy2/robjects/functions.py:106: UserWarning: 
Attaching package: ‘dplyr’


  res = super(Function, self).__call__(*new_args, **new_kwargs)
/opt/anaconda/lib/python2.7/site-packages/rpy2/robjects/functions.py:106: UserWarning: The following objects are masked from ‘package:stats’:

    filter, lag


  res = super(Function, self).__call__(*new_args, **new_kwargs)
/opt/anaconda/lib/python2.7/site-packages/rpy2/robjects/functions.py:106: UserWarning: The following objects are masked from ‘package:base’:

    intersect, setdiff, setequal, union


  res = super(Function, self).__call__(*new_args, **new_kwargs)

In [5]:
if not os.path.isdir(workDir):
    os.makedirs(workDir)
%cd $workDir    

genomeDir = os.path.join(workDir, 'genomes_rn')


/home/nick/t/SIPSim

Experimental design

  • How many gradients?
  • Which are labeled treatments & which are controls?
  • For this tutorial, we'll keep things simple and just simulate one control & one treatment
    • For the labeled treatment, 34% of the taxa (1 of 3) will incorporate 50% isotope

The script below ("SIPSim incorpConfigExample") is helpful for making simple experimental designs


In [6]:
# write an example isotope incorporation config file (incorp.config)
!SIPSim incorpConfigExample \
  --percTaxa 34 \
  --percIncorpUnif 50 \
  --n_reps 1 \
  > incorp.config

In [7]:
!cat incorp.config


[1]
    # baseline: no incorporation
    treatment = control
    
    [[intraPopDist 1]]
        distribution = uniform
        
        [[[start]]]
            
            [[[[interPopDist 1]]]]
                distribution = uniform
                start = 0
                end = 0
        
        [[[end]]]
            
            [[[[interPopDist 1]]]]
                distribution = uniform
                start = 0
                end = 0
[2]
    # 'treatment' community: possible incorporation
    treatment = labeled
    max_perc_taxa_incorp = 34
    
    [[intraPopDist 1]]
        distribution = uniform
        
        [[[start]]]
            [[[[interPopDist 1]]]]
                start = 50
                distribution = uniform
                end = 50
        
        [[[end]]]
            [[[[interPopDist 1]]]]
                start = 50
                distribution = uniform
                end = 50
    

Pre-fractionation communities

  • What is the relative abundance of taxa in the pre-fractionation samples?

In [8]:
!SIPSim communities \
    --config incorp.config \
    ./genomes_rn/genome_index.txt \
    > comm.txt

In [9]:
!cat comm.txt


library	taxon_name	rel_abund_perc	rank
1	Clostridium_ljungdahlii_DSM_13528	64.715653095	1
1	Streptomyces_pratensis_ATCC_33331	31.317397427	2
1	Escherichia_coli_1303	3.966949478	3
2	Clostridium_ljungdahlii_DSM_13528	67.718695142	1
2	Streptomyces_pratensis_ATCC_33331	26.828563580	2
2	Escherichia_coli_1303	5.452741278	3

Note: "library" = gradient
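
A quick sanity check on the community table: within each library (gradient), the "rel_abund_perc" values should sum to 100%. The sketch below hard-codes the rows printed above rather than reading comm.txt, so it runs on its own; the variable names are just for illustration.

```python
# Rows copied from comm.txt above: (library, taxon_name, rel_abund_perc)
comm_rows = [
    (1, 'Clostridium_ljungdahlii_DSM_13528', 64.715653095),
    (1, 'Streptomyces_pratensis_ATCC_33331', 31.317397427),
    (1, 'Escherichia_coli_1303', 3.966949478),
    (2, 'Clostridium_ljungdahlii_DSM_13528', 67.718695142),
    (2, 'Streptomyces_pratensis_ATCC_33331', 26.828563580),
    (2, 'Escherichia_coli_1303', 5.452741278),
]

# Sum the percent abundances within each library (gradient)
totals = {}
for library, taxon, perc in comm_rows:
    totals[library] = totals.get(library, 0.0) + perc

for library in sorted(totals):
    print('library {}: total = {:.6f}%'.format(library, totals[library]))
```

Note that the two gradients have similar, but not identical, abundance distributions.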

Simulating gradient fractions

  • Buoyant density (BD) ranges for each fraction (and the start/end of the total BD range)

In [10]:
!SIPSim gradient_fractions \
    --BD_min 1.67323 \
    --BD_max 1.7744 \
    comm.txt \
    > fracs.txt

In [11]:
!head -n 6 fracs.txt


library	fraction	BD_min	BD_max	fraction_size
1	1	1.673	1.676	0.003
1	2	1.676	1.68	0.004
1	3	1.68	1.686	0.006
1	4	1.686	1.69	0.004
1	5	1.69	1.694	0.004
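
To make the columns of fracs.txt concrete: "fraction_size" is the width of the BD window (BD_max minus BD_min), and each fraction starts where the previous one ends. A minimal check using the rows printed above (hard-coded here so the sketch is self-contained):

```python
# Rows copied from the head of fracs.txt: (fraction, BD_min, BD_max, fraction_size)
frac_rows = [
    (1, 1.673, 1.676, 0.003),
    (2, 1.676, 1.680, 0.004),
    (3, 1.680, 1.686, 0.006),
    (4, 1.686, 1.690, 0.004),
    (5, 1.690, 1.694, 0.004),
]

# fraction_size should equal BD_max - BD_min (to the 3 decimals in the table)
sizes_ok = all(round(bd_max - bd_min, 3) == size
               for _, bd_min, bd_max, size in frac_rows)

# Each fraction should start where the previous one ended
contiguous = all(prev[2] == cur[1]
                 for prev, cur in zip(frac_rows, frac_rows[1:]))

print('sizes consistent: {}; contiguous: {}'.format(sizes_ok, contiguous))
```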

Simulating fragments

  • Simulating amplicon-fragments
    • gDNA fragments that contain the template for the amplicon
    • These fragments are the actual DNA fragments distributed in the CsCl gradient
      • But, only the PCR amplicons generated from these fragments are what we sequence.
  • Fragment length distribution: skewed-normal
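
To get a feel for a skewed-normal fragment length distribution, here is a rough stand-alone sketch that draws lengths from a skew-normal using only the Python standard library. It is not SIPSim's implementation. In particular, reading the three numbers in "skewed-normal,9000,2500,-5" (used in the command below) as location, scale and shape is an assumption; check the help text for SIPSim fragments to confirm.

```python
import math
import random

def sample_skew_normal(loc, scale, shape, rng):
    """Draw one value from a skew-normal distribution.

    Uses the standard construction Z = delta*|U0| + sqrt(1 - delta^2)*U1,
    where U0 and U1 are independent standard normal draws.
    """
    delta = shape / math.sqrt(1.0 + shape ** 2)
    u0 = rng.gauss(0.0, 1.0)
    u1 = rng.gauss(0.0, 1.0)
    z = delta * abs(u0) + math.sqrt(1.0 - delta ** 2) * u1
    return loc + scale * z

# ASSUMPTION: parameters are (location, scale, shape) = (9000, 2500, -5)
rng = random.Random(42)
frag_lengths = [sample_skew_normal(9000, 2500, -5, rng) for _ in range(10000)]

mean_len = sum(frag_lengths) / len(frag_lengths)
print('mean simulated fragment length: {:.0f} bp'.format(mean_len))
```

Under this reading, the negative shape value pulls the mean below the location parameter (the theoretical mean is roughly 7,040 bp), which is in line with the fragLength values listed in ampFrags.txt further down.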

Primer sequences


In [12]:
primers = """>515F
GTGCCAGCMGCCGCGGTAA
>806R
GGACTACHVGGGTWTCTAAT
"""

F = os.path.join(workDir, '515F-806R.fna')
# write the primer sequences as a fasta file (text mode)
with open(F, 'w') as oFH:
    oFH.write(primers)

print('File written: {}'.format(F))


File written: /home/nick/t/SIPSim/515F-806R.fna

Simulation


In [13]:
# skewed-normal
!SIPSim fragments \
    $genomeDir/genome_index.txt \
    --fp $genomeDir \
    --fr 515F-806R.fna \
    --fld skewed-normal,9000,2500,-5 \
    --flr None,None \
    --nf 10000 \
    --np $nprocs \
    --tbl \
    > ampFrags.txt


Processing: "Clostridium_ljungdahlii_DSM_13528"
Processing: "Escherichia_coli_1303"
Processing: "Streptomyces_pratensis_ATCC_33331"
  Genome name: Escherichia_coli_1303
  Genome length (bp): 4948797
  Number of amplicons: 7
  Number of fragments simulated: 10000
  Genome name: Streptomyces_pratensis_ATCC_33331
  Genome name: Clostridium_ljungdahlii_DSM_13528
  Genome length (bp): 4630065
  Number of amplicons: 9
  Number of fragments simulated: 10000
  Genome length (bp): 7337497
  Number of amplicons: 6
  Number of fragments simulated: 10000

In [14]:
!head -n 5 ampFrags.txt


taxon_name	scaffoldID	fragStart	fragLength	fragGC
Clostridium_ljungdahlii_DSM_13528	Clostridium_ljungdahlii_DSM_13528	684708	7863	42.4138369579
Clostridium_ljungdahlii_DSM_13528	Clostridium_ljungdahlii_DSM_13528	105534	7909	38.247566064
Clostridium_ljungdahlii_DSM_13528	Clostridium_ljungdahlii_DSM_13528	108807	6586	44.5034922563
Clostridium_ljungdahlii_DSM_13528	Clostridium_ljungdahlii_DSM_13528	686241	7309	43.3027773977

Plotting fragments


In [15]:
%%R -w 700 -h 350

df = read.delim('ampFrags.txt')

ggplot(df, aes(fragGC, fragLength, color=taxon_name)) +
    geom_density2d() +
    scale_color_discrete('Taxon') +
    labs(x='Fragment G+C', y='Fragment length (bp)') +
    theme_bw() +
    theme(
        text = element_text(size=16)
    )


Note: for details on the format of the incorporation config file (incorp.config, created in the Experimental design section), use the command: SIPSim isotope_incorp -h

Converting fragments to a 2d-KDE

  • Estimating the joint probability for fragment G+C & length

In [16]:
!SIPSim fragment_KDE \
    ampFrags.txt \
    > ampFrags_kde.pkl
  • Note: The generated list of KDEs (1 per taxon per gradient) is stored in a binary file format
    • To get a table of length/G+C values, use the command: SIPSim KDE_sample
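
For intuition on what a 2d-KDE of fragment G+C and length is, here is a bare-bones sketch using a product of two Gaussian kernels. It only illustrates the idea; SIPSim's own KDE (including how it picks bandwidths) is not shown in this tutorial, and the bandwidths below are hypothetical values chosen for illustration.

```python
import math

def gauss_kernel(x, center, bandwidth):
    """1D Gaussian density centered on `center` with std. dev. `bandwidth`."""
    z = (x - center) / bandwidth
    return math.exp(-0.5 * z * z) / (bandwidth * math.sqrt(2.0 * math.pi))

def kde_2d(gc, length, fragments, bw_gc, bw_len):
    """Joint density estimate at (gc, length): mean of product Gaussian kernels."""
    total = 0.0
    for frag_gc, frag_len in fragments:
        total += (gauss_kernel(gc, frag_gc, bw_gc) *
                  gauss_kernel(length, frag_len, bw_len))
    return total / len(fragments)

# (fragGC, fragLength) pairs copied from the head of ampFrags.txt
fragments = [
    (42.4138369579, 7863),
    (38.247566064, 7909),
    (44.5034922563, 6586),
    (43.3027773977, 7309),
]

# HYPOTHETICAL bandwidths, for illustration only
BW_GC, BW_LEN = 2.0, 500.0

dens_near = kde_2d(42.0, 7500, fragments, BW_GC, BW_LEN)
dens_far = kde_2d(70.0, 2000, fragments, BW_GC, BW_LEN)
print('density near the data: {:.3e}; far from it: {:.3e}'.format(dens_near, dens_far))
```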

Adding diffusion

  • Simulating the BD distribution of fragments as Gaussian distributions.
    • One Gaussian distribution per homogeneous set of DNA molecules (same G+C and length)
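
As a rough illustration of the idea: a fragment's G+C content maps to a single BD value, and diffusion spreads that value into a Gaussian. The sketch below assumes the widely used linear G+C to BD relationship for unlabeled DNA in CsCl (BD = 0.098 * G+C fraction + 1.66); whether SIPSim uses exactly these constants is not stated in this tutorial. The standard deviation is a hypothetical constant; the bullets above imply that the real spread depends on fragment G+C and length.

```python
import random

def gc_to_bd(gc_percent):
    """Buoyant density (g/ml) of unlabeled DNA from its G+C content (%)."""
    return 0.098 * (gc_percent / 100.0) + 1.66

# fragGC of the first fragment listed in ampFrags.txt
bd = gc_to_bd(42.4138369579)

# HYPOTHETICAL standard deviation standing in for the diffusion-driven spread
SIGMA = 0.002
rng = random.Random(1)
bd_with_diffusion = [rng.gauss(bd, SIGMA) for _ in range(5)]

print('BD without diffusion: {:.4f}'.format(bd))
print('BD draws with diffusion: {}'.format(
    ', '.join('{:.4f}'.format(x) for x in bd_with_diffusion)))
```

The value without diffusion lands close to the BD values reported for Clostridium_ljungdahlii in the KDE_sample output later in this section.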

In [17]:
!SIPSim diffusion \
    ampFrags_kde.pkl \
    --np $nprocs \
    > ampFrags_kde_dif.pkl


Index size: 90508
Processing: Streptomyces_pratensis_ATCC_33331
Processing: Escherichia_coli_1303
Processing: Clostridium_ljungdahlii_DSM_13528

Plotting fragment distribution w/ and w/out diffusion

Making a table of fragment values from KDEs


In [20]:
n = 100000
!SIPSim KDE_sample -n $n ampFrags_kde.pkl > ampFrags_kde.txt
!SIPSim KDE_sample -n $n ampFrags_kde_dif.pkl > ampFrags_kde_dif.txt

!ls -thlc ampFrags_kde*.txt


-rw-rw-r-- 1 nick nick 4.2M Jun 28 14:10 ampFrags_kde_dif.txt
-rw-rw-r-- 1 nick nick 4.2M Jun 28 14:10 ampFrags_kde.txt

Plotting


In [32]:
%%R
df1 = read.delim('ampFrags_kde.txt', sep='\t')
df2 = read.delim('ampFrags_kde_dif.txt', sep='\t')

df1$data = 'no diffusion'
df2$data = 'diffusion'
df = rbind(df1, df2) %>%
    gather(Taxon, BD, Clostridium_ljungdahlii_DSM_13528, 
           Escherichia_coli_1303, Streptomyces_pratensis_ATCC_33331) %>%
    mutate(Taxon = gsub('_(ATCC|DSM)', '\n\\1', Taxon))

df %>% head(n=3)


  libID         data                              Taxon       BD
1     1 no diffusion Clostridium_ljungdahlii\nDSM_13528 1.701521
2     1 no diffusion Clostridium_ljungdahlii\nDSM_13528 1.701100
3     1 no diffusion Clostridium_ljungdahlii\nDSM_13528 1.702217

In [33]:
%%R -w 800 -h 300

ggplot(df, aes(BD, fill=data)) +
    geom_density(alpha=0.25) +
    facet_wrap( ~ Taxon) +    
    scale_fill_discrete('') +
    theme_bw() +
    theme(
        text=element_text(size=16),
        axis.title.y = element_text(vjust=1),
        axis.text.x = element_text(angle=50, hjust=1)
        )


Adding diffusive boundary layer (DBL) effects


In [34]:
!SIPSim DBL \
    ampFrags_kde_dif.pkl \
    --np $nprocs \
    > ampFrags_kde_dif_DBL.pkl


DBL_index file written: "DBL_index.txt"
Processing: Clostridium_ljungdahlii_DSM_13528
Processing: Streptomyces_pratensis_ATCC_33331
Processing: Escherichia_coli_1303

In [35]:
# listing the KDE object files produced so far
!ls -thlc *pkl


-rw-rw-r-- 1 nick nick  12M Jun 28 14:16 ampFrags_kde_dif_DBL.pkl
-rw-rw-r-- 1 nick nick  12M Jun 28 14:08 ampFrags_kde_dif.pkl
-rw-rw-r-- 1 nick nick 471K Jun 28 14:04 ampFrags_kde.pkl

Adding isotope incorporation

  • Using the config file produced in the Experimental Design section

In [61]:
!SIPSim isotope_incorp \
    --comm comm.txt \
    --np $nprocs \
    ampFrags_kde_dif_DBL.pkl \
    incorp.config \
    > ampFrags_KDE_dif_DBL_inc.pkl


Loading KDE object...
Processing library: 1
Processing: Clostridium_ljungdahlii_DSM_13528
Processing: Escherichia_coli_1303
Processing: Streptomyces_pratensis_ATCC_33331
Processing library: 2
WARNING: config library 2 not found in KDEs.Using a different KDE object
Processing: Clostridium_ljungdahlii_DSM_13528
Processing: Escherichia_coli_1303
Processing: Streptomyces_pratensis_ATCC_33331
File written: BD-shift_stats.txt

In [62]:
!ls -thlc *.pkl


-rw-rw-r-- 1 nick nick  23M Jun 28 14:32 ampFrags_KDE_dif_DBL_inc.pkl
-rw-rw-r-- 1 nick nick  12M Jun 28 14:16 ampFrags_kde_dif_DBL.pkl
-rw-rw-r-- 1 nick nick  12M Jun 28 14:08 ampFrags_kde_dif.pkl
-rw-rw-r-- 1 nick nick 471K Jun 28 14:04 ampFrags_kde.pkl

Note: statistics on how much isotope was incorporated by each taxon are listed in "BD-shift_stats.txt"


In [63]:
%%R
df = read.delim('BD-shift_stats.txt', sep='\t')
df


  library                             taxon          min          q25
1       1 Clostridium_ljungdahlii_DSM_13528 2.220446e-15 5.995204e-15
2       1             Escherichia_coli_1303 1.376677e-14 1.554312e-14
3       1 Streptomyces_pratensis_ATCC_33331 3.108624e-15 4.884981e-15
4       2 Clostridium_ljungdahlii_DSM_13528 0.000000e+00 0.000000e+00
5       2             Escherichia_coli_1303 1.800000e-02 1.800000e-02
6       2 Streptomyces_pratensis_ATCC_33331 0.000000e+00 0.000000e+00
          mean       median          q75          max
1 9.724844e-15 9.769963e-15 1.354472e-14 1.731948e-14
2 1.733329e-14 1.731948e-14 1.909584e-14 2.109424e-14
3 6.595556e-15 6.661338e-15 8.437695e-15 9.992007e-15
4 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00
5 1.800000e-02 1.800000e-02 1.800000e-02 1.800000e-02
6 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00
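
Reading the table: the values for library 1 (the control) are on the order of 1e-14, i.e. effectively zero, while Escherichia_coli_1303 in library 2 shifted by 0.018 g/ml. That is what one would expect for 50% incorporation if the isotope is 13C and the BD shift scales linearly up to about 0.036 g/ml at 100% incorporation. Both of those are assumptions in the sketch below, not statements about SIPSim's internals.

```python
# ASSUMPTION: the isotope is 13C, for which fully labeled DNA is commonly
# taken to be ~0.036 g/ml denser than unlabeled DNA, with linear scaling.
MAX_BD_SHIFT_13C = 0.036

def bd_shift(perc_incorp, max_shift=MAX_BD_SHIFT_13C):
    """Expected BD shift (g/ml) for a given percent isotope incorporation."""
    return max_shift * perc_incorp / 100.0

shift_50 = bd_shift(50)   # the incorporator in the labeled treatment
shift_0 = bd_shift(0)     # control gradient and non-incorporators

print('shift at 50% incorporation: {:.3f} g/ml'.format(shift_50))
print('shift at 0% incorporation: {:.3f} g/ml'.format(shift_0))
```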

Making an OTU table

  • Number of amplicon-fragments in each fraction in each gradient
  • Assuming a total pre-fractionation community size of 1e7

In [64]:
!SIPSim OTU_table \
    --abs 1e7 \
    --np $nprocs \
    ampFrags_KDE_dif_DBL_inc.pkl \
    comm.txt \
    fracs.txt \
    > OTU.txt


Loading files...
Simulating OTUs...
Processing library: "1"
  Processing taxon: "Clostridium_ljungdahlii_DSM_13528"
   taxon abs-abundance:  6471565
  Processing taxon: "Streptomyces_pratensis_ATCC_33331"
   taxon abs-abundance:  3131740
  Processing taxon: "Escherichia_coli_1303"
   taxon abs-abundance:  396695
Processing library: "2"
  Processing taxon: "Clostridium_ljungdahlii_DSM_13528"
   taxon abs-abundance:  6771870
  Processing taxon: "Streptomyces_pratensis_ATCC_33331"
   taxon abs-abundance:  2682856
  Processing taxon: "Escherichia_coli_1303"
   taxon abs-abundance:  545274
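
The "taxon abs-abundance" values in the log above look like the total community size (1e7) multiplied by each taxon's "rel_abund_perc" from comm.txt, rounded to an integer. A small sketch of that arithmetic for library 1 (the rounding rule is an assumption based on the printed numbers):

```python
TOTAL_ABUNDANCE = 1e7   # value passed to --abs

# rel_abund_perc values for library 1, copied from comm.txt
rel_abund_perc = {
    'Clostridium_ljungdahlii_DSM_13528': 64.715653095,
    'Streptomyces_pratensis_ATCC_33331': 31.317397427,
    'Escherichia_coli_1303': 3.966949478,
}

# ASSUMPTION: absolute abundance = total * percent / 100, rounded to an integer
abs_abund = {}
for taxon, perc in rel_abund_perc.items():
    abs_abund[taxon] = int(round(TOTAL_ABUNDANCE * perc / 100.0))

for taxon in sorted(abs_abund):
    print('{}: {}'.format(taxon, abs_abund[taxon]))
```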

In [65]:
!head -n 7 OTU.txt


library	taxon	fraction	BD_min	BD_mid	BD_max	count	rel_abund
1	Clostridium_ljungdahlii_DSM_13528	-inf-1.673	-inf	1.672	1.672	1539	0.782409761057
1	Clostridium_ljungdahlii_DSM_13528	1.673-1.676	1.673	1.675	1.676	122	0.462121212121
1	Clostridium_ljungdahlii_DSM_13528	1.676-1.680	1.676	1.678	1.68	230	0.756578947368
1	Clostridium_ljungdahlii_DSM_13528	1.680-1.686	1.68	1.683	1.686	6312	0.976787372331
1	Clostridium_ljungdahlii_DSM_13528	1.686-1.690	1.686	1.688	1.69	140925	0.999347596389
1	Clostridium_ljungdahlii_DSM_13528	1.690-1.694	1.69	1.692	1.694	776468	0.99981329182

Plotting fragment count distributions


In [77]:
%%R -h 350 -w 750

df = read.delim('OTU.txt', sep='\t')


p = ggplot(df, aes(BD_mid, count, fill=taxon)) +
    geom_area(stat='identity', position='dodge', alpha=0.5) +
    scale_x_continuous(expand=c(0,0)) +
    labs(x='Buoyant density') +
    labs(y='Amplicon fragment counts') +
    facet_grid(library ~ .) +
    theme_bw() +
    theme( 
        text = element_text(size=16),
        axis.title.y = element_text(vjust=1),        
        axis.title.x = element_blank()
    )
p


Notes:

  • This plot represents the theoretical number of amplicon-fragments at each BD across each gradient.
    • Derived from subsampling the fragment BD probability distributions generated in earlier steps.
  • The fragment BD distribution of one of the 3 taxa should have shifted in Gradient 2 (the treatment gradient).
  • The fragment BD distributions of the other 2 taxa should be approx. the same between the two gradients.

Viewing fragment counts as relative quantities


In [79]:
%%R -h 350 -w 750

p = ggplot(df, aes(BD_mid, count, fill=taxon)) +
    geom_area(stat='identity', position='fill') +
    scale_x_continuous(expand=c(0,0)) +
    scale_y_continuous(expand=c(0,0)) +
    labs(x='Buoyant density') +
    labs(y='Relative amplicon fragment counts') +
    facet_grid(library ~ .) +
    theme_bw() +
    theme( 
        text = element_text(size=16),
        axis.title.y = element_text(vjust=1),        
        axis.title.x = element_blank()
    )
p


Adding effects of PCR

  • This will alter the fragment counts based on the PCR kinetic model of:

    Suzuki MT, Giovannoni SJ. (1996). Bias caused by template annealing in the amplification of mixtures of 16S rRNA genes by PCR. Appl Environ Microbiol 62:625-630.


In [67]:
!SIPSim OTU_PCR OTU.txt > OTU_PCR.txt

In [68]:
!head -n 5 OTU_PCR.txt


library	taxon	fraction	BD_min	BD_mid	BD_max	count	rel_abund
1	Clostridium_ljungdahlii_DSM_13528	-inf-1.673	-inf	1.672	1.672	9302618	0.465130921533
1	Clostridium_ljungdahlii_DSM_13528	1.673-1.676	1.673	1.675	1.676	8092843	0.404642150502
1	Clostridium_ljungdahlii_DSM_13528	1.676-1.680	1.676	1.678	1.68	9247046	0.462352286332
1	Clostridium_ljungdahlii_DSM_13528	1.680-1.686	1.68	1.683	1.686	12067018	0.603350916911

Notes

  • The table is in the same format as the original OTU table, but the counts and relative abundances are altered by the simulated PCR.

Simulating sequencing

  • Sampling from the OTU table
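
Conceptually, this step draws a limited number of "reads" from each sample, with each taxon's chance of being drawn proportional to its count. The sketch below shows that idea with a fixed sequencing depth and made-up counts; both are hypothetical, and the per-sample totals in the real output differ from one another, so the actual command evidently does not use a single fixed depth.

```python
import bisect
import random

def subsample_counts(counts, depth, rng):
    """Draw `depth` reads with probability proportional to each taxon's count."""
    taxa = sorted(counts)
    cumulative = []
    running = 0
    for taxon in taxa:
        running += counts[taxon]
        cumulative.append(running)
    sampled = dict((taxon, 0) for taxon in taxa)
    for _ in range(depth):
        # Pick the taxon whose cumulative-count interval contains the draw
        draw = rng.random() * running
        idx = min(bisect.bisect_right(cumulative, draw), len(taxa) - 1)
        sampled[taxa[idx]] += 1
    return sampled

# HYPOTHETICAL post-PCR counts for one gradient fraction
pcr_counts = {'taxon_A': 9000000, 'taxon_B': 6000000, 'taxon_C': 5000000}

reads = subsample_counts(pcr_counts, depth=20000, rng=random.Random(7))
print(reads)
```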

In [69]:
!SIPSim OTU_subsample OTU_PCR.txt > OTU_PCR_sub.txt

In [70]:
!head -n 5 OTU_PCR_sub.txt


library	fraction	taxon	BD_min	BD_mid	BD_max	count	rel_abund
1	-inf-1.673	Clostridium_ljungdahlii_DSM_13528	-inf	1.672	1.672	13753	0.465840192392
1	1.673-1.676	Clostridium_ljungdahlii_DSM_13528	1.673	1.675	1.676	9977	0.405289027908
1	1.676-1.680	Clostridium_ljungdahlii_DSM_13528	1.676	1.678	1.68	9665	0.463326941515
1	1.680-1.686	Clostridium_ljungdahlii_DSM_13528	1.68	1.683	1.686	13122	0.599862857143

Notes

  • The table has the same columns as the original OTU table (although "fraction" now precedes "taxon"), but the counts and relative abundances now reflect the simulated sequencing.
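
The "rel_abund" column is each taxon's count divided by the total count in that sample (gradient fraction). A small check for the first two samples of library 1, using counts copied from the wide-format table shown in the Misc section of this tutorial:

```python
# Post-subsampling counts per sample, copied from the wide-format OTU table
sample_counts = {
    '1__-inf-1.673': {
        'Clostridium_ljungdahlii_DSM_13528': 13753,
        'Escherichia_coli_1303': 6298,
        'Streptomyces_pratensis_ATCC_33331': 9472,
    },
    '1__1.673-1.676': {
        'Clostridium_ljungdahlii_DSM_13528': 9977,
        'Escherichia_coli_1303': 4370,
        'Streptomyces_pratensis_ATCC_33331': 10270,
    },
}

# Relative abundance = taxon count / total count in the sample
rel_abund = {}
for sample, counts in sample_counts.items():
    total = float(sum(counts.values()))
    rel_abund[sample] = dict((taxon, n / total) for taxon, n in counts.items())

for sample in sorted(rel_abund):
    value = rel_abund[sample]['Clostridium_ljungdahlii_DSM_13528']
    print('{}: Clostridium rel_abund = {:.6f}'.format(sample, value))
```

These values agree with the "rel_abund" column printed for Clostridium_ljungdahlii above.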

Plotting


In [80]:
%%R -h 350 -w 750

df = read.delim('OTU_PCR_sub.txt', sep='\t')


p = ggplot(df, aes(BD_mid, rel_abund, fill=taxon)) +
    geom_area(stat='identity', position='fill') +
    scale_x_continuous(expand=c(0,0)) +
    scale_y_continuous(expand=c(0,0)) +
    labs(x='Buoyant density') +
    labs(y='Taxon relative abundances') +
    facet_grid(library ~ .) +
    theme_bw() +
    theme( 
        text = element_text(size=16),
        axis.title.y = element_text(vjust=1),        
        axis.title.x = element_blank()
    )
p


Notes

  • The BD shift of the 1 incorporating taxon in the treatment gradient is harder to see after simulating PCR & sequencing than in the original OTU table of fragment counts, which shows all fragments in the gradient.

Misc

A 'wide' OTU table

  • If you want to reformat the OTU table to a more standard 'wide' format (as used in Mothur or QIIME):

In [81]:
!SIPSim OTU_wideLong -w \
    OTU_PCR_sub.txt \
    > OTU_PCR_sub_wide.txt

In [82]:
!head -n 4 OTU_PCR_sub_wide.txt


taxon	1__-inf-1.673	1__1.673-1.676	1__1.676-1.680	1__1.680-1.686	1__1.686-1.690	1__1.690-1.694	1__1.694-1.699	1__1.699-1.703	1__1.703-1.707	1__1.707-1.712	1__1.712-1.718	1__1.718-1.721	1__1.721-1.729	1__1.729-1.733	1__1.733-1.735	1__1.735-1.738	1__1.738-1.742	1__1.742-1.747	1__1.747-1.750	1__1.750-1.753	1__1.753-1.754	1__1.754-1.757	1__1.757-1.759	1__1.759-1.765	1__1.765-1.771	1__1.771-1.774	1__1.774-inf	2__-inf-1.673	2__1.673-1.674	2__1.674-1.679	2__1.679-1.682	2__1.682-1.684	2__1.684-1.689	2__1.689-1.692	2__1.692-1.698	2__1.698-1.701	2__1.701-1.708	2__1.708-1.709	2__1.709-1.714	2__1.714-1.716	2__1.716-1.721	2__1.721-1.724	2__1.724-1.726	2__1.726-1.731	2__1.731-1.735	2__1.735-1.738	2__1.738-1.744	2__1.744-1.747	2__1.747-1.753	2__1.753-1.758	2__1.758-1.762	2__1.762-1.765	2__1.765-1.769	2__1.769-1.774	2__1.774-inf
Clostridium_ljungdahlii_DSM_13528	13753	9977	9665	13122	15752	10233	22097	15205	15622	7054	5983	4180	2969	3343	4575	8192	8202	13756	9469	7289	8301	9191	14931	7656	8457	8168	5960	12759	12175	10827	9051	12294	15073	14461	18100	19499	22285	12995	9394	6660	4392	1726	1138	1181	3900	3685	8124	6460	9609	7936	8553	7857	6847	8901	7410
Escherichia_coli_1303	6298	4370	4206	3202	575	155	1193	3097	7571	6798	8043	4263	1360	1428	1831	4073	3975	6730	3895	2654	4990	4439	4304	4193	4339	3323	4114	4502	5490	5947	4742	3442	1664	473	496	142	366	534	386	1512	3777	5699	8220	9846	17498	8632	8577	5185	5109	3946	5179	4851	3510	4906	5875
Streptomyces_pratensis_ATCC_33331	9472	10270	6989	5551	2772	1051	1315	1266	2657	4200	12723	18412	15683	10655	8450	10089	6385	10796	7833	5199	6155	7437	9537	7415	7284	6270	7396	8895	8755	7822	6400	6317	3405	982	1118	1321	4761	6006	8920	13573	15809	11761	11230	9317	11433	4136	5687	5314	7120	6540	6613	6953	5355	6828	8140
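
For reference, the long-to-wide conversion amounts to a pivot: one row per taxon, one column per sample, where the sample name is "library__fraction". A minimal stand-alone sketch of that reshaping (not the SIPSim implementation), using a few records from the tables above:

```python
# A few long-format records: (library, fraction, taxon, count)
long_rows = [
    ('1', '-inf-1.673', 'Clostridium_ljungdahlii_DSM_13528', 13753),
    ('1', '1.673-1.676', 'Clostridium_ljungdahlii_DSM_13528', 9977),
    ('1', '-inf-1.673', 'Escherichia_coli_1303', 6298),
    ('1', '1.673-1.676', 'Escherichia_coli_1303', 4370),
]

wide = {}       # taxon -> {sample name -> count}
samples = []    # column order, by first appearance
for library, fraction, taxon, count in long_rows:
    sample = '{}__{}'.format(library, fraction)   # e.g. "1__-inf-1.673"
    if sample not in samples:
        samples.append(sample)
    wide.setdefault(taxon, {})[sample] = count

# Print as a tab-delimited wide table
print('\t'.join(['taxon'] + samples))
for taxon in sorted(wide):
    print('\t'.join([taxon] + [str(wide[taxon].get(s, 0)) for s in samples]))
```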

SIP metadata

  • If you want to make a table of SIP sample metadata:

In [83]:
!SIPSim OTU_sampleData \
    OTU_PCR_sub.txt \
    > OTU_PCR_sub_meta.txt

In [86]:
!head OTU_PCR_sub_meta.txt


sample	library	fraction	BD_min	BD_max	BD_mid
1__-inf-1.673	1	-inf-1.673	-inf	1.673	-inf
1__1.673-1.676	1	1.673-1.676	1.673	1.676	1.6745
1__1.676-1.680	1	1.676-1.680	1.676	1.680	1.678
1__1.680-1.686	1	1.680-1.686	1.680	1.686	1.683
1__1.686-1.690	1	1.686-1.690	1.686	1.690	1.688
1__1.690-1.694	1	1.690-1.694	1.690	1.694	1.692
1__1.694-1.699	1	1.694-1.699	1.694	1.699	1.6965
1__1.699-1.703	1	1.699-1.703	1.699	1.703	1.701
1__1.703-1.707	1	1.703-1.707	1.703	1.707	1.705
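
The metadata columns follow directly from the fraction boundaries: the sample name is "library__fraction" and BD_mid is the midpoint of BD_min and BD_max (which is why the open-ended first fraction has a BD_mid of -inf). A small sketch of how such records could be built; the function name is hypothetical:

```python
def sample_metadata(library, bd_min, bd_max):
    """Build one metadata record for a gradient fraction."""
    fraction = '{:.3f}-{:.3f}'.format(bd_min, bd_max)
    return {
        'sample': '{}__{}'.format(library, fraction),
        'library': library,
        'fraction': fraction,
        'BD_min': bd_min,
        'BD_max': bd_max,
        'BD_mid': (bd_min + bd_max) / 2.0,
    }

# The first three fractions of library 1
meta_rows = [
    sample_metadata(1, float('-inf'), 1.673),
    sample_metadata(1, 1.673, 1.676),
    sample_metadata(1, 1.676, 1.680),
]

for row in meta_rows:
    print('{}\t{}'.format(row['sample'], row['BD_mid']))
```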

Other SIPSim commands

SIPSim -l will list all available SIPSim commands


In [87]:
!SIPSim -l


BD_shift: Determine the shift in BD based on KDE overlap
communities: simulate communities in the samples used for SIP
DBL: include diffusion boundary layer contamination into G+C variance
deltaBD: simulate quantitative SIP data
diffusion: incorporate gradient diffusion error in fragment buoyant density estimates
fragment_KDE: make a 2d kernel density estimate of fragment
fragment_KDE_cat: concatenating 2 fragment_kde objects
fragment_parse: parsing out fragment objects for certain genomes
fragments: simulate genome fragments that would be found in a isopycnic gradient
genome_index: index genomes for in-silico PCR; required for
genome_rename: formatting genome sequences in a multi-fasta file for SIPSim
gradient_fractions: Simulate the fraction produced during gradient fractionation
HRSIP: conduct high-resolution SIP method
incorpConfigExample: create example isotope incorporation config file
isotope_incorp.py
KDE_bandwidth: get the bandwidth of each KDE
KDE_info: get info on KDE object files
KDE_parse: parse out KDEs for certain taxa
KDE_plot: make plots of each KDE (1D or 2D)
KDE_sample: sample from each KDE and write a table of values
KDE_selectTaxa: Select a set of taxa to be incorporators
OTU_add_error: adding error to abundance values
OTU_PCR: simulate PCR of gradient fraction DNA samples
OTU_sampleData: make a 'sample_data' table (phyloseq) from the OTU table
OTU_subsample: simulate sequencing by subsampling from an OTU table
OTU_sum: Sum OTU counts (by group)
OTU_table: simulate OTUs for gradient fractions
OTU_wideLong: convert OTU table from wide to long or vice versa
qSIP: simulate quantitative SIP data
qSIP_atomExcess: calculate isotope enrichment from qSIP data
tree_sim: Simulate a phylogeny for a specified set of taxa

Other SIPSim-associated commands

SIPSimR -l will list all available SIPSim R commands for data analyses


In [90]:
!SIPSimR -l


BD_span_calc.r [options] <data> <data_preFrac>
comm_add_richness.r [options] <comm> <richness>
comm_add_target.r [options] <comm> <target>
comm_beta_div.r [options] <comm>
comm_set_abund.r [options] <comm>
comm_shuffle_taxa.r [options] <comm>
correlogram_make.r [options] <data>
DESeq2_rare-dominant.r [options] <DESeq2> <comm> <BD_shift>
DESeq2_combine.r [options] <DESeq2>...
DESeq2_confuseMtx.r [options] <BD_shift> <DESeq2>
DESeq2_listTaxa.r [options] <DESeq2>
DESeq2_rare-dominant.r [options] <DESeq2> <comm>
heavy_confuseMtx.r [options] <BD_shift> <OTU_table>
OTU_taxonAbund.r [options] <OTU>
phyloseq2comm.r [options] <phyloseq>
phyloseq_DESeq2.r [options] <phyloseq>
phyloseq_edit.r [options] <phyloseq>
phyloseq_make.r [options] <OTU>
phyloseq_ordination.r [options] <phyloseq> <outFile>
qSIP_confuseMtx.r [options] <BD_shift> <qSIP_atomExcess>
rTraitContW.r
shannon_calc.r [options] <data>
