Description:
General workflow outline for the paper:
Dataset generation
- Retrieve complete bacterial & archaeal genomes from NCBI
- Select 1 representative from each species
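
A minimal sketch of the representative-selection step. The outline does not say how the representative is chosen, so this assumes a seeded random pick per species. The function name, the (accession, species) input format, and the seed are hypothetical.

```python
import random
from collections import defaultdict

def pick_representatives(genomes, seed=0):
    """Return {species: accession}, one genome per species.

    `genomes` is an iterable of (accession, species) pairs.
    Assumption: the representative is picked at random with a fixed seed.
    """
    by_species = defaultdict(list)
    for accession, species in genomes:
        by_species[species].append(accession)
    rng = random.Random(seed)
    # Sort species and accessions so the result is reproducible for a given seed.
    return {sp: rng.choice(sorted(accs)) for sp, accs in sorted(by_species.items())}

# Placeholder identifiers, not real NCBI accessions.
genomes = [
    ("genome_A", "Escherichia coli"),
    ("genome_B", "Escherichia coli"),
    ("genome_C", "Bacillus subtilis"),
]
reps = pick_representatives(genomes)
```

Any other rule (for example, preferring a reference strain) could replace the random choice without changing the rest of the workflow.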
GC sliding window analysis
- Identify ssu, lsu, and tsu rRNA genes in each genome
- For each genome: sliding window analysis of GC around each 16S rRNA gene
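
The sliding window step can be sketched roughly as follows. The window size, step size, and flank length are illustrative assumptions, not values taken from the paper. The genome is treated as a linear string for simplicity.

```python
def gc_fraction(seq):
    """Fraction of G and C bases in a sequence (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)

def sliding_gc(seq, window, step):
    """Return (window_start, gc_fraction) for each full window along seq."""
    return [
        (start, gc_fraction(seq[start:start + window]))
        for start in range(0, len(seq) - window + 1, step)
    ]

def gc_around_gene(genome, gene_start, gene_end, flank, window, step):
    """GC profile of a gene plus `flank` bases on each side (hypothetical helper)."""
    region_start = max(0, gene_start - flank)
    region = genome[region_start:min(len(genome), gene_end + flank)]
    # Report window starts in genome coordinates rather than region coordinates.
    return [(region_start + pos, gc) for pos, gc in sliding_gc(region, window, step)]

profile = sliding_gc("GGCCAATT", window=4, step=2)
```

Here `profile` holds one GC value per window, which can then be plotted or compared across the 16S rRNA gene copies in a genome.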
Fragment simulation
- For each genome: simulate 1000 amplicons & 10000 shotgun reads
- For each amplicon/read: simulate a template fragment
- For each template fragment: calculate GC content and buoyant density (BD)
- For all fragments from each genome: bin fragments by BD and assess the distributions
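
A minimal sketch of the fragment steps above. It assumes the widely used linear relation for CsCl gradients, BD = 0.098 * GC + 1.66 (GC as a fraction). The outline does not state which formula the paper uses. The fragment length, bin width, linear-genome treatment, and all function names are hypothetical.

```python
import math
import random
from collections import Counter

def gc_fraction(seq):
    """Fraction of G and C bases in a sequence (0.0 for an empty sequence)."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq) if seq else 0.0

def simulate_fragment(genome, read_start, read_len, frag_len, rng):
    """Pick a random template fragment that fully contains the read.

    Assumption: the genome is linear and the read lies entirely within it.
    """
    frag_len = max(read_len, min(frag_len, len(genome)))
    lo = max(0, read_start + read_len - frag_len)   # leftmost valid fragment start
    hi = min(read_start, len(genome) - frag_len)    # rightmost valid fragment start
    start = rng.randint(lo, hi)
    return genome[start:start + frag_len]

def buoyant_density(gc):
    """Buoyant density in g/ml from GC fraction (assumed linear relation)."""
    return 0.098 * gc + 1.66

def bin_by_bd(bd_values, bin_width=0.005, origin=1.66):
    """Count fragments per BD bin; bin i covers [origin + i*w, origin + (i+1)*w)."""
    return Counter(math.floor((bd - origin) / bin_width) for bd in bd_values)

rng = random.Random(0)
genome = "AT" * 50 + "GC" * 50          # toy 200 bp genome
frag = simulate_fragment(genome, read_start=90, read_len=20, frag_len=60, rng=rng)
bd = buoyant_density(gc_fraction(frag))
counts = bin_by_bd(buoyant_density(g) for g in (0.0, 0.5, 0.5, 1.0))
```

Repeating `simulate_fragment` for every simulated amplicon or shotgun read and passing the resulting BD values to `bin_by_bd` gives a per-genome BD distribution.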