Olga Botvinnik, 4th-year PhD candidate in Bioinformatics in Prof. Gene Yeo's lab at UC San Diego. I study alternative splicing in single cells.
By the end of this course, you will be able to ...
Understanding is first, tools are secondary.
We will focus on learning what different algorithms are and when you would apply them. This is more of a "theoretical bioinformatics" course than a practical one. This is on purpose: programming languages change, but math is forever! Even diamonds aren't truly forever: they have a favorable free energy of spontaneous transition to graphite, albeit with an activation barrier high enough that the transition needs temperatures above 1000 °C.
Many people say they want to learn bioinformatics but never do. The only thing that works is being confronted with an immediate need; then they have to learn it. My goal is to light the fire of that immediate need and to give you a conceptual understanding of which algorithms do what. I will not teach programming itself, but you will know enough to find tutorials and people to learn programming from.
The sequencing depth you need depends on your problem, your library preparation, and so on. If you're using unique molecular identifiers and sequencing only the 3' end, apparently 100,000 reads/cell is enough. If you're studying alternative splicing and are interested in full transcript coverage, then at least 10 million reads/cell are necessary (in my experience).
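To make those per-cell numbers concrete, here is a minimal sketch of the arithmetic for a whole experiment. The two depths are the rules of thumb quoted above; the protocol labels, the `total_reads` helper, and the 96-cell example are hypothetical names and values chosen only for illustration.

```python
# Per-cell depths quoted in the text (rules of thumb, not hard limits).
# The dictionary keys are made-up labels for this example.
READS_PER_CELL = {
    "3prime_umi": 100_000,      # UMIs, 3'-end sequencing only
    "full_length": 10_000_000,  # full transcript coverage for splicing
}


def total_reads(n_cells, protocol):
    """Total reads to budget for n_cells at the rule-of-thumb depth."""
    return n_cells * READS_PER_CELL[protocol]


# Hypothetical experiment: one 96-well plate, one cell per well.
for protocol in READS_PER_CELL:
    print(protocol, total_reads(96, protocol))
```

The point is only the scaling: a 100-fold difference in per-cell depth becomes a 100-fold difference in the total sequencing you have to budget for the same number of cells.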
There are many resources comparing different mapping and gene expression algorithms, so I will not cover them.
There are several different techniques for dealing with barcodes. The exact strategy you use depends a lot on your protocol and library prep methods, so we won't get into it.
Specifically, git. You did an exercise for this and you should use version control every time you write code! Unless you are a perfect programmer and always write your code exactly right the first time. Then you should be inventing programming languages (but still using version control).
Python vs R vs MATLAB vs Fortran vs ...
This course is taught in Python because it's what I know best, but it could be taught in any language.
Single-cell analyses are primarily done to deconstruct a population (e.g. a tissue or 10 cm plate) into its constituent parts (cells!).
Kolodziejczyk et al., Mol Cell (2015)
We'll be covering what's outlined in blue boxes.