QC assessment of NGS data

1. The peak is at 140 bp and the read length is 100 bp, so the forward and reverse reads overlap by 60 bp (2 × 100 - 140 = 60).
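
The overlap is simply the combined length of the two reads minus the fragment length. A minimal sketch of that arithmetic, where the variable names are illustrative and not part of the exercise:

```shell
# Values taken from answer 1; variable names are hypothetical.
read_length=100      # length of each read in bp
fragment_peak=140    # peak of the fragment size distribution in bp

# Two reads sequenced from opposite ends of the fragment overlap by
# whatever their combined length exceeds the fragment length.
overlap=$(( 2 * read_length - fragment_peak ))
echo "overlap: ${overlap} bp"
```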

2. There are 400252 reads in total.

Look inside the file and locate the field "raw total sequences". To extract the information quickly from multiple files, commands similar to the following can be used:

grep ^SN lane*.sorted.bam.bchk | awk -F'\t' '$2=="raw total sequences:"'

3. About 76% of the reads were mapped. Divide "reads mapped" (303036) by "raw total sequences" (400252): 303036 / 400252 ≈ 0.757.
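
The division can be done on the command line as well. This is only a sketch: the two counts are typed in by hand from the answers above, and the variable names are illustrative.

```shell
# Counts copied from the answers above; variable names are hypothetical.
mapped=303036   # "reads mapped"
total=400252    # "raw total sequences"

# The shell only does integer arithmetic, so awk handles the division.
pct_mapped=$(awk -v m="$mapped" -v t="$total" 'BEGIN { printf "%.1f", 100 * m / t }')
echo "mapped: ${pct_mapped}%"
```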

4. 2235 pairs have their two reads mapped to different chromosomes. Look for "pairs on different chromosomes".

5. The mean insert size is 276.4 and the standard deviation is 46.9. Look for "insert size mean" and "insert size standard deviation".

6. 282478 reads were properly paired. Look for "reads properly paired".

7. 23803 reads have zero mapping quality, which is 7.9% of the mapped reads (23803 / 303036). Look for "zero MQ" in the "Reads" section.
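
Which denominator the 7.9% refers to can be checked quickly. The sketch below, with illustrative variable names, computes the fraction against both the mapped reads and the raw total; only the mapped-read denominator reproduces the quoted figure.

```shell
# Counts copied from the answers above; variable names are hypothetical.
zero_mq=23803   # reads with mapping quality 0
mapped=303036   # "reads mapped"
total=400252    # "raw total sequences"

pct_of_mapped=$(awk -v z="$zero_mq" -v d="$mapped" 'BEGIN { printf "%.1f", 100 * z / d }')
pct_of_total=$(awk -v z="$zero_mq" -v d="$total" 'BEGIN { printf "%.1f", 100 * z / d }')
echo "zero MQ: ${pct_of_mapped}% of mapped reads, ${pct_of_total}% of all reads"
```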

8. The forward reads. Look at the "Quality per cycle" graphs.
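
The "look for" steps in the answers above can also be scripted. The following is a sketch only: `sn_value` is a hypothetical helper, and it assumes the tab-separated layout already relied on by the grep/awk command under answer 2 (an `SN` tag, the field name followed by a colon, then the value). The stand-in file is built from numbers quoted in the answers so that the example runs without the real stats files.

```shell
# Hypothetical helper: print the value of one summary (SN) field.
# Usage: sn_value "reads mapped" stats_file
sn_value() {
    awk -F'\t' -v f="$1:" '$1 == "SN" && $2 == f { print $3 }' "$2"
}

# Stand-in for a real stats file, using values quoted in the answers.
sample=$(mktemp)
printf 'SN\traw total sequences:\t400252\n'   >  "$sample"
printf 'SN\treads mapped:\t303036\n'          >> "$sample"
printf 'SN\treads properly paired:\t282478\n' >> "$sample"

total=$(sn_value "raw total sequences" "$sample")
paired=$(sn_value "reads properly paired" "$sample")
echo "total=${total} properly_paired=${paired}"

rm -f "$sample"
```

With real data, the stand-in file would be replaced by one of the lane*.sorted.bam.bchk files.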