Week 6 Homework


Source: Python for Biologists

In this folder you’ll find a text file called data.csv, containing some made-up data for a number of genes. Each line contains the following fields for a single gene in this order: species name, sequence, gene name, expression level. The fields are separated by commas (hence the name of the file – csv stands for Comma Separated Values). Think of it as a representation of a table in a spreadsheet – each line is a row, and each field in a line is a column. All the exercises for this section use the data read from this file.
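
As a minimal sketch of the layout described above, the snippet below splits one line into its four fields. The function name parse_gene_line is hypothetical, and the sample lines are invented for illustration; they are not the contents of data.csv. The sketch assumes that no field contains a comma and that the expression level can be read as a number.

```python
def parse_gene_line(line):
    """Split one comma-separated line into (species, sequence, name, expression)."""
    # Field order follows the description: species, sequence, gene name, expression level.
    species, sequence, name, expression = line.rstrip("\n").split(",")
    # Assumption: the expression level is numeric, so it is converted with float().
    return species, sequence, name, float(expression)


# Invented sample lines, NOT taken from data.csv.
sample_lines = [
    "Homo sapiens,atcgatcgtt,abc1,120\n",
    "Mus musculus,ggccatatta,xyz2,45\n",
]

for line in sample_lines:
    species, sequence, name, expression = parse_gene_line(line)
    print(name, species, len(sequence), expression)
```

With the real file, looping over open("data.csv") in place of sample_lines should yield the lines in the same way.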

Several species

Print out the gene names for all genes belonging to Drosophila melanogaster or Drosophila simulans.


In [ ]:

Length range

Print out the gene names for all genes between 90 and 110 bases long.


In [ ]:

AT content

Print out the gene names for all genes whose AT content is less than 0.5 and whose expression level is greater than 200.
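
The exercise does not define AT content; the sketch below takes it to mean the fraction of bases in a sequence that are A or T. The helper name at_content is hypothetical, and the sequences passed to it are invented examples. It assumes a non-empty sequence and counts both upper- and lower-case letters.

```python
def at_content(sequence):
    """Return the fraction of bases in sequence that are A or T."""
    # Upper-case first so that "a"/"t" and "A"/"T" are counted alike.
    seq = sequence.upper()
    # Assumption: the sequence is not empty (otherwise this divides by zero).
    return (seq.count("A") + seq.count("T")) / len(seq)


print(at_content("ATGC"))   # 2 of 4 bases are A or T
print(at_content("aattg"))  # 4 of 5 bases are A or T
```

A helper like this can be combined with a numeric comparison on the expression level using and, since the exercise asks for both conditions to hold at once.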


In [ ]:

Complex condition

Print out the gene names for all genes whose name begins with “k” or “h” except those belonging to Drosophila melanogaster.


In [ ]:

High, low, medium

For each gene, print out a message giving the gene name and saying whether its AT content is high (greater than 0.65), low (less than 0.45) or medium (between 0.45 and 0.65).
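
The three-way split above maps naturally onto an if / elif / else chain. The sketch below is one possible shape, with a hypothetical function name classify_at and invented input values. It reads the thresholds strictly, so a value of exactly 0.45 or 0.65 falls into the medium band; that boundary handling is an assumption, since the exercise does not say which band the endpoints belong to.

```python
def classify_at(at):
    """Label an AT content value as 'high', 'low' or 'medium'."""
    if at > 0.65:
        return "high"
    elif at < 0.45:
        return "low"
    else:
        # Everything from 0.45 to 0.65, endpoints included (an assumption).
        return "medium"


# Invented values for illustration only.
for value in [0.7, 0.3, 0.5]:
    print(value, "is", classify_at(value))
```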


In [ ]: