1. Simple loop practice
Write code to accomplish each of the following tasks using a for loop or a while loop. Choose whichever type of loop you want for each problem (you can try both if you want extra practice). Note: you may want to refer to the Lesson 3 "extra material" for some hints on how to use range() to make these problems easier.
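As a quick refresher on range(), here is a minimal sketch using made-up numbers rather than the ones in the problems below. It assumes Python 3 and shows the two-argument and three-argument forms, plus the equivalent while loop.

```python
# range(start, stop) counts from start up to, but not including, stop.
for i in range(2, 6):
    print(i)

# A third argument sets the step size.
evens = list(range(0, 11, 2))
print(evens)

# The same count as the first loop, written as a while loop.
n = 2
while n < 6:
    print(n)
    n += 1
```

Because the stop value is excluded, an "inclusive" upper bound needs a stop that is one larger than the last number you want.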
(A) Print the integers between 3 and 35, inclusive.
In [0]:
(B) Print the positive integers less than 100 that are multiples of 7.
In [0]:
(C) Starting with x = 1, double x until it's greater than 1000. Print each value of x as you go along, including the last value that is greater than 1000.
In [0]:
(D) Print each character of the string "supercalifragilisticexpialidocious" on a separate line.
In [0]:
2. File reading practice
For these problems, use the file sequences.txt provided with this document. This file contains several DNA sequences of different lengths. You can assume each sequence is on a separate line.
(A) Using a loop, read in each sequence from the file and print it. Make sure to remove any carriage returns (\r) and newline characters (\n) while reading in the data.
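One possible way to strip line endings is sketched below. To keep the example self-contained it loops over an in-memory stand-in (io.StringIO) instead of sequences.txt, and the two short sequences are invented for illustration. Looping over a real file object opened with open() works the same way.

```python
import io

# Stand-in for an open file; the two sequences are made up for illustration.
fake_file = io.StringIO("ATGC\r\nGGCCA\n")

cleaned = []
for line in fake_file:
    # rstrip("\r\n") removes any trailing \r and \n characters and nothing else.
    seq = line.rstrip("\r\n")
    cleaned.append(seq)
    print(seq)
```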
In [0]:
(B) Now, instead of printing the sequences, output the length of each sequence to the terminal screen. At the end, print the average length of the sequences. (You should get 77.56 as the average.)
Hint: use the concept of an "accumulator" variable to help with computing the average.
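The accumulator idea can be sketched as follows, using a short list of made-up lengths in place of the real data. The variable names are only suggestions, and the division assumes Python 3 (in Python 2, convert one operand with float() first).

```python
lengths = [4, 5, 9]  # made-up values, not the real sequence lengths

total = 0   # accumulator: running sum of the lengths seen so far
count = 0   # accumulator: how many lengths have been seen

for length in lengths:
    total += length
    count += 1

average = total / count
print(average)
```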
In [0]:
3. File writing practice
(A) Write a script that prints "Hello, world" to a file called hello.txt.
In [0]:
(B) Write a script that prints the following pieces of data to a file called meow.txt. Each piece of data must be printed on a separate line.
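A minimal sketch of writing mixed data to a file, one item per line, is shown here. The file name, its location in the system temp directory, and the two values are all hypothetical, chosen so the example does not touch your own files.

```python
import os
import tempfile

# Hypothetical output location in the system temp directory.
path = os.path.join(tempfile.gettempdir(), "example_output.txt")

label = "sample"  # made-up data
count = 3

with open(path, "w") as outfile:
    outfile.write(label + "\n")
    # write() only accepts strings, so numbers are converted with str().
    outfile.write(str(count) + "\n")

# Read the file back to check what was written.
with open(path) as infile:
    contents = infile.read()
print(contents)
```

Note that write() does not add a newline on its own, so each "\n" has to be included explicitly.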
In [0]:
# data to be printed:
name = "Mitsworth"
age = 11
birthday = "9/1/04"
coloring = "Tabby"
livesRemaining = 8
# write your code here:
4. String manipulation 101
These problems follow from problem 2 above. Continue using the file sequences.txt.
(A) Instead of printing lengths as before, print the GC content of each sequence. GC content is the number of G's and C's in a DNA sequence divided by the total sequence length. Make sure not to do integer division! Also calculate the average GC content across all the sequences: compute the GC content of each individual sequence and then take the average of those values, rather than adding up the G's and C's across all sequences and computing a single fraction at the end. You should get ~0.4877 as the average. (5 Points)
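To see why the two ways of averaging differ, here is a small sketch on two invented sequences of unequal length. It assumes Python 3 and uppercase sequences, and str.count() is just one of several ways to tally bases.

```python
seqs = ["GGCC", "ATATATAT"]  # made-up sequences of different lengths

ratios = []
total_gc = 0
total_len = 0
for seq in seqs:
    gc = seq.count("G") + seq.count("C")
    ratios.append(gc / len(seq))  # per-sequence GC content
    total_gc += gc
    total_len += len(seq)

# Average of the per-sequence values (what the problem asks for).
mean_of_ratios = sum(ratios) / len(ratios)

# Pooled value over all bases (what the problem says NOT to do).
pooled = total_gc / total_len

print(mean_of_ratios, pooled)
```

Here the first approach gives 0.5 while the pooled value is one third, because the longer sequence contributes more bases to the pooled count.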
In [0]:
(B) Convert each sequence to its reverse complement. This means changing each nucleotide to its complement (A->T, T->A, G->C, C->G) and reversing the entire sequence. (5 Points)
Hint: we've already touched on everything you need to know to do this. See the practice problems from Lesson 3 for some hints on reversing.
Note: if you print out every nucleotide of your sequence on its own line, the IPython notebook will crash. If you want to print each base while developing your code, use the file "short_sequences.txt", which contains only the first three sequences, so that your notebook doesn't crash.
Note 2: you must print the reverse complement to the screen without using any intermediate files.
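One possible approach, sketched on a single short made-up sequence, is shown below. It uses a dictionary for the complement lookup, which is an assumption on my part; a chain of if/elif statements would work equally well. It also assumes the sequence contains only uppercase A, T, G, and C.

```python
complement = {"A": "T", "T": "A", "G": "C", "C": "G"}

seq = "ATGC"  # made-up example sequence

rev_comp = ""
for base in seq:
    # Prepending each complemented base builds the result in reverse order.
    rev_comp = complement[base] + rev_comp

print(rev_comp)
```

Building the result in a string variable and printing it once also avoids printing every base on its own line.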
In [0]: