Programming Bootcamp 2016

Lesson 3 Exercises


Earning points (optional)

  • Enter your name below.
  • Email your .ipynb file to me (sarahmid@mail.med.upenn.edu) before 9:00 am on 9/16.
  • You do not need to complete all the problems to get points.
  • I will give partial credit for effort when possible.
  • At the end of the course, everyone who gets at least 90% of the total points will get a prize (bootcamp mug!).

Name:


1. Guess the output: loop practice (1pt)

For the following blocks of code, first try to guess what the output will be, and then run the code yourself. Points will be given for filling in the guesses; guessing wrong won't be penalized.


In [ ]:
for i in range(1, 10, 2):
    print i

Your guess:


In [ ]:
for i in range(5, 1, -1):
    print i

Your guess:


In [ ]:
count = 0
while (count < 5):
    print count
    count = count + 1

Your guess:


In [ ]:
total = 0
for i in range(4):
    total = total + i
print total

Your guess:


In [ ]:
name = "Mits"
for i in name:
    print i

Your guess:


In [ ]:
name = "Wilfred"
newName = ""
for letter in name:
    newName = newName + letter
print newName

Your guess:


In [ ]:
name = "Wilfred"
newName = ""
for letter in name:
    newName = letter + newName
print newName

Your guess:


In [ ]:
seq = "AGCTGATGC"
count = 0
for letter in seq:
    count = count + 1
print count

Your guess:


In [ ]:
seq = "AGCTGATGC"
count = 0
for letter in seq:
    if letter == "T":
        count = count + 1
print count

Your guess:


2. Spot the endless loop (1pt)

For the following examples, first guess whether or not the loop will be endless. Then run the code to find out (if it doesn't stop within a few seconds, you can assume it's endless).

NOTE: If you hit an endless loop, you will not be able to run anything else until you stop it! Use the kernel interrupt button (square button up top) to stop the execution of the loop.


In [ ]:
count = 0
while count < 5:
    print count
print "Done"

Endless loop or not? Your guess:


In [ ]:
count = 0
while count > 0:
    print count
print "Done"

Endless loop or not? Your guess:


In [ ]:
count = 0
while count < 10:
    print count
count = count + 1
print "Done"

Endless loop or not? Your guess:


In [ ]:
count = 10
while count > 0:
    count = count - 1
print "Done"

Endless loop or not? Your guess:


In [ ]:
a = True
count = 0
while a:
    count = count + 1
print "Done"

Endless loop or not? Your guess:


In [ ]:
x = 1
while x != 100:
    x = x + 5
print "Done"

Endless loop or not? Your guess:


In [ ]:
x = 1
while x <= 100:
    x = x + 5
print "Done"

Endless loop or not? Your guess:


3. Simple loop practice (4pts)

Write code to accomplish each of the following tasks using a for loop or a while loop. Choose whichever type of loop you want for each problem (you can try both, if you want extra practice).

(A) (1pt) Print the integers between 8 and 33, inclusive.


In [ ]:

(B) (1pt) Starting with x = 1, double x until it's greater than 1000. Print each value of x as you go along.


In [ ]:

(C) (1pt) Print the positive integers less than 500 that are multiples of 13.


In [ ]:

(D) (1pt) Print each character of the string "AGTAATCGCGATGAATACCATCGCAGCC" on a separate line.


In [ ]:


4. File reading and processing (6pts)

For these problems, use the file sequences.txt provided on Piazza. This file contains several DNA sequences of different lengths. You can assume each sequence is on a separate line.

Note: I recommend saving sequences.txt in the same directory as this notebook to make things easier.

(A) (1pt) Using a loop, read in each sequence from the file and print it. Make sure to remove any newline characters (\n) while reading in the data.


In [ ]:

(B) (1pt) Now, instead of printing the sequences themselves, print the length of each sequence. At the end, print the average length of the sequences.

Hint: use the concept of an "accumulator" variable to help with computing the average, and watch out for integer division!

[ Check your answer ] You should get 77.56 as the average.
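
As a rough illustration of the hint, the sketch below averages the numbers 1 through 4 instead of sequence lengths. The variable names `total`, `count`, and `average` are choices made for this example only, and `print(...)` is written with parentheses so the snippet runs under both Python 2 and Python 3.

```python
# Accumulator pattern on stand-in data (the numbers 1, 2, 3, 4),
# not on the contents of sequences.txt.
total = 0    # running sum of the values seen so far
count = 0    # how many values have been added

for n in range(1, 5):
    total = total + n
    count = count + 1

# In Python 2, dividing two integers (10 / 4) drops the remainder and gives 2.
# Converting one operand to a float keeps the fractional part.
average = float(total) / count
print(average)
```

The same two accumulators can be updated inside a loop over the lines of a file, with the division done once after the loop ends.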


In [ ]:

(C) (2pts) Instead of printing lengths, print the GC content of each sequence (GC content is the number of G's and C's in a DNA sequence divided by the total sequence length). At the end print the average GC content.

[ Check your answer ] You should get ~0.48 as the average.
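
To make the definition concrete, here is a minimal sketch that computes GC content for one hard-coded sequence (borrowed from problem 1, not from sequences.txt). The names `gc_count` and `gc_content` are assumptions made for this example.

```python
# GC content of a single example sequence.
seq = "AGCTGATGC"

gc_count = 0
for letter in seq:
    if letter == "G" or letter == "C":
        gc_count = gc_count + 1

# float() avoids integer division in Python 2.
gc_content = float(gc_count) / len(seq)
print(gc_content)
```

This sequence has 5 G's and C's out of 9 nucleotides, so the result is 5/9, or roughly 0.56.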


In [ ]:

(D) (2pts) Convert each sequence to its reverse complement and print it. This means changing each nucleotide to its complement (A->T, T->A, G->C, C->G) and reversing the entire sequence.

Hint: we've already touched on everything you need to know to do this -- see problem 1 above for some clues!

[ Check your answer ] Spot check this by comparing at least one sequence from the file to its reverse complement and make sure it looks correct.
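
One possible way to approach this, shown on a short made-up sequence rather than the file contents: complement each nucleotide with an if/elif chain and prepend it to the result, which reverses the order as in problem 1. The names `comp` and `rev_comp` are illustrative, and passing unexpected characters through unchanged is an assumption about how to handle them.

```python
# Reverse complement of one hypothetical short sequence.
seq = "AAGC"

rev_comp = ""
for nt in seq:
    if nt == "A":
        comp = "T"
    elif nt == "T":
        comp = "A"
    elif nt == "G":
        comp = "C"
    elif nt == "C":
        comp = "G"
    else:
        comp = nt              # leave unexpected characters unchanged
    rev_comp = comp + rev_comp  # prepending reverses the order

print(rev_comp)   # prints GCTT
```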


In [ ]:


5. Guessing game (2pts)

Write code that plays a number guessing game with the user. The code will loop until the user gets the number right or quits. Follow the directions below.

  • First, have your program generate a random integer between 1 and 20 (the "secret number").
  • Then prompt the user "Guess a number between 1 and 20 (enter 0 to quit): "
  • Read in their answer with raw_input() (recall that you'll need to convert the return value to an int) and save it in a variable.
  • Compare their guess to your "secret number":
    • If the guess is correct, print "You got it!" and end the loop.
    • If the guess is higher than the secret number, print "Too high!" and allow the user to keep guessing.
    • If the guess is lower than the secret number, print "Too low!" and allow the user to keep guessing.
    • If they entered 0, print "Ending program" and end the loop.

Tip: It will be easier for you to test your program if you initially set the secret number to be a number of your choosing instead of something random. Then when you run the program, you will know if the responses are correct. Make sure to test all possibilities (higher, lower, equal, zero). Once you're sure the logic is correct, you can change the secret number to random generation.
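
The two building blocks mentioned above, generating the secret number and converting the user's answer, might look like the sketch below. It assumes the standard `random` module is used for the secret number; the literal string `"7"` stands in for what raw_input() would return so that the snippet runs without prompting.

```python
import random

# random.randint(a, b) returns an integer N with a <= N <= b,
# so both 1 and 20 are possible values.
secret = random.randint(1, 20)

# raw_input() returns a string; int() converts it to a number
# so it can be compared with the secret number.
answer = "7"          # stand-in for the user's typed answer
guess = int(answer)

print(guess == secret)
```

The loop and the higher/lower/quit logic are left for you to build around these pieces.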


In [ ]:


6. Family simulation (5pts)

Use a simulation to determine the average number of children in a family if all families had children until they had a girl (and then stopped). Assume equal probability of girls and boys and use a random generator to simulate each birth. Simulate 10,000 families and output the average number of children they had. Your answer should be close to 2.

Note: I'm purposely not giving any step by step instructions here because I'd like you to practice translating a word problem into code. It may help if you try to break down the problem into smaller parts first. For example, see if you can simulate just one family first, then once that's working, add code to make it run through 10,000 families. Alternatively, you might feel more comfortable writing code to print "hi" 10,000 times, and then replace print "hi" with code for simulating each family. Do whatever makes the problem feel more manageable to you. Learning to break down big problems into smaller parts like this is an essential skill for programming, so I encourage you to work through this. If you get it, then I think it's safe to say you've mastered the material in this lesson and are well on your way to becoming a true programmer!

I'll give partial credit for good efforts!
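
As a starting point only, a single birth can be simulated with a coin flip. The sketch below assumes the standard `random` module and arbitrarily treats 1 as a girl and 0 as a boy; the name `child` is made up for this example.

```python
import random

# random.randint(0, 1) returns 0 or 1 with equal probability.
birth = random.randint(0, 1)

if birth == 1:
    child = "girl"
else:
    child = "boy"

print(child)
```

Repeating this until a girl is born, and then repeating that for many families, is the part of the problem left to you.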


In [ ]:


Extra Problems (0pts)

The following problems are for people who would like more practice. They will not be counted for points.

(A) Computing factorials. The factorial of a number n is defined as:

n! = n * (n - 1) * (n - 2) * ... * 1

Prompt the user to enter a positive integer and then output the factorial of that number. Do not use any modules (i.e. don't use the factorial function in the math module)!
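
To illustrate the definition, the sketch below computes 5! with a running product. Here `n` is fixed at 5 rather than read from the user, and the name `result` is a choice made for this example.

```python
# Running product for n! with n fixed at 5 for illustration.
n = 5
result = 1
for i in range(1, n + 1):   # i takes the values 1, 2, ..., n
    result = result * i

print(result)   # 5 * 4 * 3 * 2 * 1 = 120
```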


In [ ]:

(B) Fibonacci sequence. The Fibonacci sequence is a series where each number is the sum of the previous two numbers. The first two numbers are always 0 and 1. So for example, the first 10 numbers of the series are: 0, 1, 1, 2, 3, 5, 8, 13, 21, 34. Prompt the user to enter a positive integer, and then output that number of terms in the Fibonacci sequence.
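
One way the update rule could be sketched, with the number of terms fixed at 10 instead of read from the user. The names `prev`, `curr`, and `next_val` are assumptions for this example, and the `printed` list exists only so the output can be checked afterwards.

```python
# First 10 Fibonacci terms using two running variables.
num_terms = 10
prev = 0         # the term about to be printed
curr = 1         # the term after it
printed = []     # kept only to check the result

for i in range(num_terms):
    print(prev)
    printed.append(prev)
    next_val = prev + curr   # each new term is the sum of the previous two
    prev = curr
    curr = next_val
```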


In [ ]:

(C) Counting mutations. Prompt the user to input two DNA sequences of the same length. Count up the number of nucleotides that differ between the sequences and output this number.

[ Check your answer ] Try the following to check that your program is correct:
Seq1 = AAGTCGTACA
Seq2 = AAGTCGGACG
Num differences: 2

Seq1 = GTGTGATGAGCGCGACA
Seq2 = GAAAGATGAGCGTGTCA
Num differences: 5
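
A minimal sketch of the comparison, using the first pair of check sequences hard-coded in place of user input. It assumes both sequences really are the same length, and the name `num_diff` is illustrative.

```python
# Count positions where two equal-length sequences differ.
seq1 = "AAGTCGTACA"
seq2 = "AAGTCGGACG"

num_diff = 0
for i in range(len(seq1)):
    if seq1[i] != seq2[i]:
        num_diff = num_diff + 1

print(num_diff)   # prints 2
```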


In [ ]:

(D) Exact motif search. Prompt the user to input a "reference" sequence (a longish DNA sequence) and a "query" (a shorter DNA sequence). Print out the locations (nt position) of all exact matches (if any) of the query within the reference.

There are shortcuts for doing this in Python, such as the .find() string method, but see if you can do it without them.

[ Check your answer ]
Reference: ATGCGCTAAAGCGCTAGATCTCTAGCTAAAGCTAGCTTATTCGGATGGGCTAG
Query: AAAGC
Matches at positions 7 and 27 within the reference (counting starting at 0)
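
One possible approach, sketched with the check sequences hard-coded instead of prompted for: slide a window the length of the query along the reference and compare each slice to the query. The `positions` list is an addition for this example so the matches can be checked.

```python
# Exact motif search by comparing every window of the reference to the query.
reference = "ATGCGCTAAAGCGCTAGATCTCTAGCTAAAGCTAGCTTATTCGGATGGGCTAG"
query = "AAAGC"

positions = []   # kept only to check the result
# The last possible start leaves exactly len(query) characters.
for start in range(len(reference) - len(query) + 1):
    if reference[start:start + len(query)] == query:
        print(start)
        positions.append(start)
```

With these inputs the loop prints 7 and 27, matching the check above.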


In [ ]: