Question 1:

Basic operations using lists, tuples, and dictionaries

1.1

a. Find the largest of a list of numbers.
Hint: Set up the list as the first line in your program.


In [9]:
numberList = [15,4,26,1,9,21,3,6,13]
# start from the first element so that lists of negative numbers also work
maxNumber = numberList[0]
for each in numberList:
    if each > maxNumber:
        maxNumber = each
print("The largest number in the list is {0}".format(maxNumber))


The largest number in the list is 26
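
For comparison, a minimal sketch of the same task using Python's built-in max(). The second list, `negatives`, is a made-up example (not part of the exercise) showing a case where a loop seeded with 0 would report the wrong answer.

```python
numberList = [15, 4, 26, 1, 9, 21, 3, 6, 13]
largest = max(numberList)
print("The largest number in the list is {0}".format(largest))

# Hypothetical all-negative list: a loop that starts from 0 would report 0 here
negatives = [-7, -2, -9]
largest_negative = max(negatives)
print(largest_negative)
```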

1.1

b. Find the average of a list of numbers using a for loop


In [13]:
runningTotal = 0
listOfNumbers = [4,7,9,1,8,6]
for each in listOfNumbers:
    # each time round the loop add the next item to the running total
    runningTotal = runningTotal + each
# the average is the runningTotal at the end / how many numbers
average = runningTotal/len(listOfNumbers)
print(listOfNumbers)
print("The average of these numbers is {0:.2f}".format(average))


[4, 7, 9, 1, 8, 6]
The average of these numbers is 5.83
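
Outside the exercise's "use a for loop" constraint, the same average can be sketched with the built-ins sum() and len(). This is only an alternative for checking the result.

```python
listOfNumbers = [4, 7, 9, 1, 8, 6]
# sum() adds up every item; len() gives how many items there are
average = sum(listOfNumbers) / len(listOfNumbers)
print("The average of these numbers is {0:.2f}".format(average))
```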

1.1

c. Write a program that prints a string in reverse, character by character.

Input: Python
Expected output: n o h t y P

Hint: you can use the range and len functions. Each character has a positive and a negative index:
 P   y   t   h   o   n
 0   1   2   3   4   5
-6  -5  -4  -3  -2  -1


In [3]:
word = "Python"
#print(len(word))
for index in range(len(word) - 1, -1, -1): # range(start=5, stop=-1, step=-1)
    print(word[index])


n
o
h
t
y
P
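
Two other ways to walk a string backwards, shown as a sketch: the built-in reversed() and a slice with a step of -1. The names `letter` and `reversed_word` are illustrative only.

```python
word = "Python"

# reversed() yields the characters from last to first
for letter in reversed(word):
    print(letter)

# A slice with step -1 builds the reversed string in one go
reversed_word = word[::-1]
print(" ".join(reversed_word))
```

The join() call puts the characters on one line separated by spaces, which matches the "n o h t y P" layout given in the task.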

1.2

Write a Python program to count the number of even and odd numbers from a series of numbers.
Hint: review of for loop, if statements


In [65]:
numbers = (1, 2, 3, 4, 5, 6, 7, 8, 9) # Declaring the tuple
count_odd = 0
count_even = 0
#type your code here
for x in numbers:
    if x % 2 == 0:
        count_even += 1
    else:
        count_odd += 1
print("Number of even numbers :",count_even)
print("Number of odd numbers :",count_odd)


Number of even numbers : 4
Number of odd numbers : 5
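
As a compact cross-check, the even count can also be sketched with a generator expression inside sum(); the odd count is then whatever remains.

```python
numbers = (1, 2, 3, 4, 5, 6, 7, 8, 9)
# Add 1 for every even number in the tuple
count_even = sum(1 for x in numbers if x % 2 == 0)
count_odd = len(numbers) - count_even
print("Number of even numbers :", count_even)
print("Number of odd numbers :", count_odd)
```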

1.3

Check whether each string in the given list contains the EcoRI site motif. Print each string that doesn't contain the motif, until two strings containing the motif have been found.
motif = "GAATTC" (5' for EcoRI restriction site)

Output:
AGTGAACCGTCAGATCCGCTAGCGCGAATTC doesn't contain the motif
GGAGACCGACACCCTCCTGCTATGGGTGCTGCTGCTC doesn't contain the motif
TGGGTGCCCGGCAGCACCGGCGACGCACCGGTCGC doesn't contain the motif
CACCATGGTGAGCAAGGGCGAGGAGAATAACATGGCC doesn't contain the motif
Two strings in given list contain the motif


In [14]:
motif = "GAATTC"
count = 0
dna_strings = ['AGTGAACCGTCAGATCCGCTAGCGCGAATTC','GGAGACCGACACCCTCCTGCTATGGGTGCTGCTGCTC','TGGGTGCCCGGCAGCACCGGCGACGCACCGGTCGC',
               'CACCATGGTGAGCAAGGGCGAGGAGAATAACATGGCC','ATCATCAAGGAGTTCATGCGCTTCAAGAATTC','CATGGAGGGCTCCGTGAACGGCCACGAGTTCGAGA'
               ,'TCGAGGGCGAGGGCGAGGGCCGCCCCTACGAGGCCTT']
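#type your code
for item in dna_strings:
    # "in" finds the motif anywhere in the string, including at index 0
    if motif in item:
        count += 1
    if count == 2:
        print("Two strings in given list contain the motif")
        break
    else:
        # note: the first string containing the motif also ends up here
        print(item, ": doesn't contain the motif")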


AGTGAACCGTCAGATCCGCTAGCGCGAATTC : doesn't contain the motif
GGAGACCGACACCCTCCTGCTATGGGTGCTGCTGCTC : doesn't contain the motif
TGGGTGCCCGGCAGCACCGGCGACGCACCGGTCGC : doesn't contain the motif
CACCATGGTGAGCAAGGGCGAGGAGAATAACATGGCC : doesn't contain the motif
Two strings in given list contain the motif
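
Note that the first string in the list ends in GAATTC, yet the output above reports it as lacking the motif, because the message is printed whenever the count has not yet reached two. The following is a sketch of one possible variant that only reports strings that really lack the motif. The lists `with_motif` and `without_motif` are hypothetical helpers added for illustration, and the output of this variant differs from the expected output given in the task.

```python
motif = "GAATTC"
dna_strings = ['AGTGAACCGTCAGATCCGCTAGCGCGAATTC',
               'GGAGACCGACACCCTCCTGCTATGGGTGCTGCTGCTC',
               'TGGGTGCCCGGCAGCACCGGCGACGCACCGGTCGC',
               'CACCATGGTGAGCAAGGGCGAGGAGAATAACATGGCC',
               'ATCATCAAGGAGTTCATGCGCTTCAAGAATTC',
               'CATGGAGGGCTCCGTGAACGGCCACGAGTTCGAGA',
               'TCGAGGGCGAGGGCGAGGGCCGCCCCTACGAGGCCTT']

with_motif = []     # strings that contain the motif (illustrative helper)
without_motif = []  # strings that do not (illustrative helper)
count = 0
for item in dna_strings:
    if motif in item:
        count += 1
        with_motif.append(item)
        if count == 2:
            print("Two strings in given list contain the motif")
            break
    else:
        without_motif.append(item)
        print(item, ": doesn't contain the motif")
```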

1.4

Write a Python program that prints all the numbers in range from 0 to 10 except 5 and 10.
Hint: use continue


In [16]:
#type your code here
for value in range(11): # range(11) covers 0 to 10 inclusive
    if value == 5 or value == 10:
        continue
    print(value, end=' ')
print("\n")


0 1 2 3 4 6 7 8 9 

1.5 (Multi-Part)

The next series of tasks is about lists and list manipulation:
a. Create a list of 5 of your favorite things.


In [17]:
my_favorites=['Music', 'Movies', 'Coding', 'Biology', 'Python']

b. Use the print() function to print your list.


In [18]:
print(my_favorites)


['Music', 'Movies', 'Coding', 'Biology', 'Python']

c. Use the print() function to print out the middle element.


In [19]:
print(my_favorites[2])


Coding

d. Now replace the middle element with a different item, your favorite song, or song bird.


In [71]:
my_favorites[2]='European robin'

e. Use the same print statement from b. to print your new list. Check out the differences.


In [72]:
print(my_favorites)


['Music', 'Movies', 'European robin', 'Biology', 'Python']

f. Add a new element to the end. Read about append().


In [74]:
my_favorites.append('Monkeys')

g. Add a new element to the beginning. Read about insert().


In [75]:
my_favorites.insert(0, 'Evolution')

h. Add a new element somewhere other than the beginning or the end.


In [76]:
my_favorites.insert(3, 'Coffee')
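
The cells for f, g, and h change the list without printing it. As a quick check, the three steps can be replayed in one place, starting from the list as it stood after step e:

```python
my_favorites = ['Music', 'Movies', 'European robin', 'Biology', 'Python']
my_favorites.append('Monkeys')        # f. add to the end
my_favorites.insert(0, 'Evolution')   # g. add to the beginning
my_favorites.insert(3, 'Coffee')      # h. add at index 3, somewhere in the middle
print(my_favorites)
```

Because insert() shifts every later element one position to the right, 'Coffee' lands just before 'European robin'.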

1.6

Write a script that splits a string into a list:

  • Save the string 'sapiens, erectus, neanderthalensis' as a variable.
  • Print the string.
  • Split the string into individual words and print the result of the split. (Think about the ', '.)
  • Store the resulting list in a new variable.
  • Print the list.
  • Sort the list alphabetically and print (hint: look up the function sorted()).
  • Sort the list by length of each string and print. (The shortest string should be first.) Check out the documentation of the key argument.

In [20]:
#type your code
hominins='sapiens, erectus, neanderthalensis'
print(hominins)
# split on ', ' (comma plus space) so the words carry no leading space
hominin_individuals=hominins.split(', ')
print(hominin_individuals)
hominin_individuals=sorted(hominin_individuals)
print("List: ", hominin_individuals)
hominin_individuals=sorted(hominin_individuals, key=len)
print(hominin_individuals)


sapiens, erectus, neanderthalensis
['sapiens', 'erectus', 'neanderthalensis']
List:  ['erectus', 'neanderthalensis', 'sapiens']
['erectus', 'sapiens', 'neanderthalensis']

Extra Practice: 1.7

Use list comprehension to generate a list of tuples. The tuples should contain sequences and lengths from the previous problem. Print out the length and the sequence (i.e., "4\tATGC\n").


In [21]:
sequences=['ATGCCCGGCCCGGC','GCGTGCTAGCAATACGATAAACCGG', 'ATATATATCGAT','ATGGGCCC']
seq_lengths=[(seq, len(seq)) for seq in sequences]
print(seq_lengths)


[('ATGCCCGGCCCGGC', 14), ('GCGTGCTAGCAATACGATAAACCGG', 25), ('ATATATATCGAT', 12), ('ATGGGCCC', 8)]
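
The task also asks for the length and the sequence to be printed in the form "4\tATGC\n". A minimal sketch of that step, looping over the list of tuples and unpacking each one:

```python
sequences = ['ATGCCCGGCCCGGC', 'GCGTGCTAGCAATACGATAAACCGG', 'ATATATATCGAT', 'ATGGGCCC']
seq_lengths = [(seq, len(seq)) for seq in sequences]

# Unpack each (sequence, length) tuple and print length, a tab, then the sequence
for seq, length in seq_lengths:
    print("{0}\t{1}".format(length, seq))
```

print() appends the trailing newline itself, so the format string only needs the tab.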

Question 2: Dictionaries and Sets

2.1

Create a dictionary that stores DNA restriction enzyme names and their motifs from:
https://www.neb.com/tools-and-resources/selection-charts/alphabetized-list-of-recognition-specificities
e.g.: EcoRI = GAATTC, AvaII = GGACC, BisI = GGACC


In [22]:
enzymes = {'EcoRI': 'GAATTC', 'AvaII': 'GGACC', 'BisI': 'GCATGCGC', 'SacII': 'CCGCGG', 'BamHI': 'GGATCC'}
print(enzymes)


{'EcoRI': 'GAATTC', 'AvaII': 'GGACC', 'BisI': 'GCATGCGC', 'SacII': 'CCGCGG', 'BamHI': 'GGATCC'}

a. Look up the motif for the SacII enzyme


In [23]:
print(enzymes['SacII'])


CCGCGG

b. Add the two enzymes below and their motifs to the dictionary
KasI: GGCGCC, AscI: GGCGCGCC, EciI: GGCGGA


In [24]:
enzymes['KasI'] = 'GGCGCC'
enzymes['AscI'] = 'GGCGCGCC'
print(enzymes)


{'EcoRI': 'GAATTC', 'AvaII': 'GGACC', 'BisI': 'GCATGCGC', 'SacII': 'CCGCGG', 'BamHI': 'GGATCC', 'KasI': 'GGCGCC', 'AscI': 'GGCGCGCC'}
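
Indexing with square brackets raises a KeyError when the enzyme is not in the dictionary. A short sketch of two related operations, a lookup with a fallback via get() and a loop over all entries with items(), using a subset of the dictionary above. The key 'NotAnEnzyme' is a made-up name used only to show the fallback.

```python
enzymes = {'EcoRI': 'GAATTC', 'SacII': 'CCGCGG', 'BamHI': 'GGATCC'}

# get() returns the stored motif, or the given default if the key is missing
sacii_site = enzymes.get('SacII')
missing_site = enzymes.get('NotAnEnzyme', 'unknown')
print(sacii_site, missing_site)

# items() yields (name, motif) pairs
for name, site in enzymes.items():
    print(name, site)
```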

2.2

Suppose dna is a string variable that contains only 'A', 'C', 'G' or 'T' characters. Write code to find and print the most frequent character in dna and its frequency (max_freq).


In [26]:
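dna = 'AAATTCGTGACTGTAA'
dna_counts = {'T': dna.count('T'), 'C': dna.count('C'), 'G': dna.count('G'), 'A': dna.count('A')}
print(dna_counts)
# the key with the largest count is the most frequent character
max_char = max(dna_counts, key=dna_counts.get)
max_freq = dna_counts[max_char]
print(max_char, max_freq)


{'T': 5, 'C': 2, 'G': 3, 'A': 6}
A 6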
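
The standard library's collections.Counter offers another route to the same answer. A minimal sketch, assuming the same dna string: most_common(1) returns a one-element list holding the (character, count) pair with the highest count.

```python
from collections import Counter

dna = 'AAATTCGTGACTGTAA'
counts = Counter(dna)  # maps each character to how often it occurs
max_char, max_freq = counts.most_common(1)[0]
print(max_char, max_freq)
```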

2.3

If you create a set using a DNA sequence, what will you get back? Try it with this sequence:

GATGGGATTGGGGTTTTCCCCTCCCATGTGCTCAAGACTGGCGCTAAAAGTTTTGAGCTTCTCAAAAGTCTAGAGCCACCGTCCAGGGAGCAGGTAGCTGCTGGGCTCCGGGGACACTTTGCGTTCGGGCTGGGAGCGTGCTTTCCACGACGGTGACACGCTTCCCTGGATTGGCAGCCAGACTGCCTTCCGGGTCACTGCCATGGAGGAGCCGCAGTCAGATCCTAGCGTCGAGCCCCCTCTGAGTCAGGAAACATTTTCAGACCTATGGAAACTACTTCCTGAAAACAACGTTCTGTCCCCCTTGCCGTCCCAAGCAATGGATGATTTGATGCTGTCCCCGGACGATATTGAACAATGGTTCACTGAAGACCCAGGTCCAGATGAAGCTCCCAGAATTCGCCAGAGGCTGCTCCCCCCGTGGCCCCTGCACCAGCAGCTCCTACACCGGCGGCCCCTGCACCAGCCCCCTCCTGGCCCCTGTCATCTTCTGTCCCTTCCCAGAAAACCTACCAGGGCAGCTACGGTTTCCGTCTGGGCTTCTTGCATTCTGGGACAGCCAAGTCTGTGACTTGCACGTACTCCCCTGCCCTCAACAAGATGTTTTGCCAACTGGCCAAGACCTGCCCTGTGCAGCTGTGGGTTGATTCCACACCCCCGCCCGGCACCCGCGTCCGCGCCATGGCCATCTACAAGCAGTCACAGCACATGACGGAGGTTGTGAGGCGCTGCCCCCACCATGAGCGCTGCTCAGATAGCGATGGTCTGGCCCCTCCTCAGCATCTTATCCGAGTGGAAGGAAATTTGCGTGTGGAGTATTTGGATGACAGAAACACTTTTCGTGGGGTTTTCCCCTCCCATGTGCTCAAGACTGGCGCTAAAAGTTTTGAGCTTCTCAAAAGTCTAGAGCCACCGTCCAGGGAGCAGGTAGCTGCTGGGCTCCGGGGACACTTTGCGTTCGGGCTGGGAGCGTGCTTTCCACGACGGTGACACGCTTCCCTGGATTGGCAGCCAGACTGCCTTCCGGGTCACTGCCATGGAGGAGCCGCAGTCAGATCCTAGCGTCGAGCCCCCTCTGAGTCAGGAAACATTTTCAGACCTATGGAAACTACTTCCTGAAAACAACGTTCTGTCCCCCTTGCCGTCCCAAGCAATGGATGATTTGATGCTGTCCCCGGACGATATTGAACAATGGTTCACTGAAGACCCAGGTCCAGATGAAGCTCCCAGAATTCGCCAGAGGCTGCTCCCCCCGTGGCCCCTGCACCAGCAGCTCCTACACCGGCGGCCCCTGCACCAGCCCCCTCCTGGCCCCTGTCATCTTCTGTCCCTTCCCAGAAAACCTACCAGGGCAGCTACGGTTTCCGTCTGGGCTTCTTGCATTCTGGGACAGCCAAGTCTGTGACTTGCACGTACTCCCCTGCCCTCAACAAGATGTTTTGCCAACTGGCCAAGACCTGCCCTGTGCAGCTGTGGGTTGATTCCACACCCCCGCCCGGCACCCGCGTCCGCGCCATGGCCATCTACAAGCAGTCACAGCACATGACGGAGGTTGTGAGGCGCTGCCCCCACCATGAGCGCTGCTCAGATAGCGATGGTCTGGCCCCTCCTCAGCATCTTATCCGAGTGGAAGGAAATTTGCGTGTGGAGTATTTGGATGAC


In [83]:
DNA='GATGGGATTGGGGTTTTCCCCTCCCATGTGCTCAAGACTGGCGCTAAAAGTTTTGAGCTTCTCAAAAGTCTAGAGCCACCGTCCAGGGAGCAGGTAGCTGCTGGGCTCCGGGGACACTTTGCGTTCGGGCTGGGAGCGTGCTTTCCACGACGGTGACACGCTTCCCTGGATTGGCAGCCAGACTGCCTTCCGGGTCACTGCCATGGAGGAGCCGCAGTCAGATCCTAGCGTCGAGCCCCCTCTGAGTCAGGAAACATTTTCAGACCTATGGAAACTACTTCCTGAAAACAACGTTCTGTCCCCCTTGCCGTCCCAAGCAATGGATGATTTGATGCTGTCCCCGGACGATATTGAACAATGGTTCACTGAAGACCCAGGTCCAGATGAAGCTCCCAGAATTCGCCAGAGGCTGCTCCCCCCGTGGCCCCTGCACCAGCAGCTCCTACACCGGCGGCCCCTGCACCAGCCCCCTCCTGGCCCCTGTCATCTTCTGTCCCTTCCCAGAAAACCTACCAGGGCAGCTACGGTTTCCGTCTGGGCTTCTTGCATTCTGGGACAGCCAAGTCTGTGACTTGCACGTACTCCCCTGCCCTCAACAAGATGTTTTGCCAACTGGCCAAGACCTGCCCTGTGCAGCTGTGGGTTGATTCCACACCCCCGCCCGGCACCCGCGTCCGCGCCATGGCCATCTACAAGCAGTCACAGCACATGACGGAGGTTGTGAGGCGCTGCCCCCACCATGAGCGCTGCTCAGATAGCGATGGTCTGGCCCCTCCTCAGCATCTTATCCGAGTGGAAGGAAATTTGCGTGTGGAGTATTTGGATGACAGAAACACTTTTCGTGGGGTTTTCCCCTCCCATGTGCTCAAGACTGGCGCTAAAAGTTTTGAGCTTCTCAAAAGTCTAGAGCCACCGTCCAGGGAGCAGGTAGCTGCTGGGCTCCGGGGACACTTTGCGTTCGGGCTGGGAGCGTGCTTTCCACGACGGTGACACGCTTCCCTGGATTGGCAGCCAGACTGCCTTCCGGGTCACTGCCATGGAGGAGCCGCAGTCAGATCCTAGCGTCGAGCCCCCTCTGAGTCAGGAAACATTTTCAGACCTATGGAAACTACTTCCTGAAAACAACGTTCTGTCCCCCTTGCCGTCCCAAGCAATGGATGATTTGATGCTGTCCCCGGACGATATTGAACAATGGTTCACTGAAGACCCAGGTCCAGATGAAGCTCCCAGAATTCGCCAGAGGCTGCTCCCCCCGTGGCCCCTGCACCAGCAGCTCCTACACCGGCGGCCCCTGCACCAGCCCCCTCCTGGCCCCTGTCATCTTCTGTCCCTTCCCAGAAAACCTACCAGGGCAGCTACGGTTTCCGTCTGGGCTTCTTGCATTCTGGGACAGCCAAGTCTGTGACTTGCACGTACTCCCCTGCCCTCAACAAGATGTTTTGCCAACTGGCCAAGACCTGCCCTGTGCAGCTGTGGGTTGATTCCACACCCCCGCCCGGCACCCGCGTCCGCGCCATGGCCATCTACAAGCAGTCACAGCACATGACGGAGGTTGTGAGGCGCTGCCCCCACCATGAGCGCTGCTCAGATAGCGATGGTCTGGCCCCTCCTCAGCATCTTATCCGAGTGGAAGGAAATTTGCGTGTGGAGTATTTGGATGAC'
DNA_set = set(DNA)
print('DNA_set contains {}'.format(DNA_set))


DNA_set contains {'T', 'G', 'A', 'C'}
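
A set keeps only the unique characters, and the order in which they print is not guaranteed. One way this can be put to use, sketched below, is checking that a sequence contains nothing but the four bases. Here `dna` is a short prefix of the sequence above, and 'GATNNC' is a made-up string with an ambiguous base for contrast.

```python
valid_bases = set('ACGT')

dna = 'GATGGGATTGGGGTTTTCCCC'  # short prefix of the sequence above
is_valid = set(dna) <= valid_bases  # True if every character is a valid base
print(is_valid)

# Set difference shows which characters fall outside the alphabet
unexpected = set('GATNNC') - valid_bases
print(unexpected)
```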