In [ ]:
from IPython.display import HTML
HTML('<iframe width="560" height="315" src="https://www.youtube.com/embed/zV949buXdSg?autoplay=1&loop=1" frameborder="0" allowfullscreen></iframe>')

Structures like these are encoded in "PDB" files

How can we parse a complicated file like this one?


In [ ]:
import pandas as pd
pd.read_table("data/1stn.pdb")

We can do better by manually parsing the file.
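
As a preview of where this is heading, here is a minimal sketch of manual parsing that relies only on string splitting and float conversion. The text in `pdb_text` is a made-up, PDB-like snippet for illustration and is not taken from `data/1stn.pdb`. Splitting on whitespace is a simplification: real PDB files use fixed-width columns, and neighboring fields can run together, so treat this as a starting point rather than a robust parser.

```python
# Made-up, PDB-like text used only for illustration.
pdb_text = """HEADER    EXAMPLE
ATOM      1  N   ALA A   1      11.104   6.134  -6.504  1.00  0.00           N
ATOM      2  CA  ALA A   1      11.639   6.071  -5.147  1.00  0.00           C
END
"""

coords = []
for line in pdb_text.split("\n"):
    fields = line.split()
    # Keep only atom records; skip headers, footers, and blank lines.
    if len(fields) > 0 and fields[0] == "ATOM":
        # In this whitespace-split layout, fields 6, 7, and 8 hold x, y, z.
        x = float(fields[6])
        y = float(fields[7])
        z = float(fields[8])
        coords.append((x, y, z))

print(coords)
```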

Our test file

Predict what this will print


In [ ]:
f = open("test-file.txt")
print(f.readlines())
f.close()

Predict what this will print


In [ ]:
f = open("test-file.txt")
for line in f.readlines():
    print(line)
f.close()

Predict what this will print


In [ ]:
f = open("test-file.txt")
for line in f.readlines():
    print(line,end="")
f.close()

Basic file reading operations:

  • Open a file for reading: f = open(SOME_FILE_NAME)
  • Read all lines of the file into a list: f.readlines()
  • Read one line from the file: f.readline()
  • Read the whole file into a string: f.read()
  • Close the file: f.close()
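
The three read calls listed above can be compared side by side in a short sketch. It first creates its own scratch file so it runs anywhere; the name `demo-read.txt` and the variable names are assumptions for illustration, not files or names from this lesson.

```python
import os
import tempfile

# Create a scratch file in a temporary directory (hypothetical name).
demo_path = os.path.join(tempfile.mkdtemp(), "demo-read.txt")
f = open(demo_path, "w")
f.write("first line\nsecond line\nthird line\n")
f.close()

# readline() returns one line per call, including its trailing "\n".
f = open(demo_path)
first = f.readline()
second = f.readline()
f.close()

# readlines() returns every line as a list of strings.
f = open(demo_path)
all_lines = f.readlines()
f.close()

# read() returns the whole file as a single string.
f = open(demo_path)
whole = f.read()
f.close()

print(repr(first))
print(all_lines)
print(repr(whole))
```

Printing with `repr` makes the newline characters visible, which helps explain why printing each line with a plain `print(line)` leaves blank lines in the output.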

Now what do we do with each line?

Predict what the following program will do


In [ ]:
f = open("test-file.txt")
for line in f.readlines():
    print(line.split())
f.close()

Predict what the following program will do


In [ ]:
f = open("test-file.txt")
for line in f.readlines():
    print(line.split("1"))
f.close()

Splitting strings

  • SOME_STRING.split(CHAR_TO_SPLIT_ON) allows you to split strings into a list.
  • If CHAR_TO_SPLIT_ON is omitted, the string is split on any run of whitespace (" ","\t","\n","\r") and empty strings are dropped.
  • "\t" is TAB, "\n" is NEWLINE, "\r" is CARRIAGE_RETURN.
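
A small sketch of how the argument changes the result. The string `line` is invented for illustration and is not the content of test-file.txt.

```python
# An invented line containing two spaces, a tab, and a newline.
line = "1.5  2.5\t3.5\n"

# No argument: split on any run of whitespace, dropping empty strings.
no_arg = line.split()

# Explicit " ": split on every single space, so the doubled space
# produces an empty string, and the tab and newline are left in place.
on_space = line.split(" ")

# Any other string works as a separator too.
on_dot = line.split(".")

print(no_arg)    # ['1.5', '2.5', '3.5']
print(on_space)  # ['1.5', '', '2.5\t3.5\n']
print(on_dot)    # ['1', '5  2', '5\t3', '5\n']
```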

Predict what the following will do


In [ ]:
f = open("test-file.txt")
lines = f.readlines()
f.close()

line_of_interest = lines[-1]
value = line_of_interest.split()[0]
print(value)

Predict what will happen:


In [ ]:
print(value*5)

value is the string "1.5", not a number, so value*5 repeats the text five times instead of multiplying it.

The solution is to cast it to a float.


In [ ]:
value_as_float = float(value) 
print(value_as_float*5)

Cast calls:

float, int, str, list, tuple


In [ ]:
list("1.5")
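
The cast calls can be tried out in one place. This is a minimal sketch; the variable names are made up for illustration.

```python
text = "1.5"

as_float = float(text)      # string -> float
as_int = int("42")          # string -> int (the text must look like an integer)
truncated = int(as_float)   # float -> int drops the fractional part
back_to_str = str(as_float) # float -> string
chars = list(text)          # string -> list of single characters
as_tuple = tuple(chars)     # list -> tuple

# int() refuses text that contains a decimal point.
try:
    int(text)
    int_failed = False
except ValueError:
    int_failed = True

print(as_float, as_int, truncated, back_to_str, chars, as_tuple, int_failed)
```

To turn "1.5" into an integer, convert in two steps: `int(float("1.5"))`.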

Write a program that grabs the "1" from the first line in the file and multiplies it by 75.


In [ ]:

What about writing to files?

Basic file writing operations:

  • Open a file for writing: f = open(SOME_FILE_NAME,'w') (this wipes out any existing file immediately!)
  • Open a file to append: f = open(SOME_FILE_NAME,'a')
  • Write a string to a file: f.write(SOME_STRING)
  • Write a list of strings: f.writelines([STRING1,STRING2,...]) (no newlines are added for you)
  • Close the file: f.close()
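
The difference between 'w' and 'a' is easiest to see by reading the file back after each step. This sketch writes to a scratch file in a temporary directory; the name `demo-write.txt` and the variable names are assumptions for illustration.

```python
import os
import tempfile

# Scratch file in a temporary directory (hypothetical name).
demo_path = os.path.join(tempfile.mkdtemp(), "demo-write.txt")

# 'w' creates the file (or empties an existing one) before writing.
f = open(demo_path, "w")
f.write("one\n")
f.writelines(["two\n", "three\n"])  # newlines must be supplied by hand
f.close()

# 'a' keeps the existing contents and adds to the end.
f = open(demo_path, "a")
f.write("four\n")
f.close()

f = open(demo_path)
after_append = f.read()
f.close()

# Opening with 'w' again empties the file, even if nothing is written.
f = open(demo_path, "w")
f.close()

f = open(demo_path)
after_reopen = f.read()
f.close()

print(repr(after_append))
print(repr(after_reopen))
```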

In [ ]:
def file_printer(file_name):
    f = open(file_name)
    for line in f.readlines():
        print(line,end="")
    f.close()

Predict what this code will do


In [ ]:
a_list = ["a","b","c"]
f = open("another-file.txt","w")
for a in a_list:
    f.write(a)
f.close()
file_printer("another-file.txt")

Predict what this code will do


In [ ]:
a_list = ["a","b","c"]
f = open("another-file.txt","w")
for a in a_list:
    f.write(a)
    f.write("\n")
f.close()
file_printer("another-file.txt")

Predict what this code will do


In [ ]:
a_list = ["a","b","ccat"]
f = open("another-file.txt","w")
for a in a_list:
    f.write("A test {{}} {}\n".format(a))
f.close()
file_printer("another-file.txt")

format lets you make pretty strings


In [ ]:
print("The value is: {:}".format(10.35151))
print("The value is: {:.2f}".format(10.35151))
print("The value is: {:20.2f}".format(10.35151))

In [ ]:
print("The value is: {:}".format(10))
print("The value is: {:20d}".format(10))

String formatting

  • Pretty decimal printing: "{:LENGTH_OF_STRING.NUM_DECIMALSf}".format(FLOAT)
  • Pretty integer printing: "{:LENGTH_OF_STRINGd}".format(INT)
  • Pretty string printing: "{:LENGTH_OF_STRINGs}".format(STRING)
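
A short sketch of the three patterns with concrete numbers filled in. The values and variable names are made up for illustration.

```python
# Width 10, 3 decimal places: numbers are right-aligned by default.
padded_float = "{:10.3f}".format(3.14159)

# Width 6 integer, also right-aligned.
padded_int = "{:6d}".format(42)

# Width 8 string: strings are left-aligned by default.
padded_str = "{:8s}".format("cat")

# Doubled braces print as literal braces and are not filled in.
literal_braces = "{{}} {}".format("x")

print(repr(padded_float))    # '     3.142'
print(repr(padded_int))      # '    42'
print(repr(padded_str))      # 'cat     '
print(repr(literal_braces))  # '{} x'
```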

Create a loop that writes the numbers 0 to 9 to a file. Each number should be on its own line, written to 3 decimal places.


In [ ]:

Basic file reading operations:

  • Open a file for reading: f = open(SOME_FILE_NAME)
  • Read all lines of the file into a list: f.readlines()
  • Read one line from the file: f.readline()
  • Read the whole file into a string: f.read()
  • Close the file: f.close()

Basic file writing operations:

  • Open a file for writing: f = open(SOME_FILE_NAME,'w') (this wipes out any existing file immediately!)
  • Open a file to append: f = open(SOME_FILE_NAME,'a')
  • Write a string to a file: f.write(SOME_STRING)
  • Write a list of strings: f.writelines([STRING1,STRING2,...]) (no newlines are added for you)
  • Close the file: f.close()

Splitting strings

  • SOME_STRING.split(CHAR_TO_SPLIT_ON) allows you to split strings into a list.
  • If CHAR_TO_SPLIT_ON is omitted, the string is split on any run of whitespace (" ","\t","\n","\r") and empty strings are dropped.
  • "\t" is TAB, "\n" is NEWLINE, "\r" is CARRIAGE_RETURN.

String formatting

  • Pretty decimal printing: "{:LENGTH_OF_STRING.NUM_DECIMALSf}".format(FLOAT)
  • Pretty integer printing: "{:LENGTH_OF_STRINGd}".format(INT)
  • Pretty string printing: "{:LENGTH_OF_STRINGs}".format(STRING)