Lecture 3

I/O and exceptions

About Files

  • RAM and volatility.
  • Files and non-volatility.

Writing a file

  • file handle
  • mode

In [ ]:
myfile = open("test.txt", "w")
myfile.write("My first file written from Python\n")
myfile.write("---------------------------------\n")
myfile.write("Hello, world!\n")
myfile.write("Did it work?\n")
myfile.close()
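
The mode argument decides what happens to any content the file already has. As a minimal sketch (the filename modes_demo.txt is made up for this illustration): "w" starts the file from scratch, "a" appends to the end, and "r", the default, opens it for reading.

```python
# "w" creates the file, or empties it if it already exists
myfile = open("modes_demo.txt", "w")
myfile.write("first line\n")
myfile.close()

# "a" keeps the existing content and adds to the end
myfile = open("modes_demo.txt", "a")
myfile.write("second line\n")
myfile.close()

# "r" (the default mode) opens the file for reading
myfile = open("modes_demo.txt", "r")
lines = myfile.readlines()
myfile.close()

print(lines)
```

Opening the same file with "w" a second time would have thrown away the first line, which is the usual reason to reach for "a" instead.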

Reading a whole file at once


In [ ]:
f = open("test.txt")
content = f.read()
f.close()

words = content.split()
print("There are {0} words in the file.".format(len(words)))
words

A better way to read files:

with open("somefile.txt", 'r') as f:
    content = f.read()    # do stuff with f here...
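
To see why this is better, here is a small sketch (the filename with_demo.txt is made up for the demonstration). The with block closes the file as soon as the block ends, even if an exception is raised inside it, and looping over the handle reads one line at a time.

```python
# Create a small file to read back (the filename is made up for this sketch)
with open("with_demo.txt", "w") as f:
    f.write("line one\nline two\nline three\n")

line_count = 0
with open("with_demo.txt", "r") as f:
    # Iterating over the handle yields one line at a time
    for line in f:
        line_count += 1

# Outside the with block the file has already been closed for us
print(f.closed)
print("There are {0} lines in the file.".format(line_count))

# The file is closed even when the block is left because of an exception
try:
    with open("with_demo.txt") as g:
        raise ValueError("something went wrong")
except ValueError:
    pass
print(g.closed)
```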

In [5]:
import urllib.request
with urllib.request.urlopen('http://www.python.org/') as f:
    print(f.read(1000))


b'<!doctype html>\n<!--[if lt IE 7]>   <html class="no-js ie6 lt-ie7 lt-ie8 lt-ie9">   <![endif]-->\n<!--[if IE 7]>      <html class="no-js ie7 lt-ie8 lt-ie9">          <![endif]-->\n<!--[if IE 8]>      <html class="no-js ie8 lt-ie9">                 <![endif]-->\n<!--[if gt IE 8]><!--><html class="no-js" lang="en" dir="ltr">  <!--<![endif]-->\n\n<head>\n    <meta charset="utf-8">\n    <meta http-equiv="X-UA-Compatible" content="IE=edge">\n\n    <link rel="prefetch" href="//ajax.googleapis.com/ajax/libs/jquery/1.8.2/jquery.min.js">\n\n    <meta name="application-name" content="Python.org">\n    <meta name="msapplication-tooltip" content="The official home of the Python Programming Language">\n    <meta name="apple-mobile-web-app-title" content="Python.org">\n    <meta name="apple-mobile-web-app-capable" content="yes">\n    <meta name="apple-mobile-web-app-status-bar-style" content="black">\n\n    <meta name="viewport" content="width=device-width, initial-scale=1.0">\n    <meta name="HandheldFriendly" conte'
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-5-5b1559d3c854> in <module>()
      1 import urllib.request
      2 with urllib.request.urlopen('http://www.python.org/') as f:
----> 3     print(f.read(1000)).decode()

AttributeError: 'NoneType' object has no attribute 'decode'
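
The traceback above comes from calling .decode() on the result of print(), which is always None. The bytes have to be decoded first and printed afterwards. A minimal sketch, using a short made-up bytes literal in place of the downloaded page and assuming the data is UTF-8 encoded:

```python
# A stand-in for what f.read(1000) returns: bytes, not str
raw = b'<!doctype html>\n<head>\n    <meta charset="utf-8">\n'

# Decode the bytes first, then print the resulting string
text = raw.decode("utf-8")
print(text)

# print() itself returns None, which is why print(...).decode() fails
result = print("hello")
print(result is None)
```

With the real page, the corresponding call would be print(f.read(1000).decode()), with the closing parenthesis of print moved to the end.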

Fetching from the web

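import urllib.request

url = "https://api.github.com/"
# Name of the local file the downloaded data is saved to
destination_filename = "rfc793.txt"

# Fetch the resource at url and write it to destination_filename
urllib.request.urlretrieve(url, destination_filename)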

We’ll need to get a few things right before this works:

  • The resource we’re trying to fetch must exist! Check this using a browser.
  • We’ll need permission to write to the destination filename, and the file will be created in the “current directory”, i.e. the folder the Python program is run from (often, but not always, the folder it is saved in).
  • If we are behind a proxy server that requires authentication (as some students are), this may require some more special handling to work around our proxy. Use a local resource for the purpose of this demonstration!
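
When one of these conditions is not met, urlretrieve raises an exception. The sketch below wraps the call in try/except and, as suggested above, uses a local resource through a file:// URL. The helper name fetch_to_file and all filenames are made up for this illustration.

```python
import pathlib
import urllib.error
import urllib.request

def fetch_to_file(url, destination_filename):
    """ Hypothetical helper: save the resource at url to a file.
        Returns True on success and False if the fetch failed.
    """
    try:
        urllib.request.urlretrieve(url, destination_filename)
        return True
    except urllib.error.URLError as e:
        # Raised when the resource cannot be reached or does not exist
        print("Could not fetch {0}: {1}".format(url, e))
        return False
    except OSError as e:
        # Raised, for example, when we may not write to the destination
        print("Could not write {0}: {1}".format(destination_filename, e))
        return False

# Make a local resource and refer to it with a file:// URL
source = pathlib.Path("local_resource.txt").resolve()
source.write_text("Hello from a local file\n")

ok = fetch_to_file(source.as_uri(), "local_copy.txt")
missing = fetch_to_file(source.with_name("no_such_file.txt").as_uri(), "nothing.txt")
print(ok, missing)
```

The URLError clause comes first because URLError is itself a kind of OSError; listed the other way round, the more general clause would catch everything.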

In [13]:
import urllib.request

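def retrieve_page(url):
    """ Retrieve the contents of a web page.
        The contents are decoded into a string before returning.
    """
    with urllib.request.urlopen(url) as my_socket:
        # read() returns bytes; decode() turns them into text (UTF-8 assumed)
        dta = my_socket.read().decode("utf-8")
        return dta

the_text = retrieve_page("https://reddit.com")

# Save the page; the with block closes the file for us
with open("git.txt", "w", encoding="utf-8") as f:
    f.write(the_text)
# print(the_text)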

Exercises

  1. Write a program that reads a file and writes out a new file with the lines in reversed order (i.e. the first line in the old file becomes the last one in the new file.)

  2. Write a program that reads a file and prints only those lines that contain the substring snake.

  3. Write a program that reads a text file and produces an output file which is a copy of the file, except the first five columns of each line contain a four-digit line number, followed by a space. Start numbering the first line in the output file at 1. Ensure that every line number is formatted to the same width in the output file. Use one of your Python programs as test data for this exercise: your output should be a printed and numbered listing of the Python program.

  4. Write a program that undoes the numbering of the previous exercise: it should read a file with numbered lines and produce another file without line numbers.
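
For exercise 3, the fixed-width numbering can be done with a width specifier in the format string. This is a small hint rather than a full solution, using made-up sample lines instead of a real file:

```python
sample_lines = ["import sys", "print(sys.argv)"]

numbered = []
for (i, line) in enumerate(sample_lines, start=1):
    # {0:04d} pads the number with zeros to four digits; a space follows
    numbered.append("{0:04d} {1}".format(i, line))

print(numbered)
```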

Extra: Read through the Requests tutorial. Requests is a much better tool for working with HTTP requests than the built-in urllib.request, IMHO (it's even recommended in the urllib.request documentation!).

Pair project: Boggler

Choosing pairs


In [33]:
import random

code_class = ['dana', 'cole', 'kevin', 'connor', 'jaydn', 'patrick',
        'ransom', 'skip', 'mercy', 'nick']

rand_class = code_class.copy()
random.shuffle(rand_class)

# Zipping with the full reverse lists every pair twice, once in each order
list(zip(rand_class, rand_class[::-1]))


Out[33]:
[('kevin', 'ransom'),
 ('patrick', 'dana'),
 ('connor', 'cole'),
 ('mercy', 'jaydn'),
 ('skip', 'nick'),
 ('nick', 'skip'),
 ('jaydn', 'mercy'),
 ('cole', 'connor'),
 ('dana', 'patrick'),
 ('ransom', 'kevin')]
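
The output above lists every pair twice, once in each order, because the shuffled list is zipped with its full reverse. One possible variation, sketched here with a hypothetical helper make_pairs and a placeholder list of names, pairs up neighbours in the shuffled list so that each person appears exactly once. It assumes an even number of people.

```python
import random

def make_pairs(names):
    """ Hypothetical helper: shuffle a copy of names and pair up neighbours.
        Assumes an even number of names.
    """
    shuffled = names.copy()
    random.shuffle(shuffled)
    # Elements 0, 2, 4, ... are paired with elements 1, 3, 5, ...
    return list(zip(shuffled[0::2], shuffled[1::2]))

people = ['a', 'b', 'c', 'd', 'e', 'f']   # placeholder names
pairs = make_pairs(people)
print(pairs)
```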
