Data I/O Exercises

See Jake Vanderplas' intro to NumPy for info on NumPy I/O.


Instructions: Create a new notebook called DataIOExercises in your DataIO directory and solve the following problems inside it. Be sure to include each problem statement in a markdown cell above your solution. You don't need to put the "helper" code in the markdown cell; just include it in the code cell with your solution.

Preliminaries: At the top of your notebook, include a "Heading 1" cell with the title Data I/O Exercises. Then load the NumPy and Matplotlib functions into the notebook namespace, with plots displayed inline, by adding a code cell that invokes the %pylab inline magic:



In [ ]:
%pylab inline
import numpy as np

Question 1

Here is some code that creates a comma-delimited file of numbers with random precision, leading spaces, and formatting:


In [ ]:
# Don't modify this: it simply writes the example file
f = open('messy_data.dat', 'w')
import random
for i in range(100):
    for j in range(5):
        f.write(' ' * random.randint(0, 6))
        f.write('%0*.*g' % (random.randint(8, 12),
                            random.randint(5, 10),
                            100 * random.random()))
        if j != 4:
            f.write(',')
    f.write('\n')
f.close()

In [ ]:
# Look at the first four lines of the file:
!head -4 messy_data.dat
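
The `*` characters in the helper's format string `'%0*.*g'` take the field width and the precision from the argument tuple, which is why every number in the file ends up with a different zero-padded width and number of significant digits. A minimal illustration, with fixed values standing in for the random ones (10, 5, and 12.3456789 are chosen here only for demonstration):

```python
# Width 10, 5 significant digits, zero-padded, general ("g") format.
# These fixed arguments stand in for the random ones in the helper code.
text = '%0*.*g' % (10, 5, 12.3456789)
print(repr(text))  # '000012.346'

# float() accepts the leading zeros and any surrounding spaces.
value = float('   ' + text)
print(value)  # 12.346
```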

(a) Write a program that reads in the contents of "messy_data.dat" and extracts the numbers from each line, using the string manipulations we used in section 1 (remember that float() will convert a suitable string to a floating-point number).


In [ ]:
# Copy the exercise statement to a markdown cell in your notebook, then implement a solution in a code cell
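
As a starting point for part (a), the sketch below parses a single line held in a string rather than the whole file. The sample line and the helper name `parse_line` are made up for illustration; the real file contents differ from run to run.

```python
def parse_line(line):
    """Split one comma-delimited line and convert each field to a float."""
    # strip() is optional because float() ignores surrounding whitespace,
    # but it makes the intent explicit.
    return [float(field.strip()) for field in line.split(',')]

# Hypothetical sample in the style of messy_data.dat
sample = '   000012.346, 0003.1416,00000099.5,      00.5,  001e-05\n'
numbers = parse_line(sample)
print(numbers)
```

Applying the same function to every line of the file gives a list of rows, each a list of floats.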

(b) Next write out a new file named "clean_data.dat". The new file should contain the same data as the old file, but with uniform formatting and aligned columns.


In [ ]:
# Copy the exercise statement to a markdown cell in your notebook, then implement a solution in a code cell
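
For part (b), one way to get aligned columns is to format every value with the same fixed width and precision. This is only a sketch: the helper name `format_row`, the width of 14, and the 8 decimal places are arbitrary choices, and the precision should be large enough that no digits from the original file are lost.

```python
def format_row(values, width=14, decimals=8):
    """Format a sequence of floats as one comma-separated, fixed-width line."""
    return ','.join('%*.*f' % (width, decimals, v) for v in values)

# Hypothetical rows standing in for the parsed file contents
rows = [[12.346, 3.1416, 99.5], [0.5, 1e-05, 47.25]]
lines = [format_row(row) for row in rows]
for line in lines:
    print(line)
```

Because every field has the same width, the commas line up from row to row when the file is viewed in a fixed-width font.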

(c) Now re-do the same task using NumPy's loadtxt and savetxt functions.


In [ ]:
# Copy the exercise statement to a markdown cell in your notebook, then implement a solution in a code cell
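
For part (c), the sketch below runs `np.loadtxt` and `np.savetxt` on in-memory text so that it does not depend on the generated file; with real files, the file names would be passed in place of the `StringIO` objects. The sample text and the `fmt` value are illustrative assumptions.

```python
import io
import numpy as np

# Hypothetical two-line sample in the style of messy_data.dat
messy = io.StringIO('  0012.346, 03.1416,0099.5\n 000.5,   047.25, 0001.125\n')

# delimiter=',' splits each line on commas; the result is a 2-D float array
data = np.loadtxt(messy, delimiter=',')
print(data.shape)

# Write every value with the same width and precision
clean = io.StringIO()
np.savetxt(clean, data, fmt='%14.8f', delimiter=',')
print(clean.getvalue())
```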

All content is under a modified MIT License, and can be freely used and adapted. See the full license text here.