See Jake VanderPlas' intro to NumPy for info on NumPy I/O.
Instructions: Create a new notebook called DataIOExercises in your DataIO directory and solve the following problems inside it. Be sure to include the problem statements in a markdown cell above your solution. You don't need to put the "helper" code in the markdown cell; just implement the helper code in your code cell with your solution.
Preliminaries: At the top of your notebook, include a "Heading 1" cell with the title Data I/O Exercises. Then add a code cell that invokes the %pylab inline magic, which loads the NumPy and Matplotlib functions into the notebook namespace and displays plots inline:
In [ ]:
%pylab inline
import numpy as np
Here is some code that creates a comma-delimited file of numbers, each written with a random precision, a random field width, and a random amount of leading whitespace:
In [ ]:
# Don't modify this: it simply writes the example file
f = open('messy_data.dat', 'w')
import random
for i in range(100):
    for j in range(5):
        f.write(' ' * random.randint(0, 6))
        f.write('%0*.*g' % (random.randint(8, 12),
                            random.randint(5, 10),
                            100 * random.random()))
        if j != 4:
            f.write(',')
    f.write('\n')
f.close()
In [ ]:
# Look at the first four lines of the file:
!head -4 messy_data.dat
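To make sense of the lines that head prints, it may help to look at the '%0*.*g' specifier used in the helper code on its own. The short sketch below is only an illustration: the width, precision, and value are hypothetical sample numbers, not taken from the file.

```python
# Illustration of the '%0*.*g' specifier from the helper code.
# The two '*' fields take the width and precision from the argument tuple;
# the '0' flag pads the field with leading zeros instead of spaces.
width, precision, value = 10, 5, 3.14159265  # hypothetical sample values

field = '%0*.*g' % (width, precision, value)
print(repr(field))

# float() ignores surrounding whitespace and leading zeros,
# so a padded field converts back to a number directly.
recovered = float('   ' + field)
print(recovered)
```

This is why the fields in the file have different lengths and numbers of digits even though they all hold ordinary floating-point values.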
(a) Write a program that reads in the contents of "messy_data.dat" and extracts the numbers from each line, using the string manipulations we used in section 1 (remember that float() will convert a suitable string to a floating-point number).
In [ ]:
#Copy the exercise statement to a markdown cell in your notebook and then implement a solution in a code cell
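As a hint for part (a), here is a minimal sketch of the string handling applied to a single line. The sample string and the helper name parse_line are hypothetical; reading the actual file and looping over its lines is left to your solution.

```python
def parse_line(line):
    """Split a comma-delimited line and convert each field to a float."""
    # strip() removes the trailing newline; float() tolerates the
    # leading spaces and zero padding left on each individual field.
    return [float(field) for field in line.strip().split(',')]

# Hypothetical line in the same style as messy_data.dat
sample = '  0045.123,   7.5,00000012.25,  3e+01, 0099.9\n'
numbers = parse_line(sample)
print(numbers)
```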
(b) Next write out a new file named "clean_data.dat". The new file should contain the same data as the old file, but with uniform formatting and aligned columns.
In [ ]:
#Copy the exercise statement to a markdown cell in your notebook and then implement a solution in a code cell
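As a hint for part (b), one possible way to get aligned columns is to give every number the same field width and number of decimals. The sketch below assumes a width of 12 and 5 decimals and keeps the comma delimiter; these choices, the sample values, and the helper name format_row are all illustrative, and other uniform formats are equally valid.

```python
# Hypothetical rows of already-parsed numbers
rows = [[45.123, 7.5, 12.25, 30.0, 99.9],
        [0.5, 81.0625, 3.0, 64.75, 1.125]]

def format_row(row, width=12, decimals=5):
    """Format every value with the same width and precision."""
    # '%*.*f' right-justifies each number in a fixed-width field
    return ','.join('%*.*f' % (width, decimals, x) for x in row)

lines = [format_row(row) for row in rows]
for line in lines:
    print(line)
```

Writing each formatted line followed by '\n' to a file opened in 'w' mode would then produce the cleaned file.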
(c) Now re-do the same task using NumPy's loadtxt and savetxt functions.
In [ ]:
#Copy the exercise statement to a markdown cell in your notebook and then implement a solution in a code cell
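As a hint for part (c), the sketch below shows the loadtxt and savetxt calls on a small in-memory example rather than on the real files. The sample text and the '%12.5f' format are assumptions chosen for illustration; in your solution you would pass the file names instead of the io.StringIO objects.

```python
import io
import numpy as np

# Hypothetical messy text standing in for messy_data.dat
messy = io.StringIO('  0045.123,   7.5,00000012.25\n 0.5,  0081.0625,3\n')

# loadtxt splits on the delimiter and converts every field to float
data = np.loadtxt(messy, delimiter=',')
print(data.shape)

# savetxt applies one format to every column, giving aligned output
out = io.StringIO()
np.savetxt(out, data, fmt='%12.5f', delimiter=',')
print(out.getvalue())
```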