Welcome to the python bootcamp. This thing you're reading is called an *ipython notebook* and will be your first introduction to the Python programming language. Notebooks are a combination of text markup and code that you can run in real time.

It is often said that python comes with the *batteries included*, which means it comes with almost everything you need, bundled up in separate *modules*. But not everything is loaded into memory automatically. You need to import the modules you need. Try running the following to import the antigravity module. Just hit `CTRL-ENTER` in the cell to execute the code.

```
import antigravity
```

*not* the reptile. Let's go ahead and import some useful modules you **will** need. Use this next command cell to import the modules named `os`, `sys`, and `numpy`. You can import them one-by-one or all at once, separated by commas.

```
# Use this command box to run your own commands.
# By the way, Python ignores anything after a hash (#) symbol, so it is good for comments
```

The `os` module gives you functions relating to the operating system, `sys` gives you information on the python interpreter and your script, and `numpy` is the mathematical powerhouse of python, allowing you to manipulate arrays of numbers.

Once you import a module, all the functions and data for that module live in its *namespace*. So if I wanted to use the `getcwd` function from the `os` module, I would have to refer to it as `os.getcwd`. Try running `getcwd` all by itself; you'll get an error. Then do it the correct way:

```
In [ ]:
```
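As a minimal sketch of this exercise (one possible answer, not the only one): the bare name fails with a `NameError`, while the namespaced call works. The `try`/`except` wrapper and the variable names `bare_name_worked` and `cwd` are illustrative additions, used here only so the error does not stop the cell.

```python
import os

# Calling the bare name fails: getcwd only exists inside the os namespace.
try:
    getcwd()
    bare_name_worked = True
except NameError as err:
    bare_name_worked = False
    print("Error:", err)

# The correct way: go through the module's namespace.
cwd = os.getcwd()
print(cwd)
```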

`os.getcwd` gets the current working directory. Try some commands for yourself. The `numpy` module has many mathematical functions. Try computing the square root (`sqrt`) of 2. You can also try computing $\sin(\pi)$ (numpy has a built-in value `numpy.pi`) and even $e^{i\pi}$ (python uses the engineering convention of `1j` as $\sqrt{-1}$).

```
In [ ]:
```
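Here is one way these three calculations might look; treat it as a sketch rather than the expected answer. The variable names are made up for illustration. Note that the results for $\sin(\pi)$ and $e^{i\pi}$ are only approximately 0 and -1, because `numpy.pi` is a rounded floating-point value.

```python
import numpy as np

root_two = np.sqrt(2)
sin_pi = np.sin(np.pi)       # tiny but not exactly zero, because of rounding
euler = np.exp(1j * np.pi)   # very close to -1+0j

print(root_two)
print(sin_pi)
print(euler)
```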

`isfile()` function within the `path` module, which is itself in the `os` module. Give it a try:

```
In [ ]:
```
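A small sketch of `os.path.isfile()` in action. It assumes you are allowed to create a scratch file; the `tempfile` module and the name `scratch_path` are additions for illustration, so the example does not depend on any particular file existing on your machine.

```python
import os
import tempfile

# Create a scratch file so the check has something real to find.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    scratch_path = tmp.name

exists_before = os.path.isfile(scratch_path)   # the file is on disk now
os.remove(scratch_path)
exists_after = os.path.isfile(scratch_path)    # gone after removal

print(exists_before, exists_after)
```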

Here is a non-comprehensive list of modules you may find useful. Documentation for all of them can be found with a quick google search.

- `os`, `sys`: As mentioned above, these give you access to the operating system, files, and environment.
- `numpy`: Gives you arrays (vectors, matrices) and the ability to do math on them.
- `scipy`: Think of this as "Numerical Recipes for Python". Root-finding, fitting functions, integration, special mathematical functions, etc.
- `pandas`: Primarily used for reading/writing data tables. Useful for data wrangling.
- `astropy`: Astronomy-related functions, including FITS image reading, astrometry, and cosmological calculations.

Almost everything in python has documentation. Just use the `help()` function on anything in python to get information. Try running `help()` on a function you used previously. Try as many others as you like.

```
In [ ]:
```

A variable is like a box that you put stuff (values) in for later use. In Python, there are lots of different `types` of variables corresponding to what you put in. Unlike other languages, you don't have to tell `python` what type each variable is: `python` figures it out on its own. To put the value 1 into a box called x, just use the equals sign, like you would when solving a math problem.

```
x=1
```

`x` using the `print()` function. You can also modify the variable. Try adding to or subtracting from x and print out its value again.

```
print(x)
x=x+1
print(x)
```

`y`. Now divide it by two and print out the result.

```
y = 1
y = y/2
print(y)
```

Here is where we get to the first major difference between `python 2` and `python 3`. In `python 2`, an integer divided by another integer is kept as an integer (this is similar behavior to many other programming languages), so 1 divided by 2 is 0. In `python 3.x`, division always produces a real number (called a `float`), so 1 divided by 2 is 0.5. If you want integer division in `python 3`, use `//` instead. This kind of division also works in `python 2`, so it's worth getting used to it.

Repeat what you did before, but this time start by assigning the value of 1.0 to y first. Also try using integer division on a float variable.

```
In [ ]:
```
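The following sketch shows the behavior described above under `python 3`. The variable names `true_div`, `floor_div`, and `float_floor` are made up for illustration.

```python
y = 1
true_div = y / 2      # true division gives a float in python 3
floor_div = y // 2    # integer (floor) division keeps the integer

y = 1.0
float_floor = y // 2  # floor division on a float still returns a float

print(true_div, floor_div, float_floor)
print(type(y))
```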

Remember that variables are simply containers (or labels if you prefer). They don't have a fixed type. Try using the `type()` function on `y`.

There are other types of variables. The most commonly used are strings, lists, and arrays. But literally **anything** can be assigned to a variable. You can even assign the `numpy` module to the variable `np` if you don't like typing `numpy` all the time. Try it out.

```
In [ ]:
```

`np.sqrt()` rather than `numpy.sqrt()`. Most python programmers do this with commonly-used modules. In fact, we usually just use the special form: `import numpy as np`.

```
In [ ]:
```

Strings are collections of alpha-numeric characters and are the primary way data are represented outside your code. So you will find yourself manipulating strings in Python **ALL THE TIME**. You might need to convert strings from a text file into python data objects to work with, or you might need to do the opposite and generate an output file with final results that can be read by humans. Below are the most common things we need.

Strings are enclosed in matching single (`'`), double (`"`), or even triple (`'''`) quotes. Python doesn't distinguish as long as you match them consistently. Triple-quoted strings can span many lines and are useful for literal text or code documentation.

```
# Strings are enclosed by single or double quotes
s='this is a string'
print(s)
```

We can use the `len()` function to determine the length of the string. Try this out below.

```
In [ ]:
```

`python` (we'll see that below). Each alpha-numeric character is an element in the string, and you can refer to individual elements (characters) using an index enclosed in square brackets (`[]`). You can also specify a range of indices by separating them with a colon (`:`). `Python` indexes from 0 (not 1), so the first element is index `[0]`, the second `[1]` and so on. Negative indices count from the end of the string. Try printing out the 2nd character of your string `s`, then the whole string except for the first and last characters.

```
In [ ]:
```
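A possible answer to this exercise, using the same string as before. The names `second` and `middle` are just for illustration.

```python
s = 'this is a string'

print(len(s))     # number of characters, spaces included
second = s[1]     # index 1 is the 2nd character
middle = s[1:-1]  # everything except the first and last characters

print(second)
print(middle)
```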

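Specifying a range of indices (as well as more complicated indexing we'll see later) is called *slicing*. Most string manipulation is done with functions attached to the string itself (for example `s.upper()` or `s.split()`); there is also a `string` module that contains a few extra helpers.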

Sometimes you'll need your integers and floats to be converted into strings and written out into a table with specific formats (e.g., number of significant figures). This involves a syntax that's almost a separate language itself (though if you've used `C` or `C++` it will be very familiar). Here is a good reference: https://pyformat.info/

We'll cover the most important. First, if you just print out a regular floating point number, you get some arbitrary number of significant figures. The same is true if you just try to convert the float to a string using `str()`, which takes any type of variable and tries to turn it into a string. Try printing the string value of `np.pi`.

```
In [ ]:
```

`{}`) and have special codes to specify how to format the variable. Without any other information, a simple `{}` will be replaced with whatever `str()` produces for the variable. For more control over numerical values, specify `:[width].[prec]f` for floats and `:[width]d` for integers. Replace `[width]` and `[prec]` with the total width you want your number to occupy and the number of digits after the decimal, respectively. Here's an example:

```
fmt = "This is a float with two decimals: {:.2f}"
print(fmt)
```

`.format()` function.

```
# two decimal places
print(fmt.format(x))
```

`.format()` function.

```
fmt = "Here is a float: '{:.2f}', and another '{:8.4f}', an integer {:d}, and a string {}"
print(fmt.format(x, np.pi, 1000, 'look ma, no quotes!'))
```

*new style* of string formatting, as it is the `python 3` way of the future and is more powerful than the *old style*. Both styles are supported in both versions of `python` and the reference above has plenty of examples of both.
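To make the `[width]` and `[prec]` codes concrete, here is a short sketch. The value and the variable names are made up for illustration; `repr()` is used when printing so the padding spaces are visible.

```python
value = 3.14159265

wide = "{:8.4f}".format(value)   # 8 characters wide, 4 after the decimal point
count = "{:5d}".format(42)       # integer padded to 5 characters
plain = "{}".format(value)       # falls back to what str() produces

print(repr(wide))
print(repr(count))
print(plain)
```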

```
# A list of floats
x1=[1.,2.,7.,2500.]
print(x1)
```

```
# try making a list of strings. Use indexing to print out single elements and slices.
```

`x1` above. Print it out to see what it looks like. Can you guess how to refer to an element of a list that's in another list?

```
In [ ]:
```
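One way this could look, as a sketch. The list `mixed` and its contents are invented for illustration; the point is that brackets can be chained to reach into a list stored inside another list.

```python
x1 = [1., 2., 7., 2500.]
mixed = ['a string', 42, x1]   # a list can hold strings, numbers, even other lists

inner = mixed[2]        # the third element is the list x1 itself
element = mixed[2][1]   # chain the brackets: second element of the inner list

print(mixed)
print(element)
```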

Numpy arrays allow for more functionality than lists. While they may also contain a mix of object types, you will primarily be working with numpy arrays that are comprised of numbers: either integers or floats. For example, you will at some point read in a table of data into a numpy array and do things to it, like add, multiply, etc.

Above, we imported the `numpy` module as `np`. We will use this to create arrays.

```
x=np.array([1.,2.,3.,4.])
print(x)
```

`x` and print it out. Then try other mathematical functions from the `numpy` module on it.

```
In [ ]:
```

Here is where the real power of numpy arrays comes into play. We can use `numpy` to carry out all kinds of mathematical tasks that in other programming languages (like `C`, `FORTRAN`, etc.) would require some kind of loop. Here are some of the most common tasks we'll use. By using `numpy` functions on arrays of numbers, we speed up the code a lot. This is commonly referred to as *vectorizing* your code.

There are many functions in `numpy` that allow you to make arrays from scratch.

We can create an array of zeros:

```
x1=np.zeros(5)
print(x1)
```

Take a guess at how to create a 5-element array of ones.

```
In [ ]:
```

`np.pi`. How could you do that as a one-liner? Hint: vectorize.

```
In [ ]:
```

`np.arange(start,stop,step)`, where you specify a number to start (inclusive), when to stop (non-inclusive), and what step size to have between each element. Make an array called `x1` using `arange` that goes from 0 to 4 inclusive.

```
In [ ]:
```

Now make an array called `x2` that goes from 0 to 10 in steps of 2.

```
In [ ]:
```

`np.linspace(start,stop,N)`, which gives you a specified number `N` of elements equally spaced between `start` and `stop`. The `stop` value in this case is inclusive (will be part of the sequence). Try making an array called `x3` that goes from 0 to 8 and has 5 elements.

```
In [ ]:
```
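A sketch of how the three arrays might be built. One assumption is made loudly here: for `x2`, the `stop` value of 10 is treated as non-inclusive (the `arange` default), which gives five elements; pass `stop=11` if you want 10 itself included.

```python
import numpy as np

x1 = np.arange(5)           # 0 to 4 inclusive; stop=5 is excluded
x2 = np.arange(0, 10, 2)    # 0, 2, 4, 6, 8 (assumes 10 is excluded)
x3 = np.linspace(0, 8, 5)   # 5 points from 0 to 8, endpoints included

print(x1)
print(x2)
print(x3)
```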

```
x=np.ones((4,2))
print(x)
print(x.shape)
```

```
x4=0.5+x1+x2*x3/2.
print(x4)
```

What happens if you try to add `x1` to the matrix `x` you created above? Give it a try:

```
In [ ]:
```
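As a sketch of what to expect, assuming `x` has shape (4, 2) and `x1` has 5 elements as above: the shapes do not line up, so `numpy` raises a `ValueError`. The second half uses a made-up 2-element array, `row`, whose length matches the last axis of `x`; `numpy` then adds it to every row (this is known as *broadcasting*). The `try`/`except` wrapper is an illustrative addition.

```python
import numpy as np

x = np.ones((4, 2))
x1 = np.arange(5)

# Shapes (4, 2) and (5,) are incompatible, so numpy raises a ValueError.
try:
    x + x1
    shapes_matched = True
except ValueError as err:
    shapes_matched = False
    print("Error:", err)

# A 2-element array lines up with the last axis and is added to every row.
row = np.array([10., 20.])
shifted = x + row
print(shifted)
```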

`np.power`. Take the base-10 log of an array and make sure it gives you what you expect. A shorthand for raising to a power is `**`, for example, `2**3=8`.

```
In [ ]:
```

```
# CCW rotation by 45 degrees (pi/4 radians)
theta=np.pi/4
# Rotation matrix
x=np.array([[np.cos(theta),-np.sin(theta)],
            [np.sin(theta),np.cos(theta)]])
print(x)
# Let's rotate (1,1) about the origin
y=np.array([1.,1.])
z=np.dot(x,y)
print(y)
print(z)
```

`numpy` functions, `numpy.linalg`, or member functions of the object itself.

```
# Taking the transpose
x_t = x.T
print(x_t)
# Computing the inverse
x_i = np.linalg.inv(x)
# Matrix Multiplication
I = np.dot(x_i,x)
print(I)
```

Often, you need to access elements or sub-arrays within your array. This is referred to as *slicing*. We can select individual elements in an array using indices just as we did for strings (note that 0 is the first element and negative indices count backwards from the end). The most general slicing looks like `[start:stop:step]`. Below, we create an array. Try to print out the following using slices:

- the first element
- the last element (there are two ways to do this)
- a sub-array from the 3rd element to the end
- a sub-array with the last element stripped
- a sub-array with a single element (the last)
- a sub-array with every second element
- a sub-array with all elements in reverse order

```
x=np.arange(5)
print(x)
```
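For reference, here is one set of slices that covers the list above. Treat it as a sketch of possible answers; several of these can be written in other equivalent ways.

```python
import numpy as np

x = np.arange(5)

print(x[0])          # the first element
print(x[-1])         # the last element, counting from the end
print(x[x.size-1])   # the last element, using the array size
print(x[2:])         # from the 3rd element to the end
print(x[:-1])        # last element stripped
print(x[-1:])        # a sub-array holding only the last element
print(x[::2])        # every second element
print(x[::-1])       # all elements in reverse order
```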

`reshape` to transform a 1D array into a 2D array with the same total number of elements. This is another handy way to create N-dimensional arrays.

```
x=np.arange(8)
print(x)
x=x.reshape((4,2))
print(x)
print(x[0,:])
print(x[:,0])
```

`reshape` is `ravel`, which flattens a multi-dimensional array into a 1D array. Try this on `x`.

```
In [ ]:
```

So far, we've been running individual sets of commands and looking at their results immediately. Later, we will write a complete *program*, which is really just bundling up instructions into a recipe for solving a given task. But as the tasks we want to perform become more complicated, we need *control blocks*. These allow us to:

- Repeat tasks again and again (loops)
- Perform tasks only if certain conditions are met (if-else blocks)
- Group instructions into a single logical task (user-defined functions)

`python` is unusual in that it uses indentation to indicate the beginning and end of a logical block. This actually forces you to write readable code, which is a really good thing!

`for` loops are useful for repeating a series of operations a given number of times. In `python`, you loop over elements of a list or array. So if you want to loop over a sequence of integers (say, the indices of an array), then you would use the `range()` function to generate the sequence of integers. You might also use the `len()` function if you need the length of the array.

```
# range(n) generates the n integers from 0 to n-1
print(list(range(5)))
# We can use it to iterate over a for loop
for ii in range(5):
    print(ii)
```

`for` loop to build up a list of elements by appending to an existing list (using its `append()` member function). For example, to create the first N elements of the Fibonacci sequence:

```
fib = [1,1]
N = 100
for i in range(N-2):
    fib.append(fib[-2]+fib[-1])
print(fib)
```

In `python 2`, you may notice (if N is large enough) that you get numbers that have an `L` at the end. That is a special type called a *long integer*, which allows for arbitrarily large numbers. In `python 3`, every integer can be arbitrarily large, so no `L` appears.

Here is an example of what **not** to do with for loops (if you can help it). `For` loops are more computationally expensive in python than using the `numpy` functions to do the math. *Always* try to cast the problem in terms of `numpy` math and functions. It will make your code faster. Try making N in the following example larger and larger and you'll see the difference.

```
# A slightly more complex example of a for loop:
import time
N = 100
x=np.arange(N)
y=np.zeros(N)
t1 = time.time() # start time
for ii in range(x.size):
    y[ii]=x[ii]**2
t2 = time.time() # end time
print("for loop took "+str(t2-t1)+" seconds")
```

Another way to implement the stopwatch is to use the IPython "magic" command `%time`:

```
%time for ii in range(x.size): y[ii]=x[ii]**2
```

`N` in the above code block bigger and see how the execution time goes up. Now do the exact same thing in the next code block, but use `numpy` functions without a loop. See how the execution time improves.

```
In [ ]:
```
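A sketch of the comparison, timing both versions side by side. The names `y_loop` and `y_vec` and the choice of `N = 10000` are illustrative; actual timings depend on your machine, so none are quoted here.

```python
import time
import numpy as np

N = 10000
x = np.arange(N)

# Loop version: one element at a time.
y_loop = np.zeros(N)
t1 = time.time()
for ii in range(x.size):
    y_loop[ii] = x[ii]**2
t2 = time.time()

# Vectorized version: numpy squares the whole array in one call.
t3 = time.time()
y_vec = x**2
t4 = time.time()

print("for loop took " + str(t2 - t1) + " seconds")
print("numpy took " + str(t4 - t3) + " seconds")
```

Both versions produce the same numbers; only the time taken differs.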

Similar to a for-loop, a while-loop executes the same code block repeatedly until a condition is no longer true. These are handy if you don't know ahead of time how long a loop will take, but you know you have to stop when a condition is true (or false). As an example, we can estimate the smallest positive floating point number by continually dividing by 2 until we get zero. (Note that this is not the same as machine-$\epsilon$, which is the gap between 1 and the next larger floating point number.)

```
Ns = [1.]
while Ns[-1] > 0:
    Ns.append(Ns[-1]/2)
print(Ns[-10:])
```

`for` loops, you can have a never-ending loop if the condition is never false. Your computer is happy to keep grinding away forever. How could you safeguard against this?
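One possible safeguard, sketched under the assumption that a fixed cap on the number of iterations is acceptable for your problem. The names `max_iterations` and `iterations` are hypothetical, and the cap of 10000 is an arbitrary choice.

```python
max_iterations = 10000   # hypothetical safety cap, far more than needed here
Ns = [1.]
iterations = 0
while Ns[-1] > 0 and iterations < max_iterations:
    Ns.append(Ns[-1]/2)
    iterations += 1

if Ns[-1] > 0:
    print("Gave up after", iterations, "iterations")
else:
    print("Reached zero after", iterations, "iterations")
```

The loop now stops when either the original condition fails or the counter reaches the cap, whichever comes first.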

```
x=5
if x==5:
    print('Yes! x is 5')
# The two equal signs evaluate whether x is equal to 5. One can also use >, >=, <, <=, != (not equal to)
```

`=`) and the logical comparison of two objects (`==`).

```
x=5
if x==3:
    print('Yes! x is 3')
```

`if-else` statements execute the code in the `if` block if the condition is `True`; otherwise they execute the code in the `else` block:

```
x=5
if x==3:
    print('Yes! x is 3')
else:
    print('x is not 3')
    print('x is '+str(x))
```

One can also have a series of conditions in the form of an `elif` block:

```
x=5
if x==2:
    print('Yes! x is 2')
elif x==3:
    print('Yes! x is 3')
elif x==4:
    print('Yes! x is 4')
else:
    print('x is '+str(x))
```

You can also have multiple conditions that are evaluated:

```
x=5
if x > 2 and x*2==10:
    print('x is 5')
if x > 7 or x*2 == 10:
    print('x is 5')
```

Try this. Use a `while` loop and an `if-else` block to generate the Collatz sequence. Start the list with any positive integer. To get the next element of the list, check if the current element is even or odd. If even, divide it by 2. If odd, multiply by 3 and add 1. The sequence ends if you get to 1. Print out the length of the list. The Collatz conjecture states that the sequence will always converge to 1 eventually regardless of the starting integer. The proof of this conjecture is one of the great unsolved problems in mathematics.

```
In [ ]:
```
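A minimal sketch of one solution. The starting value of 6 and the names `start`, `sequence`, and `current` are arbitrary choices for illustration. It uses the remainder operator `%` (not covered above) to test for evenness, and `//` so the elements stay integers.

```python
start = 6                      # any positive integer
sequence = [start]
while sequence[-1] != 1:
    current = sequence[-1]
    if current % 2 == 0:
        sequence.append(current // 2)    # even: divide by 2
    else:
        sequence.append(3*current + 1)   # odd: multiply by 3 and add 1

print(sequence)
print(len(sequence))
```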

Functions allow you to make a bundle of python statements that are executed whenever the function is called. Each function has arguments you pass in and value(s) that are returned. For example, you've been using the `print` function, along with several others above. Now we will make our own.

```
# the function 'myfunc' takes two numbers, x and y, adds them together and returns the results
def myfunc(x,y):
    z=x+y
    return z
# to call the function, we simply invoke the name and feed it the requisite inputs:
g=myfunc(2,3.)
print(g)
```

```
# you can set input parameters to have default values
def myfunc2(x,y=5.):
    z=x+y
    return z
g=myfunc2(2.)
print(g)
g=myfunc2(2.,4.)
print(g)
```

`if`, `for` and `def` together into one example. Take the code above that generates a Fibonacci sequence and put it into a function called `Fibonacci`. The function should take one argument (the length of the sequence). It should check if N is less than 2 (which can't be done), or if N is greater than 1000 (which would take a very long time). If these conditions are met, print an error statement and return `None` (a python special object that generally indicates something went wrong). Otherwise, compute and return the sequence.

```
In [ ]:
```
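Here is one way the function might be written, as a sketch rather than the definitive answer. The wording of the error message is made up.

```python
def Fibonacci(N):
    # Reject lengths that are too short or too long.
    if N < 2 or N > 1000:
        print("Error: N must be between 2 and 1000")
        return None
    fib = [1, 1]
    for i in range(N-2):
        fib.append(fib[-2] + fib[-1])
    return fib

print(Fibonacci(10))
```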

Give your function a test run. Make sure it behaves as it should.

```
Fibonacci(1)
Fibonacci(1001)
Fibonacci(10)
```

One of the major advantages of `python` is a wealth of specialized packages for doing common scientific tasks. Sure, you could write your own least-squares fitter using what we've shown you so far, but before you attempt anything like that, take a little time to "google that" and see if a solution exists already.

You will have to `import` Python modules/packages to carry out many of the tasks you will need for your research. As already discussed, `numpy` is probably the most useful. `scipy` and `astropy` are other popular packages. Let's play around with a few of these to give you an idea of how useful they can be.

```
# I like to declare all of my imported packages at the top of my script so that I know what is available.
# Also note that there are many ways to import packages.
import numpy.random as npr # Random number generator
from scipy import stats # statistics functions
import scipy.interpolate as si # interpolation functions
from astropy.cosmology import FlatLambdaCDM # Cosmology in flat \Lambda-CDM universe
```

```
# Random numbers are useful for many tasks.
# draw 5 random numbers from a uniform distribution between 0 and 1:
x1=npr.uniform(0, 1, size=5)
print(x1)
# draw 5 random numbers from a normal distribution with mean 10 and standard
# deviation 0.5:
x2=npr.normal(10, 0.5, size=5)
print(x2)
# draw 10 random integers between 0 and 5 (exclusive)
x3=npr.randint(0,5,10)
print(x3)
```

`sqrt(10)`.

```
In [ ]:
```

Here is a practical example of using random numbers. Often in statistics, we have to compute the mean of some population based on a limited sample. For example, a survey may ask car drivers their age and make/model of car. A marketing team may want to know the average age of drivers of Ford Mustangs so they can target their audience. Calculating a mean is easy, but what about the uncertainty in that mean? You could compute the population standard deviation, but that pre-supposes the underlying distribution is Gaussian. Another method, which does not make any assumptions about the distribution, is *bootstrapping*. Draw a new sample of the same size from the data at random, allowing the same value to be picked more than once (so some values are repeated and others left out). Compute a new mean. Do that N times and compute the standard deviation of these *bootstrapped* mean values.

Below is an example of bootstrapping a sample to determine the uncertainty on a measurement. In this case, we will compute the mean of a sample of ages, and the uncertainty on the mean.

```
# x below represents a measurement of the ages of N people, where N=x.size
x=np.array([19.,20.,22.,19.,21.,24.,35.,22.,21.])
# This is the mean age:
print(np.mean(x))
# Now we "bootstrap" to determine the error on this measurement:
ntrials=10000 # number of times we will draw a random sample of N ages
x_arr=np.zeros(ntrials) # store the mean of each random sample in this array
for ii in range(ntrials):
    # draw N random integers, where N equals the number of samples in x
    ix=npr.randint(0,x.size,x.size)
    # subscript the original array with these random indices to get a new sample and compute the mean
    x_arr[ii]=np.mean(x[ix])
# Finally, compute the standard deviation of the array of mean values to get the uncertainty on the *mean* age
print(np.std(x_arr))
```

```
# This is an example of binning data and computing a particular value for the data in each bin.
# The scipy package is used to carry out this task.
# Let's make some fake data of galaxies spanning random redshifts between 0<z<3:
z=npr.rand(10000)*3.
# And these galaxies have random stellar masses between 9<log(M/Msun)<12:
m=npr.rand(10000)*3.+9.
# Now we want to compute the median stellar mass for galaxies at 0<z<1, 1<z<2, and 2<z<3:
# So let's declare the bin edges
bins=[0.,1.,2.,3.]
m_med,xbins,btemp = stats.binned_statistic(z,m,statistic='median',bins=bins)
print(bins)
print(m_med)
```

In science, we measure *discrete* values of data. Sometimes you need to interpolate between two (or more) points. A common example is drawing a smooth line through the data when making a graph. You could do this by hand using `numpy`, but the module `scipy` has an entire interpolation package that offers an easy solution.

```
# Interpolating between data points is another common task. We'll again use scipy to do some interpolating:
x=np.arange(5.)
y=x**2
print(x)
print(y)
# Linear interpolation
f=si.interp1d(x,y,kind='linear')
# si.interp1d returns a function, f, which we can feed values to.
# For example, let's evaluate f(x)
print(f(x))
# And now a different value
print(f(0.5))
# We can employ a higher order interpolation scheme to get more precise results (assuming a smoothly varying function)
f=si.interp1d(x,y,kind='quadratic')
print(f(x))
print(f(0.5))
```

Now we get into a really specific case. `Astropy` is a collection of several packages that are very useful for astronomers. It is actively developed and has new stuff all the time. Here, we show how you can use the cosmology calculator to compute the age of the Universe given a redshift.

```
# The astropy package has all kinds of astronomy related routines.
# Here, we define a cosmology that allows us to compute things like
# the age of the universe or Hubble constant at different redshifts
cosmo = FlatLambdaCDM(H0=70., Om0=0.3)
redshift=0.
print(cosmo.age(redshift))
redshift=[0,1,2,3]
print(cosmo.age(redshift))
```

There are many environments in which one can run Python code:

- iPython notebooks like this one are good for running quick snippets of code.
- Spyder (provided with Anaconda) provides a space for writing scripts, executing them, and also for easily looking up definitions of different functions. Very similar to the IDL graphical IDE.
- One can also write code in a plain text editor, like Emacs/Aquamacs. Then execute the code in a terminal running Python or iPython.

This is the most common and *agnostic* way to run your code. If you send your code to someone else, assume they will run it from the command line. If you are running your code on an HPC cluster, it needs to be run from the command-line. Lastly, writing code that runs with minimal user-interaction makes it more repeatable.

There are two aspects of writing command-line code that you should be familiar with: 1) getting arguments from the command line; and 2) working with files. The first is done through the `sys.argv` variable, the second is done with the `os.path` package. Here we look at each briefly.

`sys.argv`

Quite simply, this is a list of the command-line arguments. You've seen several unix commands. Suppose we wanted to write the equivalent of the `cp` command, but using a python script. Usually, you run the command like this from the command-line:

```
cp file1 file2
```

That would copy file1 to file2. So our python script will need to get both the source and destination file name. Here is how I would write a simple script to do the same thing as `cp`:

```
import sys
f1 = sys.argv[1]
f2 = sys.argv[2]
print ("copying %s to %s" % (f1,f2))
# Here would be the code to actually copy one file to the other
```

Note: you can try to print out sys.argv on the next command-block. It will show you how this ipython notebook was actually run. But it is not very useful for doing anything practical.

With more complicated code, your command-line arguments may also get rather complicated (you may have optional arguments, switches, etc). There is a really good module in `python` for dealing with such complicated arguments so that your script isn't filled with code just to deal with parsing `sys.argv`. Have a look at the `argparse` module when you get to the point where you need to deal with complicated command-line arguments.
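To give a flavor of `argparse`, here is a minimal sketch of how the `cp`-like script might declare its arguments. The argument names `source` and `destination` and the `--verbose` switch are invented for illustration, and an explicit list is passed to `parse_args()` only so the example runs inside a notebook.

```python
import argparse

parser = argparse.ArgumentParser(description="Copy one file to another")
parser.add_argument("source", help="file to copy from")
parser.add_argument("destination", help="file to copy to")
parser.add_argument("-v", "--verbose", action="store_true",
                    help="print what is being done")

# In a real script you would call parser.parse_args() with no arguments,
# which reads sys.argv. An explicit list is passed here so it runs anywhere.
args = parser.parse_args(["file1", "file2", "--verbose"])

if args.verbose:
    print("copying %s to %s" % (args.source, args.destination))
```

A side benefit is that `argparse` generates a `--help` message from these declarations automatically.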

```
print(sys.argv)
```

`os.path`

Up until now, we have been printing output to the screen. You can still do that with command-line scripts, but once you close down the terminal, that output is lost. Your code will need to write to output files, but likely will also have to read from files. `os.path` gives you some functions that are useful when dealing with files.

You may have to check to see if a file exists, if a folder exists, etc.

```
# Check to see if a file exists
if os.path.isfile('/bin/ls'):
    print("Oh good, you can list files")
if not os.path.isfile('test_output.dat'):
    print("It is safe to use this")
# Check if a folder exists
if os.path.isdir('/tmp'):
    print("you have a tmp folder")
# construct the path to a file using the correct separator
of = os.path.join('tmp','some','file')
print(of)
```

```
# Open a file for writing (note the 'w')
f = open('a_test_file', 'w')
# write a header, always a good idea!
f.write("# This is a test data file. It contains 3 columns\n")
for i in range(10):
    f.write("%d %d %d\n" % (i, 2*i, 3*i))
# We need to "close" the file to make sure it is written to disk
f.close()
```

Now, depending what your current working directory is (see beginning of tutorial), there will be a file called `a_test_file` in that folder. If you like, have a look at it using an editor or file viewer. It will have 10 rows and 3 columns. Note that we needed to have a "newline" (`\n`) at the end of the string we used in the `write()` function, otherwise the output would have been one long line.

Now, let's read the file back into python. You can either read the whole thing in at once as a single string, read it in line-by-line or read all lines in at once as a list.

```
# This time we use 'r' to indicate we only want to read the file
f = open('a_test_file', 'r')
everything = f.read()
# This brings us back to the beginning of the file
f.seek(0)
one_row = f.readline()
f.seek(0)
list_of_rows = f.readlines()
f.close()
print(list_of_rows)
```

Use `print()` to have a look at the data we read in. Note that the rows will include the newline character (`\n`). You can use the string's own `strip()` function (for example, `one_row.strip()`) to get rid of it.
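As a sketch of how stripping and converting might look, the example below works on two made-up rows in the same format that `readlines()` returns. The names `numbers`, `clean`, `items`, and `values` are illustrative; `split()` with no arguments breaks a string apart at whitespace.

```python
# Two rows as readlines() would return them, newline included
list_of_rows = ["0 0 0\n", "1 2 3\n"]

numbers = []
for row in list_of_rows:
    clean = row.strip()     # remove the trailing newline
    items = clean.split()   # split on whitespace into a list of strings
    values = []
    for item in items:
        values.append(int(item))   # convert each string to an integer
    numbers.append(values)

print(numbers)
```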

In the next tutorial (visualization), we'll show you a better way to read in standard format data files like this. But sometimes you'll be faced with data that these more automatic functions can't handle, so it's good to know how to read it in by hand.

You almost never write a script without introducing bugs (the term was popularized by stories from the days of electromechanical computers, when an insect literally interfered with the running of a machine). Luckily, python gives a very nice "traceback" report when it encounters a problem with what you've written. Let's just generate a mistake on purpose and see what happens.

```
x1=np.arange(5)
x2=np.arange(3)
print(x1)
print(x2)
print(x1+x2)
```

This traceback is pretty short, but it shows exactly where the problem occurs (indicated with `---->`). And the explanation is pretty clear (when you're used to `numpy` arrays). You can't add two arrays unless they have the same shape.

These kinds of bugs, where python stops and tells you what went wrong, are the easiest to fix. You've written something wrong (syntax error), or you've done some illegal operation (like above) and your code grinds to a halt. You fix the problem and then find another and so on until your code runs.

But then you might not get the "right" answer. Or your code does something unexpected. Or you get a "divide by zero" error and it's not completely obvious where things went wrong. That's when you need to do real debugging. There are several approaches to dealing with this:

- Print the values of variables throughout the program by simply injecting "print statements" in your code. This is easy to do and works "anywhere". Usually, with a few of these you can see the dumb mistake you made (but python didn't catch).
- Use a debugger. `Spyder` has several debugging tools. You can set breakpoints (where you want your code to stop) and examine the values of variables. But you need to load your program into `Spyder` and run it from there.
- If you want to run your script in the Python interpreter and have it leave all the variables available after a run: `>>> exec(open("myscript.py").read(), globals())`

To wrap this all up, we're going to leave the notebook and get you to write a stand-alone script that can be run from the command-line. The script should:

- get the name of a file from a command-line argument,
- open the file and read its contents (columns of numbers),
- convert columns in the file into `numpy` arrays,
- compute and report the mean of each column.

Try to use the concepts we've covered in this tutorial (e.g. make a function that reads columns and converts to arrays). Run the script on the file we created earlier (`a_test_file`).

Carnegie python links:

Python "experts" at Carnegie

- Shannon Patel (patel@carnegiescience.edu)
- Chris Burns (cburns@carnegiescience.edu)
- Eduardo Banados (#205)

Google. Chances are someone else had the same question you do, asked it, and had it answered on `stackoverflow`.