This notebook was put together by [Jake Vanderplas](http://www.vanderplas.com) for UW's [Astro 599](http://www.astro.washington.edu/users/vanderplas/Astr599/) course. Source and license info is on [GitHub](https://github.com/jakevdp/2013_fall_ASTR599/).

Welcome to the Python Boot Camp!!

Much of this material is courtesy of http://www.pythonbootcamp.info/

Objectives

  • Introduce you to the Python language
  • Get you writing Python code
  • Convince you of Python's utility in your research life
  • Encourage good coding and data-management practices
  • Don't proselytize Python... too much

Organization

  • Thursday-Friday, 9:00-12:00 & 1:00-4:00
  • Broken into 1-hour "modules"
  • Short breakout coding sessions after each module
  • Actual coding $\to$ actual learning

Connecting to Me

Jake Vanderplas

1. What is Python?

2. Why should a scientist use Python?

3. Getting started: four ways to use Python

What is Python?

Python is an interpreted, object-oriented, high-level programming language with dynamic semantics. Its high-level built in data structures, combined with dynamic typing and dynamic binding, make it very attractive for Rapid Application Development, as well as for use as a scripting or glue language to connect existing components together. Python's simple, easy to learn syntax emphasizes readability and therefore reduces the cost of program maintenance. Python supports modules and packages, which encourages program modularity and code reuse. The Python interpreter and the extensive standard library are available in source or binary form without charge for all major platforms, and can be freely distributed.

http://www.python.org/doc/essays/blurb/

What is Python?

**Interpreted**: No need for a compiling stage
**Object-oriented**: Objects: complex data structures with attributes and methods
**High-level**: Abstraction from the way the machine actually executes
**Dynamic**: Variables can change meaning on-the-fly
**Built-in**: Fewer external requirements
**Data structures**: Ways of storing/manipulating data
**Script/Glue**: Code that controls other programs
**Typing**: The kind of variable (int, string, float)
**Syntax**: Grammar which defines the language
**Library**: Reusable collection of code
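A few of these terms are easier to see than to define. The snippet below is a minimal sketch, not part of the original slides; the variable names and values are made up for illustration, and it is written to run unchanged under both Python 2.7 and 3.X.

```python
# Dynamic typing: the same name can refer to values of different types.
x = 42
type_before = type(x).__name__      # "int"
x = "forty-two"
type_after = type(x).__name__       # "str"

# Built-in data structures: lists and dictionaries need no imports.
measurements = [3.2, 1.5, 4.8]                   # hypothetical values
target = {"name": "example", "mag": 12.5}        # hypothetical record

# Object-oriented: values are objects that carry their own methods.
measurements.sort()                 # sorts the list in place
shout = "hello".upper()             # strings have methods too

print("x was an %s and is now a %s" % (type_before, type_after))
print("sorted measurements: %s" % measurements)
print("target name: %s" % target["name"])
```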

History of Python

  • Started over Christmas break 1989, by Guido van Rossum
  • Development in the early '90s
  • Guido is the "BDFL": the Benevolent Dictator for Life

"...in December 1989, I was looking for a 'hobby' programming project that would keep me occupied during the week around Christmas. My office ... would be closed, but I had a home computer, and not much else on my hands. I decided to write an interpreter for the new scripting language I had been thinking about lately: a descendant of ABC that would appeal to Unix/C hackers. I chose Python as a working title for the project, being in a slightly irreverent mood (and a big fan of Monty Python's Flying Circus)."

Aside... the glory of Monty Python

History of Python

  • Open-source development from the start (currently under the BSD-style Python Software Foundation License)

  • Large development community

  • Version 2.0 (2000), 2.6 (2008), 2.7 (2010)

    • We'll use version 2.7.5 for this course
  • Version 3.X (2008) is not backward compatible

    • But 2.7 code can be migrated to 3.X relatively easily
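Two of the most visible differences between 2.X and 3.X are `print` and integer division. As a rough sketch (not from the original slides), the code below avoids both pitfalls and should behave the same under either version.

```python
# print is a statement in 2.X (print "hi") and a function in 3.X (print("hi")).
# With a single argument, the parenthesized form works in both.
print("hello world")

# 7 / 2 gives 3 in 2.X but 3.5 in 3.X.
# Being explicit about the kind of division gives the same answer in both.
floor_result = 7 // 2      # floor division
true_result = 7 / 2.0      # true division, forced by the float operand

print("7 // 2 = %d, 7 / 2.0 = %.1f" % (floor_result, true_result))
```

In 2.7, placing `from __future__ import print_function, division` at the top of a file switches on the 3.X behavior for both features.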

Why should a Scientist use Python?

C, C++, Fortran

Pros:

  • Great performance; lots of legacy scientific computing codes

Cons:

  • Syntax not optimized for casual programming
  • No interactive facilities
  • Difficult visualization, text processing, etc.

Why should a Scientist use Python?

IDL, Matlab, Mathematica

Pros:

  • Interactive with easy visualization tools
  • Extensive scientific libraries

Cons:

  • Costly and proprietary
  • Unpleasant for large-scale computing & non-mathematical tasks

Why should a Scientist use Python?

  • Python is free (BSD-style license) and highly portable (Linux, Mac OSX, Windows, etc.)
  • Interactive interpreter
  • Extremely readable syntax
  • Simple: non-professional programmers can use it effectively
    • great documentation
    • memory management is taken care of

Why should a Scientist use Python? (cont'd)

  • Clean (and optional) object-oriented model
  • Rich collection of built-in types, from simple to compound
  • Comprehensive standard library
  • Well-established 3rd-party packages (NumPy, SciPy, Matplotlib)
  • Easy wrapping of legacy code (C, C++, Fortran)

Why should a Scientist use Python? (cont'd)

  • Python mastery is a marketable skill-set.
  • By comparison, virtually no industries use IDL
    • (sorry...)

Why should a Scientist use Python? (cont'd)

Amazingly Scalable

  • Allows interactive experimentation
  • Code can be one-line scripts or million-line projects
  • Used by novices and full-time professionals alike

The Kitchen Sink

  • Python can do anything you want with impressive simplicity

Performance when you need it

  • As an interpreted language, Python can be slow
  • But there are good options to get around this (as we'll see)
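One such option, chosen here purely for illustration and not necessarily the one later modules focus on, is to hand loops to built-in functions that run in compiled code. The sketch below compares a hand-written loop with the built-in `sum`; the helper name `loop_sum` and the list size are arbitrary, and the timings will vary by machine.

```python
import timeit

values = list(range(100000))   # integers, so both sums are exact


def loop_sum(data):
    """Add the numbers one at a time in interpreted Python."""
    total = 0
    for v in data:
        total += v
    return total


loop_result = loop_sum(values)
builtin_result = sum(values)   # the same loop, run inside the interpreter's C code

# Time a few repetitions of each approach.
loop_time = timeit.timeit(lambda: loop_sum(values), number=5)
builtin_time = timeit.timeit(lambda: sum(values), number=5)

print("results agree: %s" % (loop_result == builtin_result))
print("loop: %.4f s, built-in sum: %.4f s" % (loop_time, builtin_time))
```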

LSST Group uses it for...

Data reduction & Analysis

Wrapping of LSST C++ Pipeline

Quick Calculations

Building websites

Quick Visualization

Plots for Publications

IPython Notebook in Action:


In [7]:
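%pylab inline

# Scatter N random points on a polar axis
N = 200
r = 2 * rand(N)                  # radii, uniform in [0, 2)
theta = 2 * pi * rand(N)         # angles, uniform in [0, 2*pi)
area = 200 * r ** 2 * rand(N)    # marker areas tend to grow with radius
ax = plt.subplot(111, polar=True)
scatter(theta, r, c=theta, s=area, alpha=0.75)  # color encodes the angle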


Populating the interactive namespace from numpy and matplotlib
Out[7]:
<matplotlib.collections.PathCollection at 0x102f44510>

IPython Notebook:

Gaining major acceptance as a computing & demo platform


In [11]:
from IPython.display import YouTubeVideo
YouTubeVideo('tlontoyWX70', start=30)


Out[11]:

Visualization:


In [22]:
!curl -O http://www.astroml.org/_downloads/fig_moving_objects_multicolor.py
%run fig_moving_objects_multicolor.py


  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  3747  100  3747    0     0   7845      0 --:--:-- --:--:-- --:--:-- 27152

Time to Get Started...

but first an interlude

Getting Started: Four Ways to Use Python

1. The Python command-line interpreter

2. Editing Python (.py) files

3. The IPython command-line interpreter

4. The IPython notebook

1. The Python command-line Interpreter

If you have never used the command-line, you're in for a treat

  • Mac OSX: in Finder/Applications, search for "Terminal"

  • Linux/Unix: Ctrl-Alt-t

  • Windows: run "cmd"

1. The Python command-line Interpreter

Type `python` at the command-line to start the interpreter

1. The Python command-line Interpreter

Execute a command: Type `print "hello world"`

1. The Python command-line Interpreter

Exiting the interpreter:

  • Either type `exit()` or press Ctrl-d (on Windows, Ctrl-z followed by Enter)

2. Editing Python (.py) files

This requires a text editor.

The best option is one which includes code highlighting

  • Linux: gedit, emacs, nano, vim...
  • Mac OSX: textmate, emacs, nano, vim...
  • Windows: NotePad...

GUI-based editors with bells & whistles

  • Linux: KWrite, Scribes, eggy
  • Mac OSX: TextWrangler, SublimeText
  • Windows: NotePad++, SublimeText

2. Editing Python (.py) files

Use your editor to open `hello_world.py` (here we use OSX's `mate`)

Edit the file to say `print "hello world"`

In the terminal, run `python hello_world.py`
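For reference, a complete `hello_world.py` can be as short as the sketch below. The `message` variable is an addition for illustration, and the parenthesized `print` is used so that the same file runs under both 2.7 and 3.X.

```python
# hello_world.py: a minimal script
# Save this file, then run it from the terminal with: python hello_world.py

message = "hello world"
print(message)
```

Running it should print `hello world` and return you to the terminal prompt.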

3. The IPython command-line Interpreter

IPython provides an enhanced command-line interface

It can be started by typing `ipython`:

Useful features include tab completion, help (?), etc.
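IPython's `?` and tab completion are typed at the prompt and are not valid Python in a `.py` file. As a rough guide to what they do, the sketch below shows approximate plain-Python equivalents. The choice of the `math` module is arbitrary.

```python
import math

# IPython: typing `math.sqrt?` displays the function's documentation.
# Plain Python: read the docstring directly (help(math.sqrt) shows it in a pager).
doc = math.sqrt.__doc__
print(doc)

# IPython: typing `math.sq` and pressing Tab lists the matching names.
# Plain Python: filter the output of dir() for the same prefix.
matches = [name for name in dir(math) if name.startswith("sq")]
print(matches)
```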

3. The IPython command-line Interpreter

Basic use is just like the standard interpreter:

4. The IPython Notebook

The IPython notebook can be started by typing `ipython notebook`:

4. The IPython Notebook

Your web browser should open to an interactive notebook page

Notice that this slideshow is written as an IPython notebook!

Conclusions

  • We've introduced Python and motivated its use in science

  • We've talked about four ways to use Python

  • Next up: basic training - getting started in Python