Slides
princetonpy.com/courseSession
Aims :
Not everything said today will interest everyone. Not all aspects of "scientific computing" interest everybody.
If you're interested in something specific, please ask.
Python has a thriving community of developers, especially in science.
The Zen of Python
C, C++, and Fortran
These languages are very fast, and great for heavy computations. However, they're slow and painful to write - there's no interactivity, the syntax gets complicated, and you have to manage memory manually.
R
A tool for advanced statistics, but the language is exactly that : a tool aimed at stats. It's not very good for general-purpose coding. I have a strange aversion to it. Hey, at least it's free and open-source.
Matlab and Octave
Matlab has a great development environment, and a huge number of optimised, implemented toolsets. It's very expensive though. Octave is a great free clone, but it's not as pleasant to use.
So, Python ?
Huge range of scientific tools - nonlinear function fitting, MCMC, spectral analysis, ODEs and PDEs, signal and image processing, great data science tools. Vast community, active development, and very high quality due to the way the language is developed and the way we code it. It's batteries-included. The downside is that the IDE isn't as shiny as Matlab's, but I'm well over it - IPython Notebooks ( we'll see later ) are awesome.
Python's standard library is huge. Still, as scientists, we require some fairly specific things that pure programmers might not immediately need : reasonable vector notation and manipulation, matrix and linear algebra, optimisation, interpolation, random numbers and statistical functions, plotting, etc. The "standard Python stack" puts together a few modules and extensions to Python to give us these.
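To see why "reasonable vector notation" is on that list, here's a minimal sketch using nothing but built-in Python lists. The variable names are made up for illustration. The point is only that lists don't behave like vectors; array packages such as NumPy are what make v + w mean element-wise addition.

```python
from __future__ import print_function

# Two "vectors" stored as plain Python lists ( illustrative values ).
v = [1.0, 2.0, 3.0]
w = [4.0, 5.0, 6.0]

# With lists, "+" concatenates and "*" repeats : neither is vector arithmetic.
concatenated = v + w
repeated = v * 2

# Element-wise work needs an explicit loop or comprehension in pure Python.
elementwise_sum = [a + b for a, b in zip(v, w)]
dot_product = sum(a * b for a, b in zip(v, w))

print("v + w          :", concatenated)
print("v * 2          :", repeated)
print("element-wise   :", elementwise_sum)
print("dot product    :", dot_product)
```

This works, but it gets tedious and slow for anything beyond toy sizes, which is the gap the stack fills.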
Also very awesome, not covered today :
Then, you may need more specific tools, like a good MCMC sampler ( PyMC ), or constrained, nonlinear function fitting ( lmfit ), or machine learning algorithms ( scikit-learn ), image processing ( scikit-image ), etc. The list goes on !
In this workshop, we'll get set up with the basic Python stack and explore how it works. Then we'll demo some other packages for fun.
I find that Anaconda is a great distribution. It comes with most of the packages you'll need, and a great command-line package manager to help keep them up to date and install others. If you want to grab the faster distro ( free for academics ), head to store.continuum.io/cshop/academicanaconda and register with an @blah.edu address to grab an academic license. If you can't be bothered with that, it's at continuum.com/downloads.
If you don't want Anaconda, you can install Python on its own, and then add packages and modules as you need them. I won't be covering this in the interest of time. If you're under Linux, use your package manager; if you're under OSX, then HomeBrew has what you need. If you're running Windows and are interested in installing everything yourself, cry a little and then head to python.org/getit.
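Whichever route you take, it can help to check what actually got installed. Here's a rough sketch, not an official tool : check_packages is a helper name I made up, and the list of package names is just my guess at what you'd want to look for. It tries to import each name and reports a version string where the package exposes one.

```python
from __future__ import print_function
import importlib


def check_packages(names):
    """Return {name: version string, or None if the import fails}."""
    status = {}
    for name in names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            # Not installed ( or not importable under this name ).
            status[name] = None
        else:
            # Many packages expose __version__, but not all of them do.
            status[name] = getattr(module, "__version__", "unknown version")
    return status


# Hypothetical wish list : edit to taste.
wanted = ["numpy", "scipy", "matplotlib", "lmfit"]
for name, version in sorted(check_packages(wanted).items()):
    print(name, ":", version if version is not None else "MISSING")
```

Anything that shows up as MISSING can then be added with your package manager of choice.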
A note on what we're installing : Anaconda comes with Python 2.7. This is by far the most widespread version of Python. It goes by Python 2 for short. There has been, for years, a Python 3, but it isn't backwards compatible. Many of the main scientific packages are working to fix that, but for now the vast majority of scientists use Python 2 ( actually, just about everyone : the Python dudes themselves recommend starting with 2.7, due to compatibility; this is changing however, as more packages move towards Py3 compatibility ).
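To make "isn't backwards compatible" concrete, here's a small sketch of two of the best-known differences : print became a function in Python 3, and / between integers stopped truncating. The __future__ imports below are a standard way to get the Python 3 behaviour in 2.7, so this snippet should run identically under both; the variable names are just for illustration.

```python
# Opt in to Python 3 behaviour when running under Python 2.7.
from __future__ import print_function, division
import sys

# Which interpreter are we actually on ?
print("Running Python", sys.version_info[0])

# With "division" imported, / is true division on both versions.
# ( Without it, Python 2 would give 7 / 2 == 3. )
true_quotient = 7 / 2
# // is floor division on both versions.
floor_quotient = 7 // 2

print("7 / 2  =", true_quotient)
print("7 // 2 =", floor_quotient)
```

Putting those __future__ lines at the top of your Python 2 scripts is a cheap way to make a later move to Python 3 less painful.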
Front-End / Interface
Here, we have several options. With Python now installed, you just need an environment to write it in. You could use a standard text editor. I recommend Atom, for all platforms.
Then, you could go with something more advanced, like an IDE ( integrated development environment ). An IDE is a one-stop shop to write and run your code. For Python, I like Spyder. It's available for all three of the above OSs. If you've installed Anaconda, you already have Spyder. Linux and OSX users, call spyder from the command line. Windows users can call the Anaconda Launcher, and you'll have it there.
Finally, there's my favourite way to write Python for development : the IPython Notebook. If you installed Anaconda, you already have IPython. Otherwise, go get it, you won't regret it. The IPython Notebook concept will be familiar to you if you've used Mathematica, and some aspects of Matlab ( though in Matlab, it's not done so well ). You have cells in which you write code, and you can execute cells independently. With a quick command-line hack, you can even get plots inline, such that all plots show up under the relevant cells. To launch IPython Notebooks, OSX and Linux users can just call ipython notebook from the command line, and Windows users with Anaconda have an IPython Notebook shortcut in their Start Menu ( in theory ). For inline plots and sexiness all around, I prefer calling ipython notebook --script. Here, --script tells IPython to also save a .py file alongside the .ipynb file, so you can just run your code from the command line or on a remote computer if you want to. Linux and OSX users can write this as an alias if they want : drop the line
alias pynb='ipython notebook --script'
in ~/.bashrc, and Windows users can edit the shortcut to their IPython Notebook to get the same result.
Take the time to consider your workflow and select an option.
We'll be doing a few exercises throughout.