Python for Scientific Computing in Economics

... background material available at https://github.com/softecon/talks

Why Python?

general-purpose	widely used
high-level	readability
extensibility	active community
numerous interfaces



In [1]:

    
import this

The Zen of Python, by Tim Peters

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!

Why Python for Scientific Computing?

Python is used by computer programmers and scientists alike. Thus, tools from software engineering are readily available.
Python is an open-source project. This ensures that all implementation details can be critically examined. There are no licence costs and low barriers to recomputability.
Python can be easily linked to high-performance languages such as C and Fortran. Python is ideal for prototyping with a focus on readability, design patterns, and ease of testing.
Python has numerous high-quality libraries for scientific computing under active development.

What do you need to get started?

SciPy Stack	Basic Example
Integrated Development Environment	Additional Resources

First things first, here is the "Hello, World!" program in Python.



In [2]:

    
print("Hello, World!")

Hello, World!

SciPy Stack

Most of our required tools are part of the SciPy Stack, a collection of open source software for scientific computing in Python.

SciPy Library	NumPy
Matplotlib	pandas
SymPy	IPython
nose

Depending on your particular specialization, other packages might be of additional interest to you, e.g. statsmodels for statistical modeling.

Basic Example

To get a feel for the language, let us work with a basic example. We will set up a simple Ordinary Least Squares (OLS) model.

$$Y = X\beta + \epsilon$$

We start by simulating a synthetic dataset. Then we fit a basic OLS regression and assess the quality of its prediction.
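
As a reminder of what the fit computes, here is a short sketch of the textbook derivation, assuming that $X$ has full column rank so that $X'X$ is invertible. The OLS estimator minimizes the sum of squared residuals,

$$\hat{\beta} = \arg\min_{\beta} \; (Y - X\beta)'(Y - X\beta).$$

Setting the derivative with respect to $\beta$ to zero gives the normal equations and their solution,

$$X'X\hat{\beta} = X'Y \quad\Rightarrow\quad \hat{\beta} = (X'X)^{-1}X'Y,$$

and the fitted values are $\hat{Y} = X\hat{\beta}$.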

Alternatives

Terminal
Jupyter Notebook

Pseudorandom Number Generation



In [3]:

    
# Import relevant libraries from the SciPy Stack
import numpy as np

# Specify parametrization
num_agents = 1000
num_covars = 3

betas_true = np.array([0.22, 0.30, -0.1]).T

# Set a seed to ensure recomputability in light of randomness
np.random.seed(4292367295)

# Sample exogenous agent characteristics from a uniform distribution in 
# a given shape
X = np.random.rand(num_agents, num_covars)

# Sample random disturbances from a standard normal distribution and rescale
eps = np.random.normal(scale=0.1, size=num_agents)

# Construct endogenous agent characteristic
Y = np.dot(X, betas_true) + eps
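
To see what the seed buys us, the following sketch draws the same kind of sample several times. The helper `draw_sample` and the seeds 123 and 456 are made up for illustration and are not part of the original example.

```python
import numpy as np


def draw_sample(seed, size=5):
    """Hypothetical helper: seed the global generator, then draw normals."""
    np.random.seed(seed)
    return np.random.normal(scale=0.1, size=size)


first = draw_sample(123)
second = draw_sample(123)
other = draw_sample(456)

# Same seed, same draws: the simulation can be recomputed exactly
print(np.array_equal(first, second))

# A different seed yields a different sample
print(np.array_equal(first, other))
```

Note that `np.random.seed` sets the state of NumPy's global generator, so every later draw in the same session depends on it.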

Statistical Analysis



In [4]:

    
# Import relevant libraries from the SciPy Stack
import statsmodels.api as sm

# Specify and fit the model
rslt = sm.OLS(Y, X).fit()



In [5]:

    
# Provide some summary information
print(rslt.summary())

                            OLS Regression Results                            
==============================================================================
Dep. Variable:                      y   R-squared:                       0.835
Model:                            OLS   Adj. R-squared:                  0.834
Method:                 Least Squares   F-statistic:                     1682.
Date:                Wed, 03 Feb 2016   Prob (F-statistic):               0.00
Time:                        17:49:15   Log-Likelihood:                 864.68
No. Observations:                1000   AIC:                            -1723.
Df Residuals:                     997   BIC:                            -1709.
Df Model:                           3                                         
Covariance Type:            nonrobust                                         
==============================================================================
                 coef    std err          t      P>|t|      [95.0% Conf. Int.]
------------------------------------------------------------------------------
x1             0.2281      0.009     24.624      0.000         0.210     0.246
x2             0.2841      0.009     30.353      0.000         0.266     0.302
x3            -0.0898      0.009     -9.484      0.000        -0.108    -0.071
==============================================================================
Omnibus:                        7.301   Durbin-Watson:                   1.964
Prob(Omnibus):                  0.026   Jarque-Bera (JB):                7.417
Skew:                           0.202   Prob(JB):                       0.0245
Kurtosis:                       2.877   Cond. No.                         3.22
==============================================================================

Warnings:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
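
The coefficients and standard errors can also be computed by hand with NumPy alone. The sketch below recreates the sample from the parametrization above and solves the normal equations $X'X\hat{\beta} = X'Y$. It is meant as an illustration of what the fit does, not as a description of how statsmodels works internally. Names such as `betas_hat`, `std_err`, and `rmse` are my own choices.

```python
import numpy as np

# Recreate the synthetic sample with the parametrization used above
num_agents, num_covars = 1000, 3
betas_true = np.array([0.22, 0.30, -0.1])

np.random.seed(4292367295)
X = np.random.rand(num_agents, num_covars)
eps = np.random.normal(scale=0.1, size=num_agents)
Y = np.dot(X, betas_true) + eps

# Solve the normal equations X'X b = X'Y for the OLS estimates
XtX = np.dot(X.T, X)
betas_hat = np.linalg.solve(XtX, np.dot(X.T, Y))

# Cross-check against NumPy's least-squares routine
betas_lstsq = np.linalg.lstsq(X, Y, rcond=None)[0]

# Residuals and the degrees-of-freedom corrected error variance
resid = Y - np.dot(X, betas_hat)
sigma2_hat = np.dot(resid, resid) / (num_agents - num_covars)

# Nonrobust standard errors from the diagonal of sigma^2 (X'X)^{-1}
std_err = np.sqrt(np.diag(sigma2_hat * np.linalg.inv(XtX)))

# Root mean squared error as a simple measure of prediction quality
rmse = np.sqrt(np.mean(resid ** 2))

print("estimates      ", betas_hat)
print("standard errors", std_err)
print("RMSE           ", rmse)
```

If the random draws match those of the cell above, the printed estimates and standard errors should agree with the `coef` and `std err` columns of the summary up to rounding.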

Data Visualization



In [6]:

    
# Import relevant libraries from the SciPy Stack
import matplotlib.pyplot as plt

# Initialize canvas
# (the axisbg keyword was removed from Matplotlib; facecolor replaces it)
ax = plt.figure(figsize=(12, 8)).add_subplot(111, facecolor='white')

# Plot actual and fitted values
ax.plot(np.dot(X, rslt.params), Y, 'o', label='True')
ax.plot(np.dot(X, rslt.params), rslt.fittedvalues, 'r--.', label="Predicted")

# Set axis labels and ranges
ax.set_xlabel(r'$X\hat{\beta}$', fontsize=20)
ax.set_ylabel(r'$Y$', fontsize=20)

# Remove first element on y-axis
ax.yaxis.get_major_ticks()[0].set_visible(False)

# Add legend
plt.legend(loc='upper center', bbox_to_anchor=(0.50, -0.10),
    fancybox=False, frameon=False, shadow=False, ncol=2, fontsize=20)

# Add title
plt.suptitle('Synthetic Sample', fontsize=20)

# Save figure
plt.savefig('images/scatterplot.png', bbox_inches='tight', format='png')



In [7]:

    
from IPython.display import Image
Image(filename='images/scatterplot.png', width=700, height=700)

    Out[7]: [Figure: "Synthetic Sample", actual and predicted values of Y plotted against X\hat{\beta}]

Integrated Development Environment

PyCharm

PyCharm is developed by the Czech company JetBrains and is free to use for educational purposes. As a commercial product, it is very well documented, and numerous resources are available to get you started.

If you would like to check out some alternatives: (1) Spyder, (2) PyDev.

Potential Benefits

Unit Testing Integration
Graphical Debugger
Version Control Integration
Coding Assistance
- Code Completion
- Syntax and Error Highlighting
- ...

Let us check it all out for our Basic Example.
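
To make the unit testing integration concrete, here is a minimal sketch of what tests for the Basic Example could look like. It uses the standard library's `unittest`, which test runners such as nose can collect. The helpers `simulate_sample` and `fit_ols`, the seed, and the tolerance are assumptions made for this illustration.

```python
import unittest

import numpy as np


def simulate_sample(num_agents, betas, seed):
    """Hypothetical helper: simulate (Y, X) from Y = X beta + eps."""
    rng = np.random.RandomState(seed)
    X = rng.rand(num_agents, len(betas))
    eps = rng.normal(scale=0.1, size=num_agents)
    return np.dot(X, betas) + eps, X


def fit_ols(Y, X):
    """Hypothetical helper: least-squares estimate of beta."""
    return np.linalg.lstsq(X, Y, rcond=None)[0]


class TestBasicExample(unittest.TestCase):

    def setUp(self):
        self.betas_true = np.array([0.22, 0.30, -0.1])

    def test_same_seed_gives_same_sample(self):
        Y_one, X_one = simulate_sample(100, self.betas_true, seed=123)
        Y_two, X_two = simulate_sample(100, self.betas_true, seed=123)
        np.testing.assert_array_equal(X_one, X_two)
        np.testing.assert_array_equal(Y_one, Y_two)

    def test_estimates_close_to_truth(self):
        Y, X = simulate_sample(5000, self.betas_true, seed=123)
        np.testing.assert_allclose(fit_ols(Y, X), self.betas_true, atol=0.05)


# Run the tests explicitly so the snippet also works inside a notebook
suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestBasicExample)
result = unittest.TextTestRunner(verbosity=2).run(suite)
```

In an IDE, tests like these are typically run and inspected from the graphical test runner instead of the explicit `TextTestRunner` call.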

Graphical User Interface

Conclusion

Next Steps

Set up your machine for scientific computing with Python
- Visit Continuum Analytics and download Anaconda for your own computer. Anaconda is a free Python distribution with all the required packages to get you started.
- Install PyCharm. Make sure to hook it up to your Anaconda distribution (instructions).
Check out the additional resources to dive deeper into the details.
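
Once everything is installed, a quick check along the following lines can confirm that the packages are visible to your Python interpreter. This is only a sketch: the list of distribution names mirrors the tools mentioned in this talk, and the helper `installed_version` is hypothetical. It requires Python 3.8 or later for `importlib.metadata`.

```python
from importlib.metadata import PackageNotFoundError, version

# Distribution names are an assumption based on the tools named in this talk
PACKAGES = ["numpy", "scipy", "matplotlib", "pandas", "sympy", "ipython",
            "statsmodels"]


def installed_version(name):
    """Return the installed version of a distribution, or None if missing."""
    try:
        return version(name)
    except PackageNotFoundError:
        return None


report = {name: installed_version(name) for name in PACKAGES}

for name, found in report.items():
    status = found if found is not None else "NOT INSTALLED"
    print("{:<12} {}".format(name, status))
```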

Additional Resources

Gaël Varoquaux, Emmanuelle Gouillart, Olaf Vahtras (eds.). SciPy Lecture Notes, available at http://www.scipy-lectures.org.
Hans Petter Langtangen. A Primer on Scientific Programming with Python, Springer, New York, NY.
Thomas J. Sargent, John Stachurski (2016). Quantitative Economics. Online Lecture Notes.
Software Engineering for Economists Initiative, Online Resources.

Numerous additional lecture notes, tutorials, online courses, and books are available online.

Contact

Philipp Eisenhauer

Mail eisenhauer@policy-lab.org

Web http://eisenhauer.io

Repository https://github.com/peisenha

Software Engineering for Economists Initiative

Overview http://softecon.github.io

Repository https://github.com/softEcon


