PyShop Session 2

Exercises

These exercises are meant to help you to get to know the big four packages: NumPy, SciPy, matplotlib, and Pandas. Given there is much to cover, the problems will be fairly simple, but more complex applications will be covered in later courses.

The questions are in increasing difficulty, where the first question should take you less than a minute and the last one you might not be able to figure out. Good luck!

Note: I appologize for the solutions and questions not being next to each other, but there is a numbering issue in the Markdown that generates this text. Sorry, but it is a known bug that has yet to be fixed!

NumPy and SciPy Questions

These questions will be built around solving the following problem based on the static version of "Productivity Losses from Financial Frictions: Can Self-Financing Undo Capital Misallocation?" (AER, 2014) by Benjamin Moll:

In an economy populated by heterogeneous entrepreneurs, indexed by their productivity $z$, who face a borrowing constraint and have access to constant returns to scale production, it can be shown that the capital decision is bang bang. That is, agents either produce or don't, and if they produce they borrow up to their constraint.

If we define $g(a, z)$ the joint distribution of wealth and productivity, it can be shown that the equilibrium aggregat productivity satisfies

$$ \begin{align} Z &= \left( \frac{\int_\underline{z}^\infty z \omega(z)dz}{1 - \Omega(\underline{z})} \right)^\alpha \\ &\lambda(1 - \Omega(\underline{z})) = 1\\ &\Omega(z) = \int_0^z\omega(x)dx \end{align} $$

Where $\lambda$ is the tightness of the borrowing constraint, $\underline{z}$ is the lower bound of productivity for producers, $\alpha$ is the elasticity of substitution of capital for labor in the cobb-douglas production function, and $\omega(z)$ is the wealth share of agents with productivity $z$.

Create a numpy array representing a discretization of the distribution of wealth shares. You are free to choose its characteristics, but it must contain only positive entries and integrate to one. More explicitly, create a vector containing values of $\omega(z)$ over the support of $z$.
Since $\omega(z)$ is defined as a distribution, define its support as a vector on $\mathbb{R}_+$. Again, you're free to choose this vectors characteristics, as long as it is the same length as $\omega$.
Define a function that will calculate the integral $\Omega(z)$ using Simpson's rule.
Use Python to solve numerically for the value of $\underline{z}$, conditional on a choice of $\lambda$ and $\omega$. Note: The second equation pins down $\underline{z}$. $\lambda$ must be greater than 1.

Keep in mind that numerical methods are often unstable. You will find different convergence properties for different methods and values of lambda, so I encourage you to study the results and try different combinations
We won't study the first equation, as the upper bound can be problematic. However, as a final test, generate results for the level $\underline{z}$ for different values of $\lambda$ and plot the results inline.
Food for thought: Change the distribution of productivity around. You'll notice that as lambda rises, the lower bound of productivities for producers collapses to the upper bound of the assumed distribution. What does this say about the relationship between financial market completeness and productivity?

matplotlib questions

This question is based on Mike Hewner's page about "Making Simple Fractals in R".

The "King's Dream Fractal" was created by a science fiction author (who also happens to have a PhD in Molecular Biophysics and Biochemistry from Yale). The formula for the points is recursive:

$$ \begin{align} x_{n+1} &= \sin(y_{n}b) + c\sin(x_{n}b)\\ y_{n+1} &= \sin(x_{n}a) + d\sin(y_{n}a) \end{align} $$

First, write a function that takes in a number, $N$, and a vector of constants $(a, b, c, d)$, returning $N$ points in the fractal. Try $N = 10,000$ points (NOTE: $100,000$ points causes overflow for me...). Assume an initial condition $(x_0, y_0) = (0.1, 0.1)$.
Try directly plotting the points. What do you get? What is the problem? How would you suggest fixing this?
Try several different methods to make the plot more legible. For more on line properties, visit http://matplotlib.org/users/pyplot_tutorial.html#controlling-line-properties.
Add title, x and y labels, and a legend to your plot. More partiularly, use latex to define your labels so they look nice.
Finally, overlay the system of equations in a legend. (Hint: if you're struggling, keep in mind you need to give legend a "handle".)

Pandas Questions

This question is based on some work I'm doing (it's easy because I already have the cleaned up data).

Import the data found in the cex.csv file accompanying these exercises into a DataFrame.

This data set contains observations for households over several quarters, including identifiers, quarter of interview, year of interview, etc.
The variable 'newid' is a combination of a unique consumer unit identifier and the interview number, where the last digit is the interview number. Split this number and save it as two new variables called 'cu_id' and 'interview_num'.
Define a heirarchical index based on the two variables you just created.
Use the get_values method to retrieve the column 'cuid'. Set this as an array called cuid. Note: this would be much easier if you'd copied it before setting the index, or, if you want to get fancy, if you reset the index, but this is good practice.
Delete all of the non-unique entrieds in cuid
Generate a random data frame by sampling from cuid without replacement and generating some new variable, e.g. 'ice_cream_consumed'.
Generate two new data frames by adding this column to your data set in different ways: use join, use merge. Hint: If you run into problems, think about data types. Keep in mind that heirarchical indexing makes adding a column impossible. Finally, try sorting your DataFrames. Python often struggles with unsorted objects.