Numerical Programming: Fundamental matrix and array processing capabilities are provided by the excellent NumPy
library. NumPy provides the basic array data type plus some simple processing operations.
In [1]:
import numpy as np # load the library
a = np.linspace(-np.pi, np.pi, 100) # Create array
b = np.cos(a) # Apply cosine to each element of a
c = np.ones(25) # An array of 25 ones
np.dot(c,c) # Compute inner product
Out[1]:
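As a small follow-up sketch (not part of the original cell), the same arrays can be combined with elementwise arithmetic. The names `identity` and `inner` are illustrative only.

```python
import numpy as np

a = np.linspace(-np.pi, np.pi, 100)   # same grid as in the cell above
b = np.cos(a)

# Arithmetic on arrays is applied element by element, so this evaluates
# cos^2 + sin^2 at every grid point without an explicit loop
identity = b**2 + np.sin(a)**2

c = np.ones(25)
inner = c @ c                          # the @ operator gives the same inner product as np.dot(c, c)

print(np.allclose(identity, 1.0), inner)
```

Both `np.dot` and `@` compute the inner product of one-dimensional arrays, so either form can be used here.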
The SciPy library is built on top of NumPy and provides additional functionality.
In [2]:
from scipy.stats import norm
from scipy.integrate import quad
phi = norm()
value, error = quad(phi.pdf, -2,2) # Integrate using Gaussian quadrature
value
Out[2]:
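One way to sanity-check the integral, sketched here under the assumption that the distribution's own `cdf` method is an acceptable reference, is to compare the result of `quad` with the difference of the normal CDF at the two endpoints. The name `exact` is illustrative.

```python
from scipy.stats import norm
from scipy.integrate import quad

phi = norm()                                  # standard normal distribution
value, error = quad(phi.pdf, -2, 2)           # numerical integral and an error estimate

# The same probability computed from the cumulative distribution function
exact = phi.cdf(2) - phi.cdf(-2)

print(value, exact, error)
```

The two numbers should agree to many decimal places, and `error` is the error bound reported by `quad` itself.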
SciPy includes many of the standard routines used in:
The most popular and comprehensive Python library for creating figures and graphs is Matplotlib.
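A minimal sketch of a Matplotlib figure might look like the following. The output filename `trig.png` is hypothetical, and the `Agg` backend line is only there so the script runs without a display; inside a notebook it can be dropped.

```python
import matplotlib
matplotlib.use("Agg")            # non-interactive backend; not needed inside a notebook
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(-np.pi, np.pi, 100)

fig, ax = plt.subplots()
ax.plot(x, np.cos(x), label="cos(x)")   # first curve
ax.plot(x, np.sin(x), label="sin(x)")   # second curve on the same axes
ax.set_xlabel("x")
ax.legend()

fig.savefig("trig.png")          # hypothetical filename; writes the figure to disk
```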
More example plots can be found in the Matplotlib documentation.
Other graphics libraries include:
Symbolic Algebra: It's useful to be able to manipulate symbolic expressions, as in Mathematica or Maple. The SymPy library provides this functionality from within the Python shell.
In [3]:
from sympy import Symbol
x, y = Symbol('x'), Symbol('y') # Treat 'x' and 'y' as algebraic symbols
print(x + x + x + y)
In [4]:
expression = (x+y)**2
expression.expand()
Out[4]:
In [5]:
from sympy import solve
# Solving polynomials
solve(x**2 + x + 2)
Out[5]:
In [6]:
from sympy import limit, sin, diff
limit(1 / x, x, 0)
Out[6]:
In [7]:
limit(sin(x)/x , x, 0)
Out[7]:
In [8]:
diff(sin(x),x)
Out[8]:
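The cells above can be cross-checked with a few more SymPy calls. This is a sketch rather than part of the original notebook, and the names `refactored`, `left`, `right`, and `residuals` are illustrative. Note that `limit` takes the limit from the right by default, so the one-sided limit of 1/x from the left has to be requested with `dir='-'`.

```python
from sympy import Symbol, expand, factor, limit, oo, simplify, solve

x, y = Symbol('x'), Symbol('y')

# Expanding and then factoring should recover the original expression
expanded = expand((x + y)**2)
refactored = factor(expanded)

# One-sided limits of 1/x at zero
right = limit(1 / x, x, 0)            # default direction is from the right
left = limit(1 / x, x, 0, dir='-')    # approach zero from the left

# Substitute each root back into the polynomial; every residual should be zero
roots = solve(x**2 + x + 2, x)
residuals = [simplify(r**2 + r + 2) for r in roots]

print(refactored, right, left, residuals)
```

The polynomial has no real roots, so `solve` returns a pair of complex conjugates.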
Python’s data manipulation and statistics libraries have improved rapidly over the last few years. Pandas is
one of the most popular libraries for working with data. Pandas is fast, efficient, and well designed.
In [9]:
import numpy as np
import pandas as pd
data = np.random.randn(5, 2) # Create 5x2 matrix of random numbers (scipy.randn is gone from current SciPy)
dates = pd.date_range('2010-12-28', periods=5) # ISO date avoids day/month ambiguity
df = pd.DataFrame(data, columns=('price', 'weight'), index=dates)
print(df)
In [10]:
df.mean()
Out[10]:
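For a reproducible variant of this example, the sketch below seeds a NumPy random generator (the seed value 0 is arbitrary) and also shows label-based slicing on the date index. The names `col_means` and `from_30th` are illustrative.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)                 # arbitrary seed, for reproducibility
data = rng.standard_normal((5, 2))             # 5x2 matrix of standard normal draws
dates = pd.date_range('2010-12-28', periods=5)
df = pd.DataFrame(data, columns=['price', 'weight'], index=dates)

col_means = df.mean()                          # one mean per column
from_30th = df.loc['2010-12-30':]              # rows dated 30 December 2010 or later

print(col_means)
print(from_30th)
```

`df.mean()` averages down each column by default, so it returns one value for `price` and one for `weight`.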
Other Useful Statistics Libraries:
statsmodels - various statistical routines
scikit-learn - machine learning in Python
pyMC - for Bayesian data analysis
pystan - Bayesian analysis based on stan
Python has many libraries for studying graphs. One well-known example is NetworkX.
Here’s some example code that generates and plots a random graph, with node color determined by shortest path length from a central node.
In [11]:
import networkx as nx
import matplotlib.pyplot as plt
import numpy as np
G = nx.random_geometric_graph(200,0.12) # Generate random graph
pos = nx.get_node_attributes(G, 'pos') # Get positions of nodes
# Find node nearest the center point (0.5, 0.5)
dists = [(x - 0.5)**2 + (y - 0.5)**2 for x,y in list(pos.values())]
ncenter = np.argmin(dists)
In [12]:
# Plot graph, coloring by path length from central node
p = nx.single_source_shortest_path_length(G, ncenter)
plt.figure()
nx.draw_networkx_edges(G, pos, alpha=0.4)
nx.draw_networkx_nodes(G, pos, nodelist=list(p.keys()), node_size=120, alpha=0.5,
node_color=list(p.values()), cmap=plt.cm.jet_r)
plt.show()
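Because the graph above is random, the path lengths differ from run to run. To see what `single_source_shortest_path_length` returns, here is a small deterministic sketch on a five-node path graph (a stand-in chosen for illustration, not the graph used above).

```python
import networkx as nx

G = nx.path_graph(5)     # a chain of nodes: 0 - 1 - 2 - 3 - 4

# Number of edges on the shortest path from node 2 to every reachable node
p = dict(nx.single_source_shortest_path_length(G, 2))

print(sorted(p.items()))  # [(0, 2), (1, 1), (2, 0), (3, 1), (4, 2)]
```

In the plotting code above, the values of this mapping are what get passed as `node_color`, so nodes at the same distance from the center share a color.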
Running your Python code on massive servers in the cloud is becoming easier and easier.
A nice example is Wakari.