About Python

  • Python is a general purpose programming language
  • Python is free and open source
  • Python is supported by a vast collection of standard and external software libraries
  • Python is one of the most popular programming languages
  • Python is used in almost every field, including:
    • Game programming
    • CGI and GUI programming
    • Web development
    • Scientific programming
    • Data Science
    • Machine Learning
    • Communications
  • Python is used extensively by Internet service and high tech companies:
    • Google
    • Dropbox
    • Reddit
    • YouTube
    • Walt Disney Animation
  • Python is a high-level language suitable for rapid development
  • Python is supported by many community-built libraries
  • Python is a multiparadigm language (procedural, object-oriented, functional programming)
  • Python is an interpreted language rather than a compiled one
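
To make the multiparadigm point concrete, here is a minimal sketch that computes the same sum of squares in procedural, object-oriented, and functional style. The variable names and the `SquareSummer` class are hypothetical and chosen only for illustration.

```python
from functools import reduce

numbers = [1, 2, 3, 4]

# Procedural style: an explicit loop that updates a running total
total_procedural = 0
for n in numbers:
    total_procedural += n * n

# Object-oriented style: data and behaviour bundled in a class
class SquareSummer:  # hypothetical class, for illustration only
    def __init__(self, values):
        self.values = values

    def total(self):
        return sum(v * v for v in self.values)

total_oop = SquareSummer(numbers).total()

# Functional style: combine map and reduce without mutating any state
squares = map(lambda n: n * n, numbers)
total_functional = reduce(lambda acc, sq: acc + sq, squares, 0)

print(total_procedural, total_oop, total_functional)  # prints: 30 30 30
```

All three approaches give the same result; which one reads best usually depends on the size and shape of the program.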

Scientific Python

  • Python has become one of the core languages of scientific computing.

Numerical Programming: Fundamental matrix and array processing capabilities are provided by the excellent NumPy library. NumPy provides the basic array data type plus some simple processing operations.


In [1]:
import numpy as np # load the library
a = np.linspace(-np.pi, np.pi, 100) # Create array 
b = np.cos(a) # Apply cosine to each element of a
c = np.ones(25) # An array of 25 ones
np.dot(c,c) # Compute inner product


Out[1]:
25.0

The SciPy library is built on top of NumPy and provides additional functionality.


In [2]:
from scipy.stats import norm
from scipy.integrate import quad
phi = norm()
value, error = quad(phi.pdf, -2,2) # Integrate using Gaussian quadrature
value


Out[2]:
0.9544997361036417

SciPy includes many of the standard routines used in:

  • linear algebra
  • integration
  • interpolation
  • optimization
  • distributions and random number generation
  • signal processing
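
As a rough illustration of a few of these routines, the sketch below touches linear algebra, optimization, and interpolation. The matrices, the objective function, and the sample points are made-up values, not taken from any particular application.

```python
import numpy as np
from scipy import interpolate, linalg, optimize

# Linear algebra: solve the 2x2 system A x = b
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([9.0, 8.0])
x = linalg.solve(A, b)

# Optimization: minimize f(t) = (t - 2)^2, whose minimum lies at t = 2
result = optimize.minimize_scalar(lambda t: (t - 2.0) ** 2)

# Interpolation: piecewise-linear interpolation between tabulated points
f = interpolate.interp1d([0.0, 1.0, 2.0], [0.0, 10.0, 20.0])
midpoint = float(f(0.5))

print(x, result.x, midpoint)
```

Each submodule (`scipy.linalg`, `scipy.optimize`, `scipy.interpolate`) offers many more routines than the single call shown here.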

The most popular and comprehensive Python library for creating figures and graphs is Matplotlib.

  • Plots, histograms, contour images, 3D, bar charts, etc.
  • Output in many formats (PDF, PNG, EPS, etc.)
  • LaTeX integration
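
A minimal sketch of these features, assuming a non-interactive environment (hence the `Agg` backend) and using hypothetical output file names: it draws a cosine curve, labels it with LaTeX-style math text, and saves the figure in two formats.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is required
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(-np.pi, np.pi, 100)

fig, ax = plt.subplots()
ax.plot(x, np.cos(x), label=r"$\cos(x)$")  # label written in LaTeX math syntax
ax.set_xlabel("x")
ax.legend()

# The file extension selects the output format
for name in ("cosine.png", "cosine.pdf"):  # hypothetical file names
    fig.savefig(name)

plt.close(fig)
```

In a notebook, the backend line can be dropped and `plt.show()` used to display the figure inline.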

More example plots can be found in the Matplotlib gallery.

Other graphics libraries include:

  • Plotly
  • Bokeh
  • VPython -- 3D graphics and animations

Symbolic Algebra: It's useful to be able to manipulate symbolic expressions, as in Mathematica or Maple. The SymPy library provides this functionality from within the Python shell.


In [3]:
from sympy import Symbol
x, y = Symbol('x'), Symbol('y') # Treat 'x' and 'y' as algebraic symbols
print(x + x + x + y)


3*x + y

In [4]:
expression = (x+y)**2
expression.expand()


Out[4]:
x**2 + 2*x*y + y**2

In [5]:
from sympy import solve
# Solving polynomials
solve(x**2 + x + 2)


Out[5]:
[-1/2 - sqrt(7)*I/2, -1/2 + sqrt(7)*I/2]

In [6]:
from sympy import limit, sin, diff
limit(1 / x, x, 0)


Out[6]:
oo

In [7]:
limit(sin(x)/x , x, 0)


Out[7]:
1

In [8]:
diff(sin(x),x)


Out[8]:
cos(x)

Python’s data manipulation and statistics libraries have improved rapidly over the last few years. Pandas is one of the most popular libraries for working with data. Pandas is fast, efficient, and well designed.


In [9]:
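import numpy as np
import pandas as pd
# np.random.randn replaces sp.randn, which is no longer available in SciPy
data = np.random.randn(5, 2) # Create 5x2 matrix of random numbers
dates = pd.date_range('2010-12-28', periods=5) # Five consecutive days (ISO date format)
df = pd.DataFrame(data, columns=['price', 'weight'], index=dates)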
print(df)


               price    weight
2010-12-28 -1.701924  0.935324
2010-12-29 -1.028632 -2.053585
2010-12-30 -0.697502  0.331397
2010-12-31  0.250651 -0.246060
2011-01-01  0.496469 -1.150075

In [10]:
df.mean()


Out[10]:
price    -0.536187
weight   -0.436600
dtype: float64
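
Beyond column means, a DataFrame supports summary statistics and label-based or boolean selection. The sketch below uses fixed, made-up values rather than random ones so the results are reproducible; the variable names are illustrative only.

```python
import pandas as pd

dates = pd.date_range("2010-12-28", periods=5)
df = pd.DataFrame(
    {
        "price": [1.0, 2.0, 3.0, 4.0, 5.0],       # made-up values
        "weight": [10.0, 20.0, 30.0, 40.0, 50.0],  # made-up values
    },
    index=dates,
)

summary = df.describe()        # count, mean, std, min, quartiles, max per column
december = df.loc["2010-12"]   # partial-string indexing: rows dated December 2010
expensive = df[df["price"] > 2.5]  # boolean mask: rows with price above 2.5

print(summary.loc["mean"])
print(len(december), len(expensive))
```

The date index makes time-based slicing straightforward, which is one reason pandas is popular for time series work.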

Other Useful Statistics Libraries:

  • statsmodels - various statistical routines
  • scikit-learn - machine learning in Python
  • PyMC - for Bayesian data analysis
  • PyStan - Bayesian analysis based on Stan

Python has many libraries for studying graphs. One well-known example is NetworkX.

Here’s some example code that generates and plots a random graph, with node color determined by shortest path length from a central node:


In [11]:
import networkx as nx
import matplotlib.pyplot as plt
import numpy as np

G = nx.random_geometric_graph(200,0.12) # Generate random graph
pos = nx.get_node_attributes(G, 'pos') # Get positions of nodes
# Find node nearest the center point (0.5, 0.5)
dists = [(x - 0.5)**2 + (y - 0.5)**2 for x,y in list(pos.values())]
ncenter = np.argmin(dists)



In [12]:
# Plot graph, coloring by path length from central node
p = nx.single_source_shortest_path_length(G, ncenter)
plt.figure()
nx.draw_networkx_edges(G, pos, alpha=0.4)
nx.draw_networkx_nodes(G, pos, nodelist=list(p.keys()), node_size=120, alpha=0.5,
                          node_color=list(p.values()), cmap=plt.cm.jet_r)

plt.show()


Running your Python code on massive servers in the cloud is becoming easier and easier.

A nice example is Wakari

