Overview of Scientific Packages for Python

  • A package can be seen as a container of variables and functions provided by others to help us accomplish our tasks.

Import Packages into Our Environment

  • To use a package, the first step is to import it into our environment. There are several ways of doing so in Python.

In [ ]:
import math

In [ ]:
math.factorial(5)  # a function in the math package

In [ ]:
math.e  # a variable in the math package
  • To shorten the name we have to type, we can give the package an alias and refer to it by that name

In [ ]:
import math as m

In [ ]:
m.factorial(5)

In [ ]:
m.e
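
  • As a quick check of what the alias does, the sketch below imports the package under both names and compares them. The variable names same_module and same_value are only illustrative.

```python
import math
import math as m

# Both names are bound to the same module object; no second copy is loaded.
same_module = m is math
same_value = m.e == math.e

print(same_module, same_value)
print(m.factorial(5) == math.factorial(5))
```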
  • Import only selected variables or functions into the environment
  • Then we can use those variables or functions directly, without the package prefix

In [ ]:
from math import factorial, e

In [ ]:
factorial(5)

In [ ]:
e
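
  • The names imported this way are not copies. A minimal sketch, assuming nothing beyond the standard math package, shows that they refer to the same objects that live inside the package.

```python
import math
from math import factorial, e

# The directly imported names point at the very same objects as math.<name>.
same_function = factorial is math.factorial
same_constant = e == math.e

print(same_function, same_constant)
print(factorial(5))  # 5! = 120
```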
  • Import all the variables and functions into the environment at once. This is convenient, but it can silently overwrite names we already use.

In [ ]:
from math import *

In [ ]:
factorial(10)

In [ ]:
e
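
  • To see what a wildcard import would bring in, we can list the names ourselves. This is a rough sketch: Python uses a module's __all__ list if one is defined, and otherwise every name that does not start with an underscore. The set our_names is a hypothetical example of names we might already be using.

```python
import math

# Names that `from math import *` would copy into our environment.
wildcard_names = getattr(math, "__all__", None)
if wildcard_names is None:
    wildcard_names = [name for name in dir(math) if not name.startswith("_")]

print(len(wildcard_names))
print("factorial" in wildcard_names, "e" in wildcard_names)

# Hypothetical names of our own; any overlap would be overwritten by the import.
our_names = {"e", "radius", "gamma"}
clashes = sorted(our_names & set(wildcard_names))
print(clashes)
```

  • Because of such clashes, importing the package itself or only the names we need is usually the safer choice.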

List of Packages for Scientific Computations and Data Analysis

  • NumPy, short for Numerical Python, is the foundational package for scientific computing in Python. It provides
    • A fast and efficient multidimensional array object, ndarray
    • Functions for performing element-wise computation with arrays and mathematical operations between arrays
    • Linear algebra operations, Fourier transform, and random number generation
  • matplotlib is a library for producing plots and other 2D data visualizations. It is well suited to creating publication-quality plots.
  • SciPy is a collection of packages addressing a number of standard problem domains in scientific computing. Here are some subpackages you may use:
    • scipy.integrate: numerical integration routines and differential equation solvers
    • scipy.linalg: linear algebra routines and matrix decompositions
    • scipy.optimize: function optimizers and root finding algorithms
    • scipy.stats: standard probability distributions (CDFs, PDFs, samplers, etc.)
  • pandas provides rich operations and manipulations that make working with structured data fast, easy, and expressive. It is a critical ingredient in making Python a powerful and productive data analysis environment.
  • scikit-learn is a machine learning library built on NumPy, SciPy, and matplotlib. It contains many efficient tools for machine learning and statistical modeling, including classification, regression, and clustering.
  • statsmodels is a package for statistical modeling that allows users to explore data, estimate statistical models, and perform statistical tests.
  • Blaze extends the capabilities of NumPy and pandas to distributed and streaming datasets.
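
  • Before using these packages, it can help to check which of them are installed in our environment. The sketch below uses only the standard library. The PACKAGES mapping and the helper is_installed are illustrative names, and the import names are the usual ones for each project (scikit-learn, for instance, is imported as sklearn).

```python
import importlib.util

# Display name -> top-level import name (an illustrative mapping).
PACKAGES = {
    "NumPy": "numpy",
    "matplotlib": "matplotlib",
    "SciPy": "scipy",
    "pandas": "pandas",
    "scikit-learn": "sklearn",
    "statsmodels": "statsmodels",
    "Blaze": "blaze",
}


def is_installed(import_name):
    """Return True if the top-level package can be located, without importing it."""
    return importlib.util.find_spec(import_name) is not None


availability = {display: is_installed(name) for display, name in PACKAGES.items()}

for display, found in availability.items():
    print(f"{display:<14}{'installed' if found else 'missing'}")
```

  • Any package reported as missing has to be installed with a package manager such as pip or conda before it can be imported.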