NumPy and Pandas

NumPy

NumPy is the fundamental package for scientific computing with Python.

http://www.numpy.org/

It contains among other things:

a powerful N-dimensional array object
useful linear algebra, Fourier transform, and random number capabilities
tools for integrating C/C++ and Fortran code
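As a quick illustration of the capabilities listed above, the following sketch touches the array object, linear algebra, the Fourier transform, and random numbers in turn. The array values and the seed are arbitrary choices made for this example and are not part of the original notebook.

```python
import numpy

# N-dimensional array: a 2x2 matrix (values chosen arbitrarily for illustration).
m = numpy.array([[2.0, 0.0], [0.0, 4.0]])

# Linear algebra: solve m @ x = b for x.
b = numpy.array([2.0, 8.0])
x = numpy.linalg.solve(m, b)

# Fourier transform: the FFT of a constant signal has all its energy in bin 0.
spectrum = numpy.fft.fft(numpy.ones(4))

# Random numbers: a seeded generator gives reproducible draws in [0, 1).
rng = numpy.random.default_rng(0)
samples = rng.random(3)

print(x)
print(spectrum.real)
print(samples.shape)
```

The seeded generator is only there to make the draws repeatable; the rest of this notebook uses the unseeded `numpy.random.random`.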



In [ ]:

    
!conda install --yes -c conda-forge numpy pandas nomkl seaborn ipywidgets jupyter









    



Using Anaconda Cloud api site https://api.anaconda.org
Fetching package metadata ...



In [1]:

    
import numpy



In [2]:

    
numpy.ones((2, 3))









    Out[2]:





array([[ 1.,  1.,  1.],
       [ 1.,  1.,  1.]])



In [3]:

    
a = numpy.array([[1,2,3], [4,5,6], [7,8,9], [10, 11, 12]])
a









    Out[3]:





array([[ 1,  2,  3],
       [ 4,  5,  6],
       [ 7,  8,  9],
       [10, 11, 12]])



In [4]:

    
a.shape









    Out[4]:





(4, 3)



In [5]:

    
a.ndim









    Out[5]:





2



In [6]:

    
a.size









    Out[6]:





12



In [7]:

    
a - numpy.random.random(a.shape)









    Out[7]:





array([[  0.37772541,   1.4749529 ,   2.40223093],
       [  3.69957528,   4.84057082,   5.01177475],
       [  6.20454538,   7.2169713 ,   8.82870779],
       [  9.38959571,  10.0096205 ,  11.93880162]])



In [8]:

    
a.ravel()









    Out[8]:





array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12])



In [9]:

    
a









    Out[9]:





array([[ 1,  2,  3],
       [ 4,  5,  6],
       [ 7,  8,  9],
       [10, 11, 12]])



In [10]:

    
a[1:-1]









    Out[10]:





array([[4, 5, 6],
       [7, 8, 9]])



In [11]:

    
'>qwweqwe<'[1:-1]









    Out[11]:





'qwweqwe'



In [12]:

    
a









    Out[12]:





array([[ 1,  2,  3],
       [ 4,  5,  6],
       [ 7,  8,  9],
       [10, 11, 12]])



In [13]:

    
a[:,1]









    Out[13]:





array([ 2,  5,  8, 11])



In [14]:

    
a[a % 2 == 0] = -1



In [15]:

    
a









    Out[15]:





array([[ 1, -1,  3],
       [-1,  5, -1],
       [ 7, -1,  9],
       [-1, 11, -1]])
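The assignment in cell In [14] relies on a boolean mask: `a % 2 == 0` produces an array of True/False values with the same shape as `a`, and indexing with that array selects only the True positions. A minimal sketch of the same idea follows. The use of `numpy.where` as a non-mutating alternative is an addition for illustration and is not something the notebook itself does.

```python
import numpy

a = numpy.array([[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]])

# The comparison yields a boolean array with the same shape as `a`.
mask = a % 2 == 0

# Indexing with the mask returns the selected elements as a 1-D array.
evens = a[mask]

# numpy.where builds a new array and leaves `a` untouched.
replaced = numpy.where(mask, -1, a)

print(mask.shape)
print(evens)
print(replaced)
```

Unlike `a[a % 2 == 0] = -1`, which modifies `a` in place, the `numpy.where` form keeps the original array available for later cells.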

Pandas

Pandas is an open source, BSD-licensed library providing high-performance, easy-to-use data structures and data analysis tools for the Python programming language.

http://pandas.pydata.org/

It features:

A fast and efficient DataFrame object for data manipulation with integrated indexing;
Tools for reading and writing data between in-memory data structures and different formats: CSV and text files, Microsoft Excel, SQL databases, and the fast HDF5 format;
Intelligent label-based slicing, fancy indexing, and subsetting of large data sets;
Aggregating or transforming data with a powerful group by engine allowing split-apply-combine operations on data sets.
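Two of the features above, label-based selection and split-apply-combine, can be sketched on a toy table. The column names `group` and `value` and all of the data are hypothetical, invented only for this example.

```python
import pandas

# Hypothetical toy data, invented for this example only.
df = pandas.DataFrame({
    'group': ['a', 'b', 'a', 'b'],
    'value': [1.0, 2.0, 3.0, 4.0],
})

# Label-based selection: rows where the condition holds, one named column.
high = df.loc[df['value'] > 1.5, 'value']

# Split-apply-combine: split by 'group', apply mean, combine into a Series.
means = df.groupby('group')['value'].mean()

print(high.tolist())
print(means.to_dict())
```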



In [16]:

    
import pandas



In [17]:

    
boston_dataset = pandas.read_csv("../static/Boston.csv")



In [18]:

    
boston_dataset[:10]



In [19]:

    
boston_dataset[boston_dataset['MV'] < 7]



In [20]:

    
boston_dataset['TARGET'] = boston_dataset['MV'].astype(int)
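Note that `astype(int)` drops the fractional part rather than rounding, which is why an MV of 21.6 becomes a TARGET of 21. The sketch below uses three MV values from the table output and compares truncation with rounding. The rounding variant is shown only for contrast and is not what the notebook does.

```python
import pandas

# Three MV values taken from the first rows of the dataset.
mv = pandas.Series([21.6, 34.7, 18.9])

# astype(int) truncates toward zero.
truncated = mv.astype(int)

# Rounding first gives the nearest integer instead.
rounded = mv.round().astype(int)

print(truncated.tolist())
print(rounded.tolist())
```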



In [21]:

    
boston_dataset[:10]



In [22]:

    
%pylab inline

import seaborn
seaborn.set_context('talk')









    



Populating the interactive namespace from numpy and matplotlib

Plot a histogram with 50 bins



In [23]:

    
boston_dataset['MV'].hist(bins=50);

Jupyter Notebook interactive features



In [24]:

    
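def plot_by(dataset, column='MV', bins_count=10):
    # Use the dataset passed in rather than the global boston_dataset.
    dataset[column].hist(bins=bins_count)

    # Plot settings (pyplot is provided by the %pylab inline call above).
    pyplot.title('%s Values' % column)
    pyplot.ylabel('N')

from ipywidgets import interact, fixed

# fixed() passes the dataset through without creating a widget; the list of
# column names becomes a dropdown and the (min, max) tuple an integer slider.
interact(
    plot_by,
    dataset=fixed(boston_dataset),
    column=boston_dataset.columns.tolist(),
    bins_count=(5,50)
);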



Output of In [18]:

	CRIM	ZN	INDUS	NOX	RM	AGE	DIS	RAD	TAX	PT	B	LSTAT	MV
0	0.00632	18.0	2.31	0.538	6.575	65.2	4.0900	1	296	15.3	396.90	4.98	24.0
1	0.02731	0.0	7.07	0.469	6.421	78.9	4.9671	2	242	17.8	396.90	9.14	21.6
2	0.02729	0.0	7.07	0.469	7.185	61.1	4.9671	2	242	17.8	392.83	4.03	34.7
3	0.03237	0.0	2.18	0.458	6.998	45.8	6.0622	3	222	18.7	394.63	2.94	33.4
4	0.06905	0.0	2.18	0.458	7.147	54.2	6.0622	3	222	18.7	396.90	5.33	36.2
5	0.02985	0.0	2.18	0.458	6.430	58.7	6.0622	3	222	18.7	394.12	5.21	28.7
6	0.08829	12.5	7.87	0.524	6.012	66.6	5.5605	5	311	15.2	395.60	12.43	22.9
7	0.14455	12.5	7.87	0.524	6.172	96.1	5.9505	5	311	15.2	396.90	19.15	27.1
8	0.21124	12.5	7.87	0.524	5.631	100.0	6.0821	5	311	15.2	386.63	29.93	16.5
9	0.17004	12.5	7.87	0.524	6.004	85.9	6.5921	5	311	15.2	386.71	17.10	18.9

Output of In [19]:

	CRIM	INDUS	NOX	RM	AGE	DIS	RAD	TAX	PT	B	LSTAT	MV
398	38.35180	18.1	0.693	5.453	100.0	1.4896	24	666	20.2	396.90	30.59	5.0
399	9.91655	18.1	0.693	5.852	77.8	1.5004	24	666	20.2	338.16	29.97	6.3
400	25.04610	18.1	0.693	5.987	100.0	1.5888	24	666	20.2	396.90	26.77	5.6
405	67.92080	18.1	0.693	5.683	100.0	1.4254	24	666	20.2	384.97	22.98	5.0

Output of In [21]:

	CRIM	ZN	INDUS	NOX	RM	AGE	DIS	RAD	TAX	PT	B	LSTAT	MV	TARGET
0	0.00632	18.0	2.31	0.538	6.575	65.2	4.0900	1	296	15.3	396.90	4.98	24.0	24
1	0.02731	0.0	7.07	0.469	6.421	78.9	4.9671	2	242	17.8	396.90	9.14	21.6	21
2	0.02729	0.0	7.07	0.469	7.185	61.1	4.9671	2	242	17.8	392.83	4.03	34.7	34
3	0.03237	0.0	2.18	0.458	6.998	45.8	6.0622	3	222	18.7	394.63	2.94	33.4	33
4	0.06905	0.0	2.18	0.458	7.147	54.2	6.0622	3	222	18.7	396.90	5.33	36.2	36
5	0.02985	0.0	2.18	0.458	6.430	58.7	6.0622	3	222	18.7	394.12	5.21	28.7	28
6	0.08829	12.5	7.87	0.524	6.012	66.6	5.5605	5	311	15.2	395.60	12.43	22.9	22
7	0.14455	12.5	7.87	0.524	6.172	96.1	5.9505	5	311	15.2	396.90	19.15	27.1	27
8	0.21124	12.5	7.87	0.524	5.631	100.0	6.0821	5	311	15.2	386.63	29.93	16.5	16
9	0.17004	12.5	7.87	0.524	6.004	85.9	6.5921	5	311	15.2	386.71	17.10	18.9	18