NumPy

NumPy is a Linear Algebra Library for Python.

NumPy’s main object is the homogeneous multidimensional array. It is a table of elements (usually numbers), all of the same type, indexed by a tuple of positive integers. In NumPy dimensions are called axes. The number of axes is rank. For example, the coordinates of a point in 3D space [1, 2, 1] is an array of rank 1, because it has one axis. That axis has a length of 3. In the example pictured below, the array has rank 2 (it is 2-dimensional).

Numpy is also incredibly fast, as it has bindings to C libraries.

For easy installing Numpy:

sudo pip3 install numpy

NumPy array



In [1]:

    
import numpy as np 

a = [1,2,3]

a









    Out[1]:





[1, 2, 3]



In [2]:

    
b = np.array(a)
b









    Out[2]:





array([1, 2, 3])



In [3]:

    
np.arange(1, 10)









    Out[3]:





array([1, 2, 3, 4, 5, 6, 7, 8, 9])



In [4]:

    
np.arange(1, 10, 2)









    Out[4]:





array([1, 3, 5, 7, 9])

zeros , ones and eye

np.zeros

Return a new array of given shape and type, filled with zeros.



In [5]:

    
np.zeros(2, dtype=float)









    Out[5]:





array([ 0.,  0.])



In [5]:

    
np.zeros((2,3))









    Out[5]:





array([[ 0.,  0.,  0.],
       [ 0.,  0.,  0.]])

ones

Return a new array of given shape and type, filled with ones.



In [7]:

    
np.ones(3, )









    Out[7]:





array([ 1.,  1.,  1.])

eye

Return a 2-D array with ones on the diagonal and zeros elsewhere.



In [8]:

    
np.eye(3)









    Out[8]:





array([[ 1.,  0.,  0.],
       [ 0.,  1.,  0.],
       [ 0.,  0.,  1.]])

linspace

Returns num evenly spaced samples, calculated over the interval [start, stop].



In [9]:

    
np.linspace(1, 11, 3)









    Out[9]:





array([  1.,   6.,  11.])

Random number and matrix

rand

Random values in a given shape.



In [10]:

    
np.random.rand(2)









    Out[10]:





array([ 0.11632384,  0.36169799])



In [11]:

    
np.random.rand(2,3,4)









    Out[11]:





array([[[ 0.57504205,  0.77035938,  0.48356283,  0.43705212],
        [ 0.43280501,  0.28349802,  0.71859913,  0.70023941],
        [ 0.23944041,  0.38088474,  0.45425597,  0.1247087 ]],

       [[ 0.47605776,  0.02899426,  0.07707044,  0.72656565],
        [ 0.73339753,  0.87517107,  0.73552799,  0.2346255 ],
        [ 0.68990338,  0.57983998,  0.37863682,  0.03533712]]])

randn

Return a sample (or samples) from the "standard normal" distribution.

andom.standard_normal Similar, but takes a tuple as its argument.



In [12]:

    
np.random.randn(2,3)









    Out[12]:





array([[-1.6991798 , -0.61355368,  0.49392586],
       [ 0.89563615,  1.42702856,  0.97350729]])

random

Return random floats in the half-open interval [0.0, 1.0).



In [13]:

    
np.random.random()









    Out[13]:





0.6852553611099047

randint

Return n random integers (by default one integer) from low (inclusive) to high (exclusive).



In [14]:

    
np.random.randint(1,50,10)









    Out[14]:





array([12, 31, 31, 12,  2, 46, 24, 11, 47,  3])



In [15]:

    
np.random.randint(1,40)









    Out[15]:





24

Shape and Reshape

shape return the shape of data and reshape returns an array containing the same data with a new shape



In [16]:

    
zero = np.zeros([3,4])
print(zero , '   ' ,'shape of a :' , zero.shape)
zero = zero.reshape([2,6])
print()
print(zero)









    



[[ 0.  0.  0.  0.]
 [ 0.  0.  0.  0.]
 [ 0.  0.  0.  0.]]     shape of a : (3, 4)

[[ 0.  0.  0.  0.  0.  0.]
 [ 0.  0.  0.  0.  0.  0.]]

Basic Operation

Element wise product and matrix product



In [17]:

    
number = np.array([[1,2,],
                   [3,4]])
number2 = np.array([[1,3],[2,1]])

print('element wise product :\n',number * number2 )
print('matrix product :\n',number.dot(number2))     ## also can use : np.dot(number, number2)









    



element wise product :
 [[1 6]
 [6 4]]
matrix product :
 [[ 5  5]
 [11 13]]

min max argmin argmax mean



In [18]:

    
numbers = np.random.randint(1,100, 10)
print(numbers)
print('max is :', numbers.max())
print('index of max :', numbers.argmax())
print('min is :', numbers.min())
print('index of min :', numbers.argmin())
print('mean :', numbers.mean())









    



[25 46 17 37 90 17 36 99 68 56]
max is : 99
index of max : 7
min is : 17
index of min : 2
mean : 49.1

Universal function

numpy also has some funtion for mathmatical operation like exp, log, sqrt, abs and etc .

for find more function click here



In [19]:

    
number = np.arange(1,10).reshape(3,3)
print(number)
print()
print('exp:\n', np.exp(number))
print()
print('sqrt:\n',np.sqrt(number))









    



[[1 2 3]
 [4 5 6]
 [7 8 9]]

exp:
 [[  2.71828183e+00   7.38905610e+00   2.00855369e+01]
 [  5.45981500e+01   1.48413159e+02   4.03428793e+02]
 [  1.09663316e+03   2.98095799e+03   8.10308393e+03]]

sqrt:
 [[ 1.          1.41421356  1.73205081]
 [ 2.          2.23606798  2.44948974]
 [ 2.64575131  2.82842712  3.        ]]

dtype



In [20]:

    
numbers.dtype









    Out[20]:





dtype('int64')

No copy & Shallow copy & Deep copy

### No copy ###### Simple assignments make no copy of array objects or of their data.



In [21]:

    
number = np.arange(0,20)
number2 = number 
print (number is number2 , id(number), id(number2))
print(number)
number2.shape = (4,5)
print(number)









    



True 139671397699872 139671397699872
[ 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19]
[[ 0  1  2  3  4]
 [ 5  6  7  8  9]
 [10 11 12 13 14]
 [15 16 17 18 19]]

### Shallow copy

Different array objects can share the same data. The view method creates a new array object that looks at the same data.



In [22]:

    
number = np.arange(0,20)
number2 = number.view()
print (number is number2 , id(number), id(number2))









    



False 139671397702032 139671397702432



In [23]:

    
number2.shape = (5,4)
print('number2 shape:', number2.shape,'\nnumber shape:', number.shape)









    



number2 shape: (5, 4) 
number shape: (20,)



In [24]:

    
print('befor:', number)
number2[0][0] = 2222
print()
print('after:', number)









    



befor: [ 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19]

after: [2222    1    2    3    4    5    6    7    8    9   10   11   12   13   14
   15   16   17   18   19]

### Deep copy

The copy method makes a complete copy of the array and its data.



In [25]:

    
number = np.arange(0,20)
number2 = number.copy()
print (number is number2 , id(number), id(number2))









    



False 139671397701872 139671397732560



In [26]:

    
print('befor:', number)
number2[0] = 10
print()
print('after:', number)
print()
print('number2:',number2)









    



befor: [ 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19]

after: [ 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19]

number2: [10  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19]

Broadcating

One of important concept to understand numpy is Broadcasting

It's very useful for performancing mathmaica operation beetween arrays of different shape.



In [27]:

    
number = np.arange(1,11)
num = 2 
print(' number =', number)
print('\n number .* num =',number * num)









    



 number = [ 1  2  3  4  5  6  7  8  9 10]

 number .* num = [ 2  4  6  8 10 12 14 16 18 20]



In [28]:

    
number = np.arange(1,10).reshape(3,3)
number2 = np.arange(1,4).reshape(1,3)
number * number2









    Out[28]:





array([[ 1,  4,  9],
       [ 4, 10, 18],
       [ 7, 16, 27]])



In [29]:

    
number = np.array([1,2,3])
print('number =', number)
print('\nnumber =', number + 100)









    



number = [1 2 3]

number = [101 102 103]



In [30]:

    
number = np.arange(1,10).reshape(3,3)
number2 = np.arange(1,4)
print('number: \n', number)
add = number + number2 
print()
print('number2: \n ', number2)
print()
print('add: \n', add)









    



number: 
 [[1 2 3]
 [4 5 6]
 [7 8 9]]

number2: 
  [1 2 3]

add: 
 [[ 2  4  6]
 [ 5  7  9]
 [ 8 10 12]]

If you still doubt Why we use Python and NumPy see it. 😉



In [31]:

    
from time import time
a = np.random.rand(8000000, 1)
c = 0
tic = time()
for i in range(len(a)):
    c +=(a[i][0] * a[i][0])
          
print ('output1:', c)
tak = time()

print('multiply 2 matrix with loop: ', tak - tic)

tic = time()
print('output2:', np.dot(a.T, a))
tak = time()


print('multiply 2 matrix with numpy func: ', tak - tic)









    



output1: 2665834.57759
multiply 2 matrix with loop:  4.754087448120117
output2: [[ 2665834.57759168]]
multiply 2 matrix with numpy func:  0.004221677780151367

I tried to write essential things in numpy that you can start to code and enjoy it but there are many function that i don't write in this book if you neet more informatino click here

Pandas

pandas is an open source library providing high-performance, easy-to-use data structures and data analysis tools for the Python programming language.

For easy installing Pandas

sudo pip3 install pandas



In [6]:

    
import pandas as pd

Series



In [33]:

    
labels = ['a','b','c']
my_list = [10,20,30]
arr = np.array([10,20,30])
d = {'a':10,'b':20,'c':30}



In [34]:

    
pd.Series(data=my_list)









    Out[34]:





0    10
1    20
2    30
dtype: int64



In [35]:

    
pd.Series(data=my_list,index=labels)









    Out[35]:





a    10
b    20
c    30
dtype: int64



In [36]:

    
pd.Series(d)









    Out[36]:





a    10
b    20
c    30
dtype: int64

Dataframe

Two-dimensional size-mutable, potentially heterogeneous tabular data structure with labeled axes (rows and columns). Arithmetic operations align on both row and column labels. Can be thought of as a dict-like container for Series objects. The primary pandas data structure



In [7]:

    
dataframe = pd.DataFrame(np.random.randn(5,4),columns=['A','B','V','D'])



In [8]:

    
dataframe.head()

Selection



In [9]:

    
dataframe['A']









    Out[9]:





0   -1.194072
1   -0.030137
2    0.263154
3   -1.527798
4   -0.733531
Name: A, dtype: float64



In [10]:

    
dataframe[['A', 'D']]

creating new column



In [11]:

    
dataframe['E'] = dataframe['A'] + dataframe['B']



In [12]:

    
dataframe

removing a column



In [14]:

    
dataframe.drop('E', axis=1)



In [44]:

    
dataframe



In [45]:

    
dataframe.drop('E', axis=1, inplace=True)
dataframe

Selcting row



In [46]:

    
dataframe.loc[0]









    Out[46]:





A   -0.131864
B    0.478105
C    0.759782
D   -1.163273
Name: 0, dtype: float64



In [47]:

    
dataframe.iloc[0]









    Out[47]:





A   -0.131864
B    0.478105
C    0.759782
D   -1.163273
Name: 0, dtype: float64



In [48]:

    
dataframe.loc[0 , 'A']









    Out[48]:





-0.13186355473715136



In [49]:

    
dataframe.loc[[0,2],['A', 'C']]

Conditional Selection



In [50]:

    
dataframe > 0.3









    Out[50]:







  
    
      
      A
      B
      C
      D
    
  
  
    
      0
      False
      True
      True
      False
    
    
      1
      False
      True
      False
      True
    
    
      2
      False
      True
      True
      False
    
    
      3
      True
      True
      True
      True
    
    
      4
      False
      True
      False
      False



In [51]:

    
dataframe[dataframe > 0.3 ]



In [52]:

    
dataframe[dataframe['A']>0.3]



In [53]:

    
dataframe[dataframe['A']>0.3]['B']









    Out[53]:





3    0.393859
Name: B, dtype: float64



In [54]:

    
dataframe[(dataframe['A']>0.5) & (dataframe['C'] > 0)]

Multi-Index and Index Hierarchy



In [12]:

    
layer1 = ['g1','g1','g1','g2','g2','g2']
layer2 = [1,2,3,1,2,3]
hier_index = list(zip(layer1,layer2))
hier_index = pd.MultiIndex.from_tuples(hier_index)



In [13]:

    
hier_index









    Out[13]:





MultiIndex(levels=[['g1', 'g2'], [1, 2, 3]],
           labels=[[0, 0, 0, 1, 1, 1], [0, 1, 2, 0, 1, 2]])



In [14]:

    
dataframe2 = pd.DataFrame(np.random.randn(6,2),index=hier_index,columns=['A','B'])



In [15]:

    
dataframe2



In [58]:

    
dataframe2.loc['g1']



In [59]:

    
dataframe2.loc['g1'].loc[1]









    Out[59]:





A   -0.125270
B   -0.492899
Name: 1, dtype: float64

Input and output



In [60]:

    
titanic = pd.read_csv('Datasets/titanic.csv')



In [ ]:

    
pd.read



In [61]:

    
titanic.head()









    Out[61]:







  
    
      
      PassengerId
      Survived
      Pclass
      Name
      Sex
      Age
      SibSp
      Parch
      Ticket
      Fare
      Cabin
      Embarked
    
  
  
    
      0
      1
      0
      3
      Braund, Mr. Owen Harris
      male
      22.0
      1
      0
      A/5 21171
      7.2500
      NaN
      S
    
    
      1
      2
      1
      1
      Cumings, Mrs. John Bradley (Florence Briggs Th...
      female
      38.0
      1
      0
      PC 17599
      71.2833
      C85
      C
    
    
      2
      3
      1
      3
      Heikkinen, Miss. Laina
      female
      26.0
      0
      0
      STON/O2. 3101282
      7.9250
      NaN
      S
    
    
      3
      4
      1
      1
      Futrelle, Mrs. Jacques Heath (Lily May Peel)
      female
      35.0
      1
      0
      113803
      53.1000
      C123
      S
    
    
      4
      5
      0
      3
      Allen, Mr. William Henry
      male
      35.0
      0
      0
      373450
      8.0500
      NaN
      S



In [62]:

    
titanic.drop('Name', axis=1 , inplace = True)



In [63]:

    
titanic.head()









    Out[63]:







  
    
      
      PassengerId
      Survived
      Pclass
      Sex
      Age
      SibSp
      Parch
      Ticket
      Fare
      Cabin
      Embarked
    
  
  
    
      0
      1
      0
      3
      male
      22.0
      1
      0
      A/5 21171
      7.2500
      NaN
      S
    
    
      1
      2
      1
      1
      female
      38.0
      1
      0
      PC 17599
      71.2833
      C85
      C
    
    
      2
      3
      1
      3
      female
      26.0
      0
      0
      STON/O2. 3101282
      7.9250
      NaN
      S
    
    
      3
      4
      1
      1
      female
      35.0
      1
      0
      113803
      53.1000
      C123
      S
    
    
      4
      5
      0
      3
      male
      35.0
      0
      0
      373450
      8.0500
      NaN
      S



In [64]:

    
titanic.to_csv('Datasets/titanic_drop_names.csv')

csv is one of the most important format but Pandas compatible with many other format like html table , sql, json and etc.

Mising data (NaN)



In [65]:

    
titanic.head()









    Out[65]:







  
    
      
      PassengerId
      Survived
      Pclass
      Sex
      Age
      SibSp
      Parch
      Ticket
      Fare
      Cabin
      Embarked
    
  
  
    
      0
      1
      0
      3
      male
      22.0
      1
      0
      A/5 21171
      7.2500
      NaN
      S
    
    
      1
      2
      1
      1
      female
      38.0
      1
      0
      PC 17599
      71.2833
      C85
      C
    
    
      2
      3
      1
      3
      female
      26.0
      0
      0
      STON/O2. 3101282
      7.9250
      NaN
      S
    
    
      3
      4
      1
      1
      female
      35.0
      1
      0
      113803
      53.1000
      C123
      S
    
    
      4
      5
      0
      3
      male
      35.0
      0
      0
      373450
      8.0500
      NaN
      S



In [66]:

    
titanic.dropna()









    Out[66]:







  
    
      
      PassengerId
      Survived
      Pclass
      Sex
      Age
      SibSp
      Parch
      Ticket
      Fare
      Cabin
      Embarked
    
  
  
    
      1
      2
      1
      1
      female
      38.0
      1
      0
      PC 17599
      71.2833
      C85
      C
    
    
      3
      4
      1
      1
      female
      35.0
      1
      0
      113803
      53.1000
      C123
      S
    
    
      6
      7
      0
      1
      male
      54.0
      0
      0
      17463
      51.8625
      E46
      S
    
    
      10
      11
      1
      3
      female
      4.0
      1
      1
      PP 9549
      16.7000
      G6
      S
    
    
      11
      12
      1
      1
      female
      58.0
      0
      0
      113783
      26.5500
      C103
      S
    
    
      21
      22
      1
      2
      male
      34.0
      0
      0
      248698
      13.0000
      D56
      S
    
    
      23
      24
      1
      1
      male
      28.0
      0
      0
      113788
      35.5000
      A6
      S
    
    
      27
      28
      0
      1
      male
      19.0
      3
      2
      19950
      263.0000
      C23 C25 C27
      S
    
    
      52
      53
      1
      1
      female
      49.0
      1
      0
      PC 17572
      76.7292
      D33
      C
    
    
      54
      55
      0
      1
      male
      65.0
      0
      1
      113509
      61.9792
      B30
      C
    
    
      62
      63
      0
      1
      male
      45.0
      1
      0
      36973
      83.4750
      C83
      S
    
    
      66
      67
      1
      2
      female
      29.0
      0
      0
      C.A. 29395
      10.5000
      F33
      S
    
    
      75
      76
      0
      3
      male
      25.0
      0
      0
      348123
      7.6500
      F G73
      S
    
    
      88
      89
      1
      1
      female
      23.0
      3
      2
      19950
      263.0000
      C23 C25 C27
      S
    
    
      92
      93
      0
      1
      male
      46.0
      1
      0
      W.E.P. 5734
      61.1750
      E31
      S
    
    
      96
      97
      0
      1
      male
      71.0
      0
      0
      PC 17754
      34.6542
      A5
      C
    
    
      97
      98
      1
      1
      male
      23.0
      0
      1
      PC 17759
      63.3583
      D10 D12
      C
    
    
      102
      103
      0
      1
      male
      21.0
      0
      1
      35281
      77.2875
      D26
      S
    
    
      110
      111
      0
      1
      male
      47.0
      0
      0
      110465
      52.0000
      C110
      S
    
    
      118
      119
      0
      1
      male
      24.0
      0
      1
      PC 17558
      247.5208
      B58 B60
      C
    
    
      123
      124
      1
      2
      female
      32.5
      0
      0
      27267
      13.0000
      E101
      S
    
    
      124
      125
      0
      1
      male
      54.0
      0
      1
      35281
      77.2875
      D26
      S
    
    
      136
      137
      1
      1
      female
      19.0
      0
      2
      11752
      26.2833
      D47
      S
    
    
      137
      138
      0
      1
      male
      37.0
      1
      0
      113803
      53.1000
      C123
      S
    
    
      139
      140
      0
      1
      male
      24.0
      0
      0
      PC 17593
      79.2000
      B86
      C
    
    
      148
      149
      0
      2
      male
      36.5
      0
      2
      230080
      26.0000
      F2
      S
    
    
      151
      152
      1
      1
      female
      22.0
      1
      0
      113776
      66.6000
      C2
      S
    
    
      170
      171
      0
      1
      male
      61.0
      0
      0
      111240
      33.5000
      B19
      S
    
    
      174
      175
      0
      1
      male
      56.0
      0
      0
      17764
      30.6958
      A7
      C
    
    
      177
      178
      0
      1
      female
      50.0
      0
      0
      PC 17595
      28.7125
      C49
      C
    
    
      ...
      ...
      ...
      ...
      ...
      ...
      ...
      ...
      ...
      ...
      ...
      ...
    
    
      737
      738
      1
      1
      male
      35.0
      0
      0
      PC 17755
      512.3292
      B101
      C
    
    
      741
      742
      0
      1
      male
      36.0
      1
      0
      19877
      78.8500
      C46
      S
    
    
      742
      743
      1
      1
      female
      21.0
      2
      2
      PC 17608
      262.3750
      B57 B59 B63 B66
      C
    
    
      745
      746
      0
      1
      male
      70.0
      1
      1
      WE/P 5735
      71.0000
      B22
      S
    
    
      748
      749
      0
      1
      male
      19.0
      1
      0
      113773
      53.1000
      D30
      S
    
    
      751
      752
      1
      3
      male
      6.0
      0
      1
      392096
      12.4750
      E121
      S
    
    
      759
      760
      1
      1
      female
      33.0
      0
      0
      110152
      86.5000
      B77
      S
    
    
      763
      764
      1
      1
      female
      36.0
      1
      2
      113760
      120.0000
      B96 B98
      S
    
    
      765
      766
      1
      1
      female
      51.0
      1
      0
      13502
      77.9583
      D11
      S
    
    
      772
      773
      0
      2
      female
      57.0
      0
      0
      S.O./P.P. 3
      10.5000
      E77
      S
    
    
      779
      780
      1
      1
      female
      43.0
      0
      1
      24160
      211.3375
      B3
      S
    
    
      781
      782
      1
      1
      female
      17.0
      1
      0
      17474
      57.0000
      B20
      S
    
    
      782
      783
      0
      1
      male
      29.0
      0
      0
      113501
      30.0000
      D6
      S
    
    
      789
      790
      0
      1
      male
      46.0
      0
      0
      PC 17593
      79.2000
      B82 B84
      C
    
    
      796
      797
      1
      1
      female
      49.0
      0
      0
      17465
      25.9292
      D17
      S
    
    
      802
      803
      1
      1
      male
      11.0
      1
      2
      113760
      120.0000
      B96 B98
      S
    
    
      806
      807
      0
      1
      male
      39.0
      0
      0
      112050
      0.0000
      A36
      S
    
    
      809
      810
      1
      1
      female
      33.0
      1
      0
      113806
      53.1000
      E8
      S
    
    
      820
      821
      1
      1
      female
      52.0
      1
      1
      12749
      93.5000
      B69
      S
    
    
      823
      824
      1
      3
      female
      27.0
      0
      1
      392096
      12.4750
      E121
      S
    
    
      835
      836
      1
      1
      female
      39.0
      1
      1
      PC 17756
      83.1583
      E49
      C
    
    
      853
      854
      1
      1
      female
      16.0
      0
      1
      PC 17592
      39.4000
      D28
      S
    
    
      857
      858
      1
      1
      male
      51.0
      0
      0
      113055
      26.5500
      E17
      S
    
    
      862
      863
      1
      1
      female
      48.0
      0
      0
      17466
      25.9292
      D17
      S
    
    
      867
      868
      0
      1
      male
      31.0
      0
      0
      PC 17590
      50.4958
      A24
      S
    
    
      871
      872
      1
      1
      female
      47.0
      1
      1
      11751
      52.5542
      D35
      S
    
    
      872
      873
      0
      1
      male
      33.0
      0
      0
      695
      5.0000
      B51 B53 B55
      S
    
    
      879
      880
      1
      1
      female
      56.0
      0
      1
      11767
      83.1583
      C50
      C
    
    
      887
      888
      1
      1
      female
      19.0
      0
      0
      112053
      30.0000
      B42
      S
    
    
      889
      890
      1
      1
      male
      26.0
      0
      0
      111369
      30.0000
      C148
      C
    
  

183 rows × 11 columns



In [67]:

    
titanic.dropna(axis=1)









    Out[67]:







  
    
      
      PassengerId
      Survived
      Pclass
      Sex
      SibSp
      Parch
      Ticket
      Fare
    
  
  
    
      0
      1
      0
      3
      male
      1
      0
      A/5 21171
      7.2500
    
    
      1
      2
      1
      1
      female
      1
      0
      PC 17599
      71.2833
    
    
      2
      3
      1
      3
      female
      0
      0
      STON/O2. 3101282
      7.9250
    
    
      3
      4
      1
      1
      female
      1
      0
      113803
      53.1000
    
    
      4
      5
      0
      3
      male
      0
      0
      373450
      8.0500
    
    
      5
      6
      0
      3
      male
      0
      0
      330877
      8.4583
    
    
      6
      7
      0
      1
      male
      0
      0
      17463
      51.8625
    
    
      7
      8
      0
      3
      male
      3
      1
      349909
      21.0750
    
    
      8
      9
      1
      3
      female
      0
      2
      347742
      11.1333
    
    
      9
      10
      1
      2
      female
      1
      0
      237736
      30.0708
    
    
      10
      11
      1
      3
      female
      1
      1
      PP 9549
      16.7000
    
    
      11
      12
      1
      1
      female
      0
      0
      113783
      26.5500
    
    
      12
      13
      0
      3
      male
      0
      0
      A/5. 2151
      8.0500
    
    
      13
      14
      0
      3
      male
      1
      5
      347082
      31.2750
    
    
      14
      15
      0
      3
      female
      0
      0
      350406
      7.8542
    
    
      15
      16
      1
      2
      female
      0
      0
      248706
      16.0000
    
    
      16
      17
      0
      3
      male
      4
      1
      382652
      29.1250
    
    
      17
      18
      1
      2
      male
      0
      0
      244373
      13.0000
    
    
      18
      19
      0
      3
      female
      1
      0
      345763
      18.0000
    
    
      19
      20
      1
      3
      female
      0
      0
      2649
      7.2250
    
    
      20
      21
      0
      2
      male
      0
      0
      239865
      26.0000
    
    
      21
      22
      1
      2
      male
      0
      0
      248698
      13.0000
    
    
      22
      23
      1
      3
      female
      0
      0
      330923
      8.0292
    
    
      23
      24
      1
      1
      male
      0
      0
      113788
      35.5000
    
    
      24
      25
      0
      3
      female
      3
      1
      349909
      21.0750
    
    
      25
      26
      1
      3
      female
      1
      5
      347077
      31.3875
    
    
      26
      27
      0
      3
      male
      0
      0
      2631
      7.2250
    
    
      27
      28
      0
      1
      male
      3
      2
      19950
      263.0000
    
    
      28
      29
      1
      3
      female
      0
      0
      330959
      7.8792
    
    
      29
      30
      0
      3
      male
      0
      0
      349216
      7.8958
    
    
      ...
      ...
      ...
      ...
      ...
      ...
      ...
      ...
      ...
    
    
      861
      862
      0
      2
      male
      1
      0
      28134
      11.5000
    
    
      862
      863
      1
      1
      female
      0
      0
      17466
      25.9292
    
    
      863
      864
      0
      3
      female
      8
      2
      CA. 2343
      69.5500
    
    
      864
      865
      0
      2
      male
      0
      0
      233866
      13.0000
    
    
      865
      866
      1
      2
      female
      0
      0
      236852
      13.0000
    
    
      866
      867
      1
      2
      female
      1
      0
      SC/PARIS 2149
      13.8583
    
    
      867
      868
      0
      1
      male
      0
      0
      PC 17590
      50.4958
    
    
      868
      869
      0
      3
      male
      0
      0
      345777
      9.5000
    
    
      869
      870
      1
      3
      male
      1
      1
      347742
      11.1333
    
    
      870
      871
      0
      3
      male
      0
      0
      349248
      7.8958
    
    
      871
      872
      1
      1
      female
      1
      1
      11751
      52.5542
    
    
      872
      873
      0
      1
      male
      0
      0
      695
      5.0000
    
    
      873
      874
      0
      3
      male
      0
      0
      345765
      9.0000
    
    
      874
      875
      1
      2
      female
      1
      0
      P/PP 3381
      24.0000
    
    
      875
      876
      1
      3
      female
      0
      0
      2667
      7.2250
    
    
      876
      877
      0
      3
      male
      0
      0
      7534
      9.8458
    
    
      877
      878
      0
      3
      male
      0
      0
      349212
      7.8958
    
    
      878
      879
      0
      3
      male
      0
      0
      349217
      7.8958
    
    
      879
      880
      1
      1
      female
      0
      1
      11767
      83.1583
    
    
      880
      881
      1
      2
      female
      0
      1
      230433
      26.0000
    
    
      881
      882
      0
      3
      male
      0
      0
      349257
      7.8958
    
    
      882
      883
      0
      3
      female
      0
      0
      7552
      10.5167
    
    
      883
      884
      0
      2
      male
      0
      0
      C.A./SOTON 34068
      10.5000
    
    
      884
      885
      0
      3
      male
      0
      0
      SOTON/OQ 392076
      7.0500
    
    
      885
      886
      0
      3
      female
      0
      5
      382652
      29.1250
    
    
      886
      887
      0
      2
      male
      0
      0
      211536
      13.0000
    
    
      887
      888
      1
      1
      female
      0
      0
      112053
      30.0000
    
    
      888
      889
      0
      3
      female
      1
      2
      W./C. 6607
      23.4500
    
    
      889
      890
      1
      1
      male
      0
      0
      111369
      30.0000
    
    
      890
      891
      0
      3
      male
      0
      0
      370376
      7.7500
    
  

891 rows × 8 columns



In [68]:

    
titanic.fillna('Fill NaN').head()









    Out[68]:







  
    
      
      PassengerId
      Survived
      Pclass
      Sex
      Age
      SibSp
      Parch
      Ticket
      Fare
      Cabin
      Embarked
    
  
  
    
      0
      1
      0
      3
      male
      22
      1
      0
      A/5 21171
      7.2500
      Fill NaN
      S
    
    
      1
      2
      1
      1
      female
      38
      1
      0
      PC 17599
      71.2833
      C85
      C
    
    
      2
      3
      1
      3
      female
      26
      0
      0
      STON/O2. 3101282
      7.9250
      Fill NaN
      S
    
    
      3
      4
      1
      1
      female
      35
      1
      0
      113803
      53.1000
      C123
      S
    
    
      4
      5
      0
      3
      male
      35
      0
      0
      373450
      8.0500
      Fill NaN
      S

Concating merging and ...



In [16]:

    
df1 = pd.DataFrame({'A': ['A0', 'A1', 'A2', 'A3'],
                        'B': ['B0', 'B1', 'B2', 'B3'],
                        'C': ['C0', 'C1', 'C2', 'C3'],
                        'D': ['D0', 'D1', 'D2', 'D3']},
                        index=[0, 1, 2, 3])

df2 = pd.DataFrame({'A': ['A4', 'A5', 'A6', 'A7'],
                        'B': ['B4', 'B5', 'B6', 'B7'],
                        'C': ['C4', 'C5', 'C6', 'C7'],
                        'D': ['D4', 'D5', 'D6', 'D7']},
                         index=[4, 5, 6, 7]) 

df3 = pd.DataFrame({'A': ['A8', 'A9', 'A10', 'A11'],
                        'B': ['B8', 'B9', 'B10', 'B11'],
                        'C': ['C8', 'C9', 'C10', 'C11'],
                        'D': ['D8', 'D9', 'D10', 'D11']},
                        index=[8, 9, 10, 11])



In [17]:

    
df1



In [18]:

    
df2



In [19]:

    
df3

Concatenation



In [20]:

    
frames = [df1, df2, df3 ]



In [21]:

    
pd.concat(frames)
#pd.concat(frames, ignore_index=True)



In [22]:

    
pd.concat(frames, axis=1)









    Out[22]:







  
    
      
      A
      B
      C
      D
      A
      B
      C
      D
      A
      B
      C
      D
    
  
  
    
      0
      A0
      B0
      C0
      D0
      NaN
      NaN
      NaN
      NaN
      NaN
      NaN
      NaN
      NaN
    
    
      1
      A1
      B1
      C1
      D1
      NaN
      NaN
      NaN
      NaN
      NaN
      NaN
      NaN
      NaN
    
    
      2
      A2
      B2
      C2
      D2
      NaN
      NaN
      NaN
      NaN
      NaN
      NaN
      NaN
      NaN
    
    
      3
      A3
      B3
      C3
      D3
      NaN
      NaN
      NaN
      NaN
      NaN
      NaN
      NaN
      NaN
    
    
      4
      NaN
      NaN
      NaN
      NaN
      A4
      B4
      C4
      D4
      NaN
      NaN
      NaN
      NaN
    
    
      5
      NaN
      NaN
      NaN
      NaN
      A5
      B5
      C5
      D5
      NaN
      NaN
      NaN
      NaN
    
    
      6
      NaN
      NaN
      NaN
      NaN
      A6
      B6
      C6
      D6
      NaN
      NaN
      NaN
      NaN
    
    
      7
      NaN
      NaN
      NaN
      NaN
      A7
      B7
      C7
      D7
      NaN
      NaN
      NaN
      NaN
    
    
      8
      NaN
      NaN
      NaN
      NaN
      NaN
      NaN
      NaN
      NaN
      A8
      B8
      C8
      D8
    
    
      9
      NaN
      NaN
      NaN
      NaN
      NaN
      NaN
      NaN
      NaN
      A9
      B9
      C9
      D9
    
    
      10
      NaN
      NaN
      NaN
      NaN
      NaN
      NaN
      NaN
      NaN
      A10
      B10
      C10
      D10
    
    
      11
      NaN
      NaN
      NaN
      NaN
      NaN
      NaN
      NaN
      NaN
      A11
      B11
      C11
      D11



In [23]:

    
df1.append(df2)

Mergeing



In [77]:

    
left = pd.DataFrame({'key': ['K0', 'K1', 'K2', 'K3'],
                     'A': ['A0', 'A1', 'A2', 'A3'],
                     'B': ['B0', 'B1', 'B2', 'B3']})
   
right = pd.DataFrame({'key': ['K0', 'K1', 'K2', 'K3'],
                          'C': ['C0', 'C1', 'C2', 'C3'],
                          'D': ['D0', 'D1', 'D2', 'D3']})



In [78]:

    
left



In [79]:

    
right



In [80]:

    
pd.merge(left, right, on= 'key')



In [81]:

    
left = pd.DataFrame({'key1': ['K0', 'K0', 'K1', 'K2'],
                     'key2': ['K0', 'K1', 'K0', 'K1'],
                        'A': ['A0', 'A1', 'A2', 'A3'],
                        'B': ['B0', 'B1', 'B2', 'B3']})
    
right = pd.DataFrame({'key1': ['K0', 'K1', 'K1', 'K2'],
                               'key2': ['K0', 'K0', 'K0', 'K0'],
                                  'C': ['C0', 'C1', 'C2', 'C3'],
                                  'D': ['D0', 'D1', 'D2', 'D3']})



In [82]:

    
pd.merge(left, right, on=['key1', 'key2'])



In [83]:

    
pd.merge(left, right, how='outer', on=['key1', 'key2'])



In [84]:

    
pd.merge(left, right, how='left', on=['key1', 'key2'])



In [85]:

    
pd.merge(left, right, how='right', on=['key1', 'key2'])

Joining



In [86]:

    
left = pd.DataFrame({'A': ['A0', 'A1', 'A2'],
                     'B': ['B0', 'B1', 'B2']},
                      index=['K0', 'K1', 'K2']) 

right = pd.DataFrame({'C': ['C0', 'C2', 'C3'],
                    'D': ['D0', 'D2', 'D3']},
                      index=['K0', 'K2', 'K3'])



In [87]:

    
left



In [88]:

    
right



In [89]:

    
left.join(right)

	A	B	V	D
0	-1.194072	-0.028099	-0.948889	-1.531588
1	-0.030137	-0.732911	0.604193	1.568273
2	0.263154	-1.560707	1.182750	-0.521581
3	-1.527798	-0.577411	-1.407768	-1.287491
4	-0.733531	0.098774	-1.348715	0.350558

	A	B	V	D	E
0	-1.194072	-0.028099	-0.948889	-1.531588	-1.222172
1	-0.030137	-0.732911	0.604193	1.568273	-0.763048
2	0.263154	-1.560707	1.182750	-0.521581	-1.297553
3	-1.527798	-0.577411	-1.407768	-1.287491	-2.105209
4	-0.733531	0.098774	-1.348715	0.350558	-0.634756

	A	B	V	D
0	-1.194072	-0.028099	-0.948889	-1.531588
1	-0.030137	-0.732911	0.604193	1.568273
2	0.263154	-1.560707	1.182750	-0.521581
3	-1.527798	-0.577411	-1.407768	-1.287491
4	-0.733531	0.098774	-1.348715	0.350558

	A	B	C	D	E
0	-0.131864	0.478105	0.759782	-1.163273	0.346242
1	-1.201529	1.419080	-0.180453	0.682591	0.217551
2	-0.538697	0.521623	0.565700	-0.169198	-0.017073
3	0.573575	0.393859	2.964976	1.436765	0.967434
4	-1.053742	1.134712	-0.165858	-0.389600	0.080970

	A	B	C	D
0	-0.131864	0.478105	0.759782	-1.163273
1	-1.201529	1.419080	-0.180453	0.682591
2	-0.538697	0.521623	0.565700	-0.169198
3	0.573575	0.393859	2.964976	1.436765
4	-1.053742	1.134712	-0.165858	-0.389600

	A	B	C	D
0	False	True	True	False
1	False	True	False	True
2	False	True	True	False
3	True	True	True	True
4	False	True	False	False

		A	B
g1	1	1.327113	-0.870679
	2	0.258946	1.492455
	3	2.041487	-0.101779
g2	1	-0.465014	-2.738942
	2	0.666121	-1.009980
	3	-0.459053	0.128703

	PassengerId	Survived	Pclass	Name	Sex	Age	SibSp	Ticket	Fare	Cabin	Embarked
0	1	0	3	Braund, Mr. Owen Harris	male	22.0	1	A/5 21171	7.2500	NaN	S
1	2	1	1	Cumings, Mrs. John Bradley (Florence Briggs Th...	female	38.0	1	PC 17599	71.2833	C85	C
2	3	1	3	Heikkinen, Miss. Laina	female	26.0	0	STON/O2. 3101282	7.9250	NaN	S
3	4	1	1	Futrelle, Mrs. Jacques Heath (Lily May Peel)	female	35.0	1	113803	53.1000	C123	S
4	5	0	3	Allen, Mr. William Henry	male	35.0	0	373450	8.0500	NaN	S

	A	B	C	D
0	A0	B0	C0	D0
1	A1	B1	C1	D1
2	A2	B2	C2	D2
3	A3	B3	C3	D3
4	A4	B4	C4	D4
5	A5	B5	C5	D5
6	A6	B6	C6	D6
7	A7	B7	C7	D7
8	A8	B8	C8	D8
9	A9	B9	C9	D9
10	A10	B10	C10	D10
11	A11	B11	C11	D11

	A	B	C	D	A	B	C	D	A	B	C	D
0	A0	B0	C0	D0	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN
1	A1	B1	C1	D1	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN
2	A2	B2	C2	D2	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN
3	A3	B3	C3	D3	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN
4	NaN	NaN	NaN	NaN	A4	B4	C4	D4	NaN	NaN	NaN	NaN
5	NaN	NaN	NaN	NaN	A5	B5	C5	D5	NaN	NaN	NaN	NaN
6	NaN	NaN	NaN	NaN	A6	B6	C6	D6	NaN	NaN	NaN	NaN
7	NaN	NaN	NaN	NaN	A7	B7	C7	D7	NaN	NaN	NaN	NaN
8	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	A8	B8	C8	D8
9	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	A9	B9	C9	D9
10	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	A10	B10	C10	D10
11	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	A11	B11	C11	D11

	A	B	C	D
0	A0	B0	C0	D0
1	A1	B1	C1	D1
2	A2	B2	C2	D2
3	A3	B3	C3	D3
4	A4	B4	C4	D4
5	A5	B5	C5	D5
6	A6	B6	C6	D6
7	A7	B7	C7	D7

	A	B	key1	key2	C	D
0	A0	B0	K0	K0	C0	D0
1	A1	B1	K0	K1	NaN	NaN
2	A2	B2	K1	K0	C1	D1
3	A2	B2	K1	K0	C2	D2
4	A3	B3	K2	K1	NaN	NaN
5	NaN	NaN	K2	K0	C3	D3

	A	B	key1	key2	C	D
0	A0	B0	K0	K0	C0	D0
1	A1	B1	K0	K1	NaN	NaN
2	A2	B2	K1	K0	C1	D1
3	A2	B2	K1	K0	C2	D2
4	A3	B3	K2	K1	NaN	NaN

	A	B	C	D
0	A0	B0	C0	D0
1	A1	B1	C1	D1
2	A2	B2	C2	D2
3	A3	B3	C3	D3
4	A4	B4	C4	D4
5	A5	B5	C5	D5
6	A6	B6	C6	D6
7	A7	B7	C7	D7
8	A8	B8	C8	D8
9	A9	B9	C9	D9
10	A10	B10	C10	D10
11	A11	B11	C11	D11

	A	B	C	D	A	B	C	D	A	B	C	D
0	A0	B0	C0	D0	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN
1	A1	B1	C1	D1	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN
2	A2	B2	C2	D2	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN
3	A3	B3	C3	D3	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN
4	NaN	NaN	NaN	NaN	A4	B4	C4	D4	NaN	NaN	NaN	NaN
5	NaN	NaN	NaN	NaN	A5	B5	C5	D5	NaN	NaN	NaN	NaN
6	NaN	NaN	NaN	NaN	A6	B6	C6	D6	NaN	NaN	NaN	NaN
7	NaN	NaN	NaN	NaN	A7	B7	C7	D7	NaN	NaN	NaN	NaN
8	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	A8	B8	C8	D8
9	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	A9	B9	C9	D9
10	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	A10	B10	C10	D10
11	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	A11	B11	C11	D11

	A	B	C	D
0	A0	B0	C0	D0
1	A1	B1	C1	D1
2	A2	B2	C2	D2
3	A3	B3	C3	D3
4	A4	B4	C4	D4
5	A5	B5	C5	D5
6	A6	B6	C6	D6
7	A7	B7	C7	D7

	A	B	key1	key2	C	D
0	A0	B0	K0	K0	C0	D0
1	A1	B1	K0	K1	NaN	NaN
2	A2	B2	K1	K0	C1	D1
3	A2	B2	K1	K0	C2	D2
4	A3	B3	K2	K1	NaN	NaN
5	NaN	NaN	K2	K0	C3	D3

	A	B	key1	key2	C	D
0	A0	B0	K0	K0	C0	D0
1	A1	B1	K0	K1	NaN	NaN
2	A2	B2	K1	K0	C1	D1
3	A2	B2	K1	K0	C2	D2
4	A3	B3	K2	K1	NaN	NaN

	A	B	C	D
0	A0	B0	C0	D0
1	A1	B1	C1	D1
2	A2	B2	C2	D2
3	A3	B3	C3	D3
4	A4	B4	C4	D4
5	A5	B5	C5	D5
6	A6	B6	C6	D6
7	A7	B7	C7	D7
8	A8	B8	C8	D8
9	A9	B9	C9	D9
10	A10	B10	C10	D10
11	A11	B11	C11	D11

	A	B	C	D	A	B	C	D	A	B	C	D
0	A0	B0	C0	D0	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN
1	A1	B1	C1	D1	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN
2	A2	B2	C2	D2	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN
3	A3	B3	C3	D3	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN
4	NaN	NaN	NaN	NaN	A4	B4	C4	D4	NaN	NaN	NaN	NaN
5	NaN	NaN	NaN	NaN	A5	B5	C5	D5	NaN	NaN	NaN	NaN
6	NaN	NaN	NaN	NaN	A6	B6	C6	D6	NaN	NaN	NaN	NaN
7	NaN	NaN	NaN	NaN	A7	B7	C7	D7	NaN	NaN	NaN	NaN
8	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	A8	B8	C8	D8
9	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	A9	B9	C9	D9
10	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	A10	B10	C10	D10
11	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	A11	B11	C11	D11

	A	B	C	D
0	A0	B0	C0	D0
1	A1	B1	C1	D1
2	A2	B2	C2	D2
3	A3	B3	C3	D3
4	A4	B4	C4	D4
5	A5	B5	C5	D5
6	A6	B6	C6	D6
7	A7	B7	C7	D7

	A	B	key1	key2	C	D
0	A0	B0	K0	K0	C0	D0
1	A1	B1	K0	K1	NaN	NaN
2	A2	B2	K1	K0	C1	D1
3	A2	B2	K1	K0	C2	D2
4	A3	B3	K2	K1	NaN	NaN
5	NaN	NaN	K2	K0	C3	D3

	A	B	key1	key2	C	D
0	A0	B0	K0	K0	C0	D0
1	A1	B1	K0	K1	NaN	NaN
2	A2	B2	K1	K0	C1	D1
3	A2	B2	K1	K0	C2	D2
4	A3	B3	K2	K1	NaN	NaN