# pandas

This notebook records some tips for the pandas module.



In [1]:

    
import pandas as pd
import numpy as np

Create dataframe

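Create a dataframe of random integers in the range 0 to 99 (the upper bound of randint is excluded), with 10 rows and 4 columns labelled A to D.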



In [2]:

    
df = pd.DataFrame(np.random.randint(0,100,size=(10, 4)), columns=list('ABCD'))
df
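Because the values above are random, they change on every run. If reproducible values are wanted, a seeded generator is one option. The following is a minimal sketch, not part of the original notebook; the helper name `random_frame` and the seed value 42 are assumptions made for illustration.

```python
import numpy as np
import pandas as pd

def random_frame(seed):
    # Hypothetical helper: a generator created from a fixed seed
    # produces the same sequence of numbers every time.
    rng = np.random.default_rng(seed)
    # integers(0, 100) draws from 0..99; the upper bound is excluded.
    return pd.DataFrame(rng.integers(0, 100, size=(10, 4)), columns=list("ABCD"))

first = random_frame(42)
second = random_frame(42)

# Both frames were built from the same seed, so they hold the same values.
print(first.equals(second))
```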

.loc

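Use .loc to select rows and columns by label. The labels are the values of the index (for rows) and the column names (for columns). Unlike ordinary Python slicing, a slice with .loc includes the end label, so 3:9:2 selects the rows labelled 3, 5, 7 and 9, and 'B': selects column B and every column after it.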



In [3]:

    
df.loc[3:9:2, 'B':]
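To make the inclusive end label concrete, the sketch below compares .loc with the position-based .iloc on the same slice. The frame `demo` is a hypothetical stand-in with fixed values, so the comparison does not depend on random data.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with a default integer index 0..9 and columns A..D.
demo = pd.DataFrame(np.arange(40).reshape(10, 4), columns=list("ABCD"))

# Label-based: the end label 9 is included.
by_label = demo.loc[3:9:2, "B":]

# Position-based: the end position 9 is excluded, as in ordinary Python slicing.
by_position = demo.iloc[3:9:2, 1:]

print(list(by_label.index))
print(list(by_position.index))
```

With a default integer index the labels and positions coincide, which is why the two selections differ only in whether row 9 is included.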

Change index and to_datetime

Use to_datetime to convert the 'time' column from strings to pandas datetime values, then set that column as the index of the dataframe. pop removes the column from the dataframe and returns it, so 'time' does not remain as a regular column.



In [4]:

    
df2 = pd.DataFrame([['2017-01-01', 253, 234], ['2017-02-04', 283, 333], ['2017-02-11', 3, 55]], columns=['time', 'data1', 'data2'])
df2



In [5]:

    
df2.index = pd.to_datetime(df2.pop('time'))
df2
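The same result can be reached in two explicit steps with set_index. The sketch below is an alternative under assumed names (`events`, `february`), not the notebook's own method. It also shows one reason a datetime index is convenient: rows can be selected with a partial date string.

```python
import pandas as pd

# Hypothetical copy of the data used above.
events = pd.DataFrame(
    [["2017-01-01", 253, 234], ["2017-02-04", 283, 333], ["2017-02-11", 3, 55]],
    columns=["time", "data1", "data2"],
)

# Step 1: parse the strings into datetime values.
events["time"] = pd.to_datetime(events["time"])

# Step 2: move the column into the index; set_index returns a new dataframe.
events = events.set_index("time")

# With a DatetimeIndex, a partial date string selects every row in that month.
february = events.loc["2017-02"]
print(february)
```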

	A	B	C	D
0	48	20	45	57
1	59	78	37	89
2	58	56	71	42
3	39	33	86	2
4	75	89	56	1
5	77	45	11	23
6	44	55	38	22
7	30	37	99	93
8	60	80	80	11
9	53	64	82	68

	B	C	D
3	33	86	2
5	45	11	23
7	37	99	93
9	64	82	68
