utilities: usage and examples

Package import


In [3]:
import sys
sys.executable


Out[3]:
'/usr/bin/python'

In [1]:
import pandas as pd
import utilities as ut

I. Loading text, CSV, and Excel files from disk

Example 1: Loading .csv file as pandas dataframe


In [4]:
filename = '../data/test/test2.csv'
ex1 = ut.ReadTableFromFile(filename, sep=',')
ex1.to_pandas_df()

In [5]:
ex1.table


Out[5]:
Col1 Col2 Col3
0 A B C
1 D E F
2 G H I
3 J K L

In [8]:
t1 = ex1.table.to_dict()

In [15]:
t2 = pd.DataFrame.from_dict(t1, orient='index')

In [16]:
t2


Out[16]:
0 1 2 3
Col1 A D G J
Col2 B E H K
Col3 C F I L

Example 2: Loading .xlsx file as pandas dataframe


In [2]:
filename = '../data/test/test3.xlsx'
ex2 = ut.ReadTableFromFile(filename)
ex2.to_pandas_df()

In [3]:
ex2.table


Out[3]:
Col1 Col2 Col3
0 A B C
1 D E F
2 G H I
3 J K L

Example 3: Loading .txt file as list


In [2]:
filename = '../data/test/test4.txt'
ex3 = ut.ReadTableFromFile(filename, header=None)
ex3.to_2d_list()

In [3]:
ex3.itemlist


Out[3]:
[['A', '1'], ['B', '2'], ['D', '4'], ['F', '6']]

Example 4: Loading .txt file as dict


In [5]:
filename = '../data/test/test4.txt'
ex4 = ut.ReadTableFromFile(filename, header=None, index_col=0)
ex4.to_dict()

In [6]:
ex4.itemdict


Out[6]:
{'A': '1', 'B': '2', 'D': '4', 'F': '6'}
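
The list and dict forms in Examples 3 and 4 can be approximated with the standard library alone. This is only a sketch, not the actual implementation of ReadTableFromFile: the helper names read_2d_list and read_dict are hypothetical, and it assumes test4.txt holds two whitespace-separated fields per line (the real delimiter is not shown in this notebook).

```python
import os
import tempfile


def read_2d_list(path):
    """Hypothetical helper: return each non-empty line as a list of string fields."""
    with open(path, encoding="utf-8") as handle:
        # str.split() with no argument splits on any run of whitespace (tabs or spaces).
        return [line.split() for line in handle if line.strip()]


def read_dict(path):
    """Hypothetical helper: map the first field of each row to the second."""
    return {row[0]: row[1] for row in read_2d_list(path)}


# Stand-in for test4.txt; the tab delimiter is an assumption.
with tempfile.TemporaryDirectory() as tmpdir:
    sample_path = os.path.join(tmpdir, "test4.txt")
    with open(sample_path, "w", encoding="utf-8") as handle:
        handle.write("A\t1\nB\t2\nD\t4\nF\t6\n")
    itemlist = read_2d_list(sample_path)
    itemdict = read_dict(sample_path)

print(itemlist)
print(itemdict)
```

As in the outputs above, every value stays a string; convert with int() or float() if numbers are needed.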

II.A. Saving / loading pandas dataframes (table attribute of ReadTableFromFile class)


In [2]:
filename = '../data/test/test2.csv'
ex1 = ut.ReadTableFromFile(filename, sep=',')
ex1.to_pandas_df()

Saving to disk


In [5]:
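# ex1.table.to_csv('../data/test/test2_copy.csv')
# ex1.table.to_json('../data/test/test2_copy.json')
# Note: to_msgpack was deprecated in pandas 0.25 and removed in 1.0;
# on newer versions use one of the alternatives above.
ex1.table.to_msgpack('../data/test/test2_copy.msg')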

Loading back as pandas dataframe


In [ ]:
# df = pd.read_json('../data/test/test2_copy.json')
# Note: read_msgpack was deprecated in pandas 0.25 and removed in 1.0.
df = pd.read_msgpack('../data/test/test2_copy.msg')
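
Where a binary format is wanted and the msgpack methods are not available, a pickle round trip is one possible substitute. The sketch below is self-contained: it builds a small frame shaped like test2.csv instead of reading the file, and the temporary path and the .pkl name are assumptions made for illustration.

```python
import os
import tempfile

import pandas as pd

# Stand-in for ex1.table (same shape as the test2.csv output shown earlier).
df_original = pd.DataFrame(
    {
        "Col1": ["A", "D", "G", "J"],
        "Col2": ["B", "E", "H", "K"],
        "Col3": ["C", "F", "I", "L"],
    }
)

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "test2_copy.pkl")
    df_original.to_pickle(path)       # save to disk
    df_restored = pd.read_pickle(path)  # load back as a DataFrame

# True when values, dtypes, index and columns all survive the round trip.
print(df_restored.equals(df_original))
```

Pickle files should only be loaded from trusted sources, since unpickling can execute arbitrary code.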

II.B. Saving / loading python objects (of any type)

Saving to disk


In [2]:
itemdict = {'A': '1', 'B': '2', 'D': '4', 'F': '6'}
# Pass the path without a file extension.
# ut.save_pkl(itemdict, '../data/test/itemdict')
# ut.save_json(itemdict, '../data/test/itemdict')
ut.save_msg(itemdict, '../data/test/itemdict')

Loading from disk


In [2]:
# Pass the path without a file extension.
# obj = ut.load_pkl('../data/test/itemdict')
# obj = ut.load_json('../data/test/itemdict')
obj = ut.load_msg('../data/test/itemdict')
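
The source of the save/load helpers is not shown here, so the following is a guess at their general shape using only the standard library. The *_sketch function names are hypothetical, and the assumption that the extension is appended to the given path is inferred from the paths being passed without one.

```python
import json
import os
import pickle
import tempfile


def save_pkl_sketch(obj, path_without_ext):
    """Hypothetical stand-in for ut.save_pkl: appends '.pkl' to the path."""
    with open(path_without_ext + ".pkl", "wb") as handle:
        pickle.dump(obj, handle)


def load_pkl_sketch(path_without_ext):
    """Hypothetical stand-in for ut.load_pkl."""
    with open(path_without_ext + ".pkl", "rb") as handle:
        return pickle.load(handle)


def save_json_sketch(obj, path_without_ext):
    """Hypothetical stand-in for ut.save_json: appends '.json' to the path."""
    with open(path_without_ext + ".json", "w", encoding="utf-8") as handle:
        json.dump(obj, handle)


def load_json_sketch(path_without_ext):
    """Hypothetical stand-in for ut.load_json."""
    with open(path_without_ext + ".json", encoding="utf-8") as handle:
        return json.load(handle)


itemdict = {'A': '1', 'B': '2', 'D': '4', 'F': '6'}

with tempfile.TemporaryDirectory() as tmpdir:
    base = os.path.join(tmpdir, "itemdict")  # no extension, as in the cells above
    save_pkl_sketch(itemdict, base)
    save_json_sketch(itemdict, base)
    pkl_obj = load_pkl_sketch(base)
    json_obj = load_json_sketch(base)

print(pkl_obj == itemdict, json_obj == itemdict)
```

Pickle handles arbitrary Python objects but should only be loaded from trusted files; JSON is portable but limited to dicts, lists, strings, numbers, booleans and None, and it turns non-string dict keys into strings.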

III.A. Conversion: Pandas dataframes to / from dictionaries


In [ ]:
filename = '../data/test/test2.csv'
ex1 = ut.ReadTableFromFile(filename, sep=',')
ex1.to_pandas_df()

df to dict


In [ ]:
itemdict = ex1.table.to_dict()

dict to df


In [19]:
# orient='columns': dict keys become column labels.
# orient='index' would make the keys row labels instead.
df = pd.DataFrame.from_dict(itemdict, orient='columns')
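
The effect of the orient argument is easiest to see on a small frame. The sketch below uses a two-row stand-in instead of test2.csv so that it runs without the data file; the variable names by_columns and by_index are illustrative.

```python
import pandas as pd

df = pd.DataFrame({"Col1": ["A", "D"], "Col2": ["B", "E"]})

# Default to_dict() nests as {column -> {row label -> value}}.
itemdict = df.to_dict()

# Keys become column labels: this rebuilds the original layout.
by_columns = pd.DataFrame.from_dict(itemdict, orient="columns")

# Keys become row labels: this is the transpose of the original.
by_index = pd.DataFrame.from_dict(itemdict, orient="index")

print(by_columns)
print(by_index)
```

This is why the round trip in Example 1, which used orient='index', printed Col1 to Col3 as rows.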

III.B. Conversion: Pandas dataframes to / from R dataframes

Rdf to df


In [2]:
filename = '../data/test/mtcars.Rda'
dataframe_name = 'mtcars'
df = ut.r2pandas_df(filename, dataframe_name)

df to Rdf


In [4]:
filename = '../data/test/mtcars2.Rda'
dataframe_name = 'mtcars2'
ut.pandas2r_df(df, dataframe_name, filename)