In the lab we commonly use SAS as our main statistics package. Sometimes you want to take a SAS dataset and work with it in Python. Typically you would have to use SAS to export the dataset as a CSV and then import it in Python. However, there is a Python module that allows you to read a SAS dataset directly and load its contents into a pandas dataframe.
Install the sas7bdat library.
pip install sas7bdat --user
Once installed, you can read a dataset into a dataframe as follows.
In [1]:
import sas7bdat
In [5]:
with sas7bdat.SAS7BDAT('fbgn2coord.sas7bdat') as FH:
    df = FH.to_data_frame()
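If you load SAS datasets often, the pattern in the cell above can be wrapped in a small function. The following is only a minimal sketch: the helper name `load_sas_dataset` is made up for illustration and is not part of the sas7bdat library. It uses the same `SAS7BDAT` context manager and `to_data_frame()` call shown above, and adds an explicit check that the file exists.

```python
from pathlib import Path


def load_sas_dataset(path):
    """Read a .sas7bdat file into a pandas dataframe.

    Hypothetical convenience wrapper, not part of the sas7bdat package.
    """
    path = Path(path)
    # Fail early with a clear message if the file is not there.
    if not path.is_file():
        raise FileNotFoundError(f"No SAS dataset found at {path}")

    # Imported here so the path check above runs even when the
    # library is not installed yet.
    import sas7bdat

    # Same calls as in the notebook cell above.
    with sas7bdat.SAS7BDAT(str(path)) as fh:
        return fh.to_data_frame()
```

With this in place, `df = load_sas_dataset('fbgn2coord.sas7bdat')` should give the same dataframe as the cell above.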
In [8]:
df.head()
Out[8]:
This package also comes with a nice command line tool that allows you to get the list of columns from a file.
In [9]:
%%bash
sas7bdat_to_csv --header fbgn2coord.sas7bdat