In [1]:
import swat
conn = swat.CAS(host, port, username, password)
Now we need to get some data into our session.
In [2]:
cls = conn.read_csv('https://raw.githubusercontent.com/sassoftware/sas-viya-programming/master/data/class.csv',
casout=dict(name='class', caslib='casuser'))
cls
Out[2]:
datastep.runcode Action
The most basic way to run data step code is to call the datastep.runcode action directly. This action runs much like a data step in SAS; you simply specify CAS tables rather than SAS data sets as your input and output data. In this example, we will compute the body mass index (BMI) of the students in the class data set. The output of the datastep.runcode action contains two keys: InputCasTables and OutputCasTables. Each of those keys points to a DataFrame of information about the input or output tables, including a CASTable object in the last column.
In [3]:
out = conn.datastep.runcode('''
data bmi(caslib='casuser');
set class(caslib='casuser');
BMI = weight / (height**2) * 703;
run;
''')
out
Out[3]:
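The factor 703 in the BMI expression is the usual conversion for weights in pounds and heights in inches, which suggests those are the units of the class data. As a quick local sanity check, the same arithmetic can be sketched in plain Python. The function name and the sample values below are illustrative assumptions; nothing here is read from CAS.

```python
def bmi_imperial(weight_lb, height_in):
    """Body mass index from pounds and inches, using the 703 conversion factor."""
    return weight_lb / (height_in ** 2) * 703


# Illustrative values only; they are not pulled from the CAS table.
sample_bmi = bmi_imperial(112.5, 69.0)
print(round(sample_bmi, 2))
```

Comparing a value computed this way against the BMI column of the output table is an easy way to confirm that the data step ran the expression you intended.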
We can pull the output CASTable object out using the following line of code. The loc property is a DataFrame property that allows you to extract elements from a DataFrame by row and column label. In this case, we want the element in row zero, column name casTable.
In [4]:
bmi = out.OutputCasTables.loc[0, 'casTable']
bmi.to_frame()
Out[4]:
As you can see, we have a new CAS table that now includes the BMI column.
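The single-cell lookup used to get the CASTable object can be tried without a CAS server. The sketch below builds a stand-in DataFrame with plain pandas and uses loc (label-based) and iloc (position-based) indexing; the older ix indexer was removed in pandas 1.0. The column names and the placeholder string are assumptions for illustration; the real DataFrame holds a CASTable object in the casTable column.

```python
import pandas as pd

# Stand-in for the OutputCasTables DataFrame (assumed layout, illustrative values).
output_tables = pd.DataFrame({
    "casLib": ["CASUSER"],
    "Name": ["bmi"],
    "casTable": ["<CASTable placeholder>"],
})

# Label-based lookup: row label 0, column label 'casTable'.
by_label = output_tables.loc[0, "casTable"]

# Position-based lookup: first row, last column.
by_position = output_tables.iloc[0, -1]

print(by_label, by_position)
```

Both lookups return the same cell here because the default row index is 0, 1, 2, ... and casTable is the last column.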
datastep Method
CASTable objects have a datastep method that does some of the work of wrapping your data step code with the appropriate input and output data sets. When using this method, you give only the body of the data step code. The output table name is generated automatically. In this case, the output of the method is a CASTable object that references the newly generated table, so you don't have to extract the CASTable from the underlying action results.
In [5]:
bmi2 = cls.datastep('''BMI = weight / (height**2) * 703''')
bmi2.to_frame()
Out[5]:
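To make the idea of "wrapping" concrete, here is a minimal sketch of how a data step body could be surrounded with data, set, and run statements. The helper wrap_datastep is hypothetical and is not SWAT's actual implementation; it only builds a string and never talks to CAS.

```python
def wrap_datastep(body, in_table, out_table, caslib="casuser"):
    """Wrap a data step body with data/set/run statements (hypothetical helper)."""
    body = body.strip()
    # Data step statements end with a semicolon; add one if the body lacks it.
    if not body.endswith(";"):
        body += ";"
    return (
        f"data {out_table}(caslib='{caslib}');\n"
        f"    set {in_table}(caslib='{caslib}');\n"
        f"    {body}\n"
        f"run;"
    )


code = wrap_datastep("BMI = weight / (height**2) * 703", "class", "bmi2")
print(code)
```

The resulting string has the same shape as the code passed to datastep.runcode earlier, which is why only the body has to be supplied when calling the method.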
casds IPython Magic Command
The third way of running data step from Python is reserved for IPython users. IPython has commands that are called "magics". These commands start with % (for one-line commands) or %% (for cell commands) and allow extension developers to add functionality that isn't necessarily Python-based to your environment. Included in SWAT is a package called swat.cas.magics that can be loaded to surface the %%casds magic command. The %%casds magic gives you the ability to enter an entire IPython cell of data step code rather than Python code. This is especially useful in the IPython notebook interface.
Let's give the %%casds magic a try. First we have to load the swat.cas.magics extension.
In [6]:
%load_ext swat.cas.magics
Now we can use the %%casds magic to enter an entire cell of data step code. The %%casds magic requires at least one argument: the CAS connection object where the action should run. In most cases, you'll also want to add the --output option, which names a Python variable that will receive the output of the datastep.runcode action.
In [7]:
%%casds --output out2 conn
data bmi3(caslib='casuser');
set class(caslib='casuser');
BMI = weight / (height**2) * 703;
run;
Out[7]:
Just as before, we can extract the output CASTable object from the returned DataFrames.
In [8]:
bmi3 = out2.OutputCasTables.loc[0, 'casTable']
bmi3.to_frame()
Out[8]:
If you are an existing SAS user, you may be relieved to find that you can still use data step in the CAS environment. Even better, you can run it from Python. This blend of languages and environments gives you an enormous number of possibilities for data analysis, and should make SAS programmers feel right at home in Python.
In [9]:
conn.close()