Apache Spot's IPython Advanced Mode

Proxy

This guide shows how to request data and how to display and manipulate it with libraries such as pandas.

Import Libraries

The next cell imports the libraries needed to execute the functions in this notebook. Do not remove it.


In [ ]:
import datetime
import json
import pandas as pd
import numpy as np
import linecache, bisect
import os

## The name of the current directory is used as the date (YYYYMMDD)
spath = os.getcwd()
path = spath.split("/")
date = path[-1]
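
The cell above takes the last component of the working directory as the date, and the query cells later parse it with the `%Y%m%d` format. The following is a minimal sketch of that step with an explicit format check, assuming the notebook runs from a directory named like `20161008`. The helper name `date_from_path` and the sample path are hypothetical and not part of Spot.

```python
import datetime
import os


def date_from_path(path):
    """Return the last path component as a 'YYYY-MM-DD' string.

    Raises ValueError when the directory name is not a YYYYMMDD date.
    """
    folder = os.path.basename(os.path.normpath(path))
    return datetime.datetime.strptime(folder, '%Y%m%d').strftime('%Y-%m-%d')


# Hypothetical directory layout, used only for illustration.
example_date = date_from_path('/home/spot/ipynb/proxy/20161008')
print(example_date)
```

Failing early with a `ValueError` makes it obvious when the notebook was opened from an unexpected directory, rather than sending a malformed date to the API.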

Request Data

To request data we use GraphQL (a query language for APIs; more information at http://graphql.org/).

The function below makes a data request. All you need is a query and its variables.


In [ ]:
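## GraphQLClient is not imported in this notebook; it is expected to be
## provided by the Spot notebook environment.
def makeGraphqlRequest(query, variables):
    return GraphQLClient.request(query, variables)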
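
This guide does not show how `GraphQLClient.request` sends the query. By the general GraphQL-over-HTTP convention, a request body is a JSON object with a `query` field and a `variables` field. The sketch below only builds such a body to illustrate how the two arguments fit together. `build_graphql_payload` is a hypothetical helper, and whether Spot's client does exactly this is an assumption.

```python
import json


def build_graphql_payload(query, variables=None):
    """Serialize a query and its variables into a JSON request body."""
    return json.dumps({'query': query, 'variables': variables or {}})


# A shortened version of the suspicious query, for illustration only.
payload = build_graphql_payload(
    'query($date:SpotDateType) { proxy { suspicious(date:$date) { clientIp } } }',
    {'date': '2016-10-08'},
)
print(payload)
```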

Now that we have a function, we can run a query like this:

*Note: There is no need to set the date for the query manually. By default, the code reads the date from the current path.


In [ ]:
suspicious_query = """query($date:SpotDateType) {
                            proxy {
                              suspicious(date:$date)
                                  { clientIp
                                    clientToServerBytes
                                    datetime
                                    duration
                                    host
                                    networkContext
                                    referer
                                    requestMethod
                                    responseCode
                                    responseCodeLabel
                                    responseContentType
                                    score
                                    serverIp
                                    serverToClientBytes
                                    uri
                                    uriPath
                                    uriPort
                                    uriQuery
                                    uriRep
                                    userAgent
                                    username
                                    webCategory                                    
                                  }
                            }
                    }"""

##If you want to use a different date for your query, switch the 
##commented/uncommented following lines

variables={
    'date': datetime.datetime.strptime(date, '%Y%m%d').strftime('%Y-%m-%d')
#     'date': "2016-10-08"
    }
 
suspicious_request = makeGraphqlRequest(suspicious_query,variables)

##The variable suspicious_request will contain the resulting data from the query.
results = suspicious_request['data']['proxy']['suspicious']
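
The cell above indexes straight into `suspicious_request['data']`. A GraphQL response may also carry an `errors` list, in which case `data` can be missing or null. The sketch below checks for that before walking the result. It assumes the response is a plain dictionary, as the indexing above suggests. `extract_field` and the sample response are hypothetical.

```python
def extract_field(response, *keys):
    """Walk response['data'] along keys, raising if errors were reported."""
    errors = response.get('errors')
    if errors:
        messages = [
            str(e.get('message', e)) if isinstance(e, dict) else str(e)
            for e in errors
        ]
        raise RuntimeError('GraphQL request failed: ' + '; '.join(messages))
    node = response.get('data') or {}
    for key in keys:
        node = node[key]
    return node


# Hypothetical response shaped like the result of the suspicious query.
ok_response = {'data': {'proxy': {'suspicious': [{'clientIp': '10.0.0.1'}]}}}
rows = extract_field(ok_response, 'proxy', 'suspicious')
print(rows)
```

With this helper, the last line of the cell could be written as `results = extract_field(suspicious_request, 'proxy', 'suspicious')`.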

Pandas Dataframes

The following cell loads the results into a pandas dataframe.

For more information on how to use pandas, see: https://pandas.pydata.org/pandas-docs/stable/10min.html


In [ ]:
df = pd.read_json(json.dumps(results))
##Print only the selected columns from the dataframe
print(df[['clientIp','uriQuery','datetime','clientToServerBytes','serverToClientBytes', 'host']])
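
Because the query results are already a list of dictionaries, a dataframe can also be built from them directly, without a round trip through a JSON string. The sketch below uses hypothetical sample rows shaped like the `suspicious` results. The timestamp format shown is an assumption, so the conversion is done explicitly with `pd.to_datetime`.

```python
import pandas as pd

# Hypothetical rows shaped like the `suspicious` query results.
sample_results = [
    {'clientIp': '10.0.0.1', 'host': 'example.com',
     'datetime': '2016-10-08 10:15:00',
     'clientToServerBytes': 120, 'serverToClientBytes': 4500},
    {'clientIp': '10.0.0.2', 'host': 'example.org',
     'datetime': '2016-10-08 10:16:30',
     'clientToServerBytes': 300, 'serverToClientBytes': 900},
]

sample_df = pd.DataFrame(sample_results)
# Convert the timestamp strings explicitly instead of relying on inference.
sample_df['datetime'] = pd.to_datetime(sample_df['datetime'])
print(sample_df[['clientIp', 'host', 'datetime']])
```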

Additional operations

Additional operations can be performed on the dataframe, such as sorting, filtering, and grouping the data.

Filtering the data


In [ ]:
##Filter results where the client IP is 10.173.202.136
##The resulting data will be stored in df2 

df2 = df[df['clientIp'].isin(['10.173.202.136'])]
print(df2[['clientIp','uriQuery','datetime','host']])
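
Filters can also be combined. The sketch below joins two boolean masks with `&` to keep rows from one client that returned a large response. The sample dataframe, the variable names, and the 10,000-byte threshold are made up for illustration.

```python
import pandas as pd

# Hypothetical traffic sample using column names from the query above.
traffic = pd.DataFrame({
    'clientIp': ['10.0.0.1', '10.0.0.2', '10.0.0.1', '10.0.0.3'],
    'host': ['example.com', 'example.org', 'example.net', 'example.com'],
    'responseCode': [200, 404, 200, 500],
    'serverToClientBytes': [4500, 900, 150000, 20],
})

# Rows from one client that also returned more than 10,000 bytes.
mask = (traffic['clientIp'] == '10.0.0.1') & (traffic['serverToClientBytes'] > 10000)
large_downloads = traffic[mask]
print(large_downloads[['clientIp', 'host', 'serverToClientBytes']])
```

Each condition needs its own parentheses, because `&` binds more tightly than the comparison operators.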

Ordering the data


In [ ]:
##Sort the results by host in ascending order
srtd = df.sort_values(by="host")
print(srtd[['host','clientIp','uriQuery','datetime']])
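
`sort_values` also accepts several columns with a direction for each. The sketch below orders a hypothetical sample by host and then by response size, largest first within each host.

```python
import pandas as pd

# Hypothetical traffic sample using column names from the query above.
traffic = pd.DataFrame({
    'host': ['example.org', 'example.com', 'example.com'],
    'clientIp': ['10.0.0.2', '10.0.0.1', '10.0.0.3'],
    'serverToClientBytes': [900, 4500, 20],
})

# Sort by host ascending, then by bytes descending within each host.
ranked = traffic.sort_values(by=['host', 'serverToClientBytes'],
                             ascending=[True, False])
print(ranked.head(2))
```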

Grouping the data


In [ ]:
## Group the results by pairs of client IP and host,
## summing the byte counters for each pair
grpd = df.groupby(['clientIp','host'])[["clientToServerBytes","serverToClientBytes"]].sum()
## Print the resulting dataframe with the input and output bytes columns
print(grpd)
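
A group can be summarized with more than one statistic at a time. The sketch below counts the requests per client and host pair and reports the total and the largest response size. The sample data is hypothetical.

```python
import pandas as pd

# Hypothetical traffic sample using column names from the query above.
traffic = pd.DataFrame({
    'clientIp': ['10.0.0.1', '10.0.0.1', '10.0.0.2'],
    'host': ['example.com', 'example.com', 'example.org'],
    'clientToServerBytes': [120, 80, 300],
    'serverToClientBytes': [4500, 500, 900],
})

# One row per (clientIp, host) pair with three statistics on the bytes.
summary = (traffic.groupby(['clientIp', 'host'])['serverToClientBytes']
           .agg(['count', 'sum', 'max'])
           .reset_index())
print(summary)
```

`reset_index()` turns the group keys back into ordinary columns, which makes the result easier to filter or sort afterwards.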

Reset Scored Connections

Uncomment and execute the following cell to reset all scored connections for this day


In [ ]:
# reset_scores = """mutation($date:SpotDateType!) {
#                   proxy{
#                       resetScoredConnections(date:$date){
#                       success
#                       }
#                   }
#               }"""


# variables={
#     'date': datetime.datetime.strptime(date, '%Y%m%d').strftime('%Y-%m-%d')
#     }
 
# request = makeGraphqlRequest(reset_scores,variables)


# print(request['data']['proxy']['resetScoredConnections']['success'])

Sandbox

At this point you can perform your own analysis using the previously provided functions as a guide.

Happy threat hunting!


In [ ]:
#Your code here