In [0]:
%load_ext grr_colab.ipython_extension
In [0]:
import grr_colab
Specifying GRR Colab flags:
In [0]:
grr_colab.flags.FLAGS.set_default('grr_http_api_endpoint', 'http://localhost:8000/')
grr_colab.flags.FLAGS.set_default('grr_admin_ui_url', 'http://localhost:8000/')
grr_colab.flags.FLAGS.set_default('grr_auth_api_user', 'admin')
grr_colab.flags.FLAGS.set_default('grr_auth_password', 'admin')
GRR magics let you search for clients and then choose a single client to work with. Magic results are returned as pandas dataframes unless they are primitive values.
You can search for clients by username, hostname, client labels, etc. The results are sorted by the last seen column.
In [0]:
df = %grr_search_clients -u admin
df[['online', 'online.pretty', 'client_id', 'last_seen_ago', 'last_seen_at.pretty']]
Out[0]:
There is a shortcut that searches for online clients only, so you don't need to filter the dataframe yourself.
In [0]:
df = %grr_search_online_clients -u admin
df[['online', 'online.pretty', 'client_id', 'last_seen_ago', 'last_seen_at.pretty']]
Out[0]:
Every datetime field has two representations: the original one, expressed in microseconds, and the pretty one, which is a pandas timestamp.
In [0]:
df[['last_seen_at', 'last_seen_at.pretty']]
Out[0]:
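The relationship between the two columns can be illustrated with the standard library alone. This is a minimal sketch that assumes the raw value counts microseconds since the Unix epoch in UTC; the helper name micros_to_datetime is hypothetical and not part of GRR, and the notebook itself uses pandas timestamps rather than datetime objects.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the raw column stores microseconds since the Unix epoch (UTC).
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def micros_to_datetime(micros):
    """Convert microseconds since the Unix epoch into an aware UTC datetime."""
    # Integer timedelta arithmetic avoids float rounding on large values.
    return EPOCH + timedelta(microseconds=micros)


print(micros_to_datetime(1_500_000))
```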
To work with a client, you first need to select it. This means that magic commands can work with only one client at a time (the Python API has no such restriction). To set a client you need either a hostname (this works only if exactly one client is set up for that hostname) or a client ID, which you can get from the search results dataframe.
In [0]:
client_id = df['client_id'][0]
%grr_set_client -c {client_id}
%grr_id
Out[0]:
An attempt to set a client with a hostname that has multiple clients will lead to an exception.
If you don't have a valid approval for the selected client, you will get an error when you attempt to run a flow on it. You can request an approval with a magic command, specifying the reason and a list of approvers.
In [0]:
%grr_request_approval -r "For testing" -a admin
This function will not wait until the approval is granted. If you need your code to wait until it's granted, use grr_request_approval_and_wait instead.
In addition to the selected client, the working directory is also saved. This means that you can use relative paths instead of absolute ones. Note that the existence of directories is not checked, so you will not get an error if you cd into a directory that does not exist.
Initially you are in the root directory.
In [0]:
%grr_pwd
Out[0]:
In [0]:
%grr_cd tmp/foo/bar
%grr_pwd
Out[0]:
In [0]:
%grr_cd ../baz
%grr_pwd
Out[0]:
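The cd sequence above can be mimicked with pure path arithmetic, which also shows why a nonexistent directory raises no error. This is only a sketch of the idea using posixpath; the resolve helper is hypothetical, and GRR's own implementation may differ.

```python
import posixpath


def resolve(cwd, path):
    """Resolve `path` against `cwd` without touching any filesystem."""
    # posixpath.join discards `cwd` when `path` is absolute.
    return posixpath.normpath(posixpath.join(cwd, path))


cwd = "/"                          # initial working directory
cwd = resolve(cwd, "tmp/foo/bar")  # like: %grr_cd tmp/foo/bar
cwd = resolve(cwd, "../baz")       # like: %grr_cd ../baz
print(cwd)
```

Because nothing here queries the client, the resulting path is purely textual until a later command such as ls actually uses it.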
You can ls the current directory and any other directories specified by relative and absolute paths.
Note: most file-related magics start flows and fetch live data from the client. This means that the client has to be online for them to work.
In [0]:
df = %grr_ls
df
Out[0]:
Stat mode has two representations: number and UNIX-style:
In [0]:
df[['st_mode', 'st_mode.pretty']]
Out[0]:
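For comparison, Python's standard library performs the same kind of conversion from a numeric mode to a UNIX-style string. The sketch below uses stat.filemode on hand-written mode values; it illustrates the two representations and is not taken from GRR's code.

```python
import stat

# A regular file with permissions rw-r--r-- and a directory with rwxr-xr-x.
for mode in (0o100644, 0o040755):
    print(mode, stat.filemode(mode))
```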
In [0]:
%grr_ls ../baz/dir2
Out[0]:
In [0]:
%grr_ls /tmp/foo
Out[0]:
To see the metadata of a file, call the grr_stat function.
In [0]:
%grr_stat file1
Out[0]:
You can use globbing for stat:
In [0]:
%grr_stat "file*"
Out[0]:
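As a rough local analogy for what a pattern like "file*" selects, the sketch below filters a hand-written list of names with fnmatch. The names are made up for illustration, and GRR's glob syntax may support more than fnmatch does.

```python
import fnmatch

# Hypothetical directory contents, used only for illustration.
names = ["file1", "file2", "dir1", "profile"]

# The pattern must match the whole name, so "profile" is not selected.
matched = fnmatch.filter(names, "file*")
print(matched)
```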
You can print the first bytes of a file:
In [0]:
%grr_head file1 -c 30
Out[0]:
Although the original bash head command has no offset option, you can specify an offset in grr_head:
In [0]:
%grr_head file1 -c 30 -o 20
Out[0]:
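The count and offset options correspond to a seek followed by a bounded read. The sketch below shows that idea on an in-memory byte stream; the head function is a hypothetical stand-in, not GRR's implementation.

```python
import io


def head(stream, count, offset=0):
    """Return up to `count` bytes of `stream`, starting at byte `offset`."""
    stream.seek(offset)
    return stream.read(count)


data = io.BytesIO(b"0123456789abcdefghijklmnopqrstuvwxyz")
print(head(data, 5))             # first 5 bytes
print(head(data, 5, offset=10))  # 5 bytes starting at byte 10
```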
Some of the functions, such as grr_head and grr_ls, have a --cached option (-C for short), which indicates that no calls to the client should be made. In this case the data is fetched from the cache on the server. The server cache is updated only during calls to the client, so it is not always up to date, but accessing it is much faster.
In [0]:
%grr_ls /tmp/foo/baz -C
Out[0]:
In [0]:
%grr_head file1 -C
Out[0]:
Grepping files is also possible. The --fixed-strings option (-F for short) indicates that the pattern is a literal string rather than a regular expression. The --hex-string option (-X for short) lets you pass the pattern as a hex string.
In [0]:
%grr_grep "line" file1
Out[0]:
In [0]:
%grr_grep -F "line" file1
Out[0]:
In [0]:
%grr_grep -X "6c696e65" file1
Out[0]:
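The three invocations above search for the same bytes: the hex string 6c696e65 decodes to "line". The sketch below shows that equivalence and the difference between a regular expression and a literal pattern using Python's re module; it works on an in-memory sample and only illustrates the semantics, not how GRR searches files.

```python
import re

# The hex pattern from the example decodes to the literal bytes b"line".
literal = bytes.fromhex("6c696e65")

text = b"first line\nsecond line\n"

# Literal search: escape the pattern so no byte is treated as a regex operator.
offsets = [m.start() for m in re.finditer(re.escape(literal), text)]
print(literal, offsets)

# As a regular expression, "." matches any byte; as a literal it does not.
regex_hit = re.search(b"l.ne", text) is not None
literal_hit = re.search(re.escape(b"l.ne"), text) is not None
print(regex_hit, literal_hit)
```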
grr_fgrep is a shortcut for the --fixed-strings option. Globbing is also available here.
In [0]:
%grr_fgrep "line" "file*"
Out[0]:
In [0]:
%grr_fgrep -X "6c696e65" file1
Out[0]:
If a file is too large to inspect this way and you'd like to download it, use grr_wget:
In [0]:
%grr_wget file1
Out[0]:
You can also download a cached version:
In [0]:
%grr_wget file1 -C
Out[0]:
You can specify the path type with the --path-type flag (-P for short) for all filesystem-related magics. The available values are os (default), tsk, and registry.
In [0]:
%grr_ls -P os -C
Out[0]:
Names of the functions are the same as in bash for simplicity.
Printing hostname of the client:
In [0]:
%grr_hostname
Getting network interfaces info:
In [0]:
ifaces = %grr_ifconfig
MAC address fields also have two columns: one holding the original bytes value, which is not human-readable, and a pretty one holding the string representation of the MAC address.
In [0]:
ifaces[['mac_address', 'mac_address.pretty']][1:]
Out[0]:
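A pretty MAC address can be derived from the raw bytes in one line. The sketch below assumes the conventional colon-separated lowercase hex format; the exact format GRR produces is not shown in this notebook, and format_mac is a hypothetical helper.

```python
def format_mac(raw):
    """Render raw MAC address bytes as colon-separated hex pairs."""
    return ":".join(f"{byte:02x}" for byte in raw)


print(format_mac(b"\x00\x1a\x2b\x3c\x4d\x5e"))
```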
If a field contains a collection, the cell in the dataframe is itself represented as another dataframe. IP address fields also have two representations.
In [0]:
ifaces['addresses'][1]
Out[0]:
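The same raw-versus-pretty idea applies to IP addresses. The sketch below assumes the raw value is the packed 4-byte (IPv4) or 16-byte (IPv6) form and converts it with the standard ipaddress module; format_ip is a hypothetical helper, not a GRR function.

```python
import ipaddress


def format_ip(packed):
    """Render packed IPv4 (4 bytes) or IPv6 (16 bytes) data as text."""
    return str(ipaddress.ip_address(packed))


print(format_ip(b"\x7f\x00\x00\x01"))     # IPv4 loopback
print(format_ip(bytes(15) + b"\x01"))     # IPv6 loopback
```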
For the uname command only two options are available: --machine, which prints the machine architecture, and --kernel-release.
In [0]:
%grr_uname -m
Out[0]:
In [0]:
%grr_uname -r
Out[0]:
To get the client summary, you can run the interrogate flow.
In [0]:
df = %grr_interrogate
df[['client_id', 'system_info.system', 'system_info.machine']]
Out[0]:
It is also possible to get information about the processes running on the client machine:
In [0]:
ps = %grr_ps
ps[:5]
Out[0]:
To fetch some system information you can also use osquery. Osquery tables are also converted to dataframes.
In [0]:
%grr_osqueryi "SELECT pid, name, cmdline, state, nice, threads FROM processes WHERE pid >= 440 and pid < 600;"
Out[0]:
Running YARA for scanning processes is also available.
In [0]:
import os
pid = os.getpid()
data = "dadasdasdasdjaskdakdaskdakjdkjadkjakjjdsgkngksfkjadsjnfandankjd"
rule = 'rule TextExample {{ strings: $text_string = "{data}" condition: $text_string }}'.format(data=data)
df = %grr_yara '{rule}' -p {pid}
df[['process.pid', 'process.name', 'process.exe']]
Out[0]:
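The doubled braces in the rule template are str.format escapes: they become the single braces that YARA syntax requires, while {data} is substituted. The sketch below shows only that string handling with a placeholder value; it does not run YARA.

```python
data = "example"  # placeholder value, used only for illustration

template = 'rule TextExample {{ strings: $text_string = "{data}" condition: $text_string }}'

# "{{" and "}}" turn into literal braces; "{data}" is replaced.
rule = template.format(data=data)
print(rule)
```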
The default flow timeout is 30 seconds. This is how long a function waits for a flow to complete. You can configure the timeout with grr_set_flow_timeout by specifying the number of seconds to wait. For example, this sets the timeout to one minute:
In [0]:
%grr_set_flow_timeout 60
To make functions wait indefinitely until their flows complete:
In [0]:
%grr_set_no_flow_timeout
To reset the timeout to its default value of 30 seconds:
In [0]:
%grr_set_default_flow_timeout
Setting timeout to 0 tells functions not to wait at all and exit immediately after the flow starts.
In [0]:
%grr_set_flow_timeout 0
If the timeout is exceeded (or you set a timeout of 0), you will see an error with a link to the Admin UI.
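The timeout semantics described above (a number of seconds, 0 for no waiting, no timeout for waiting forever) can be sketched as a generic polling loop. Everything in this block is hypothetical: wait_until_done and FlowTimeoutError are illustrative names, and GRR's actual waiting logic is not shown in this notebook.

```python
import time


class FlowTimeoutError(Exception):
    """Hypothetical error type used only in this sketch."""


def wait_until_done(is_done, timeout=30.0, poll_interval=0.01):
    """Poll `is_done()` until it returns True or the timeout expires.

    timeout=None waits forever; timeout=0 checks once and then gives up.
    """
    deadline = None if timeout is None else time.monotonic() + timeout
    while True:
        if is_done():
            return
        if deadline is not None and time.monotonic() >= deadline:
            raise FlowTimeoutError(f"not finished after {timeout} seconds")
        time.sleep(poll_interval)


# A fake "flow" that reports completion on the third poll.
polls = {"count": 0}


def done_on_third_poll():
    polls["count"] += 1
    return polls["count"] >= 3


wait_until_done(done_on_third_poll, timeout=None)
print(polls["count"])
```

With a timeout of 0 the loop checks once and raises immediately if the work is unfinished, which mirrors the behaviour of not waiting at all.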
You can first list all the artifacts that you can collect:
In [0]:
df = %grr_list_artifacts
df[:2]
Out[0]:
To collect an artifact you just need to provide its name:
In [0]:
%grr_collect "DebianVersion"
Out[0]:
Using the Python API you can work with multiple clients simultaneously. You don't need to select a client to work with; instead, you simply get a client object.
Use the search method to search for clients. You can specify the ip, mac, host, version, user, and labels search criteria. The result is a list of client objects, so you can pick one of them to work with.
In [0]:
clients = grr_colab.Client.search(user='admin')
clients
Out[0]:
In [0]:
clients[0].id
Out[0]:
If you know a client ID or a hostname (in case there is one client installed for this hostname) you can get a client object using one of these values:
In [0]:
client = grr_colab.Client.with_id('C.dc3782aeab2c5b4c')
A number of simple client properties provide information about the client. Unlike the magics API, this API returns objects rather than dataframes for non-primitive values.
Getting the client ID:
In [0]:
client.id
Out[0]:
Getting the client hostname:
In [0]:
client.hostname
Getting network interfaces info:
In [0]:
client.ifaces[1:]
Out[0]:
In [0]:
client.ifaces[1].ifname
Out[0]:
This is a collection of interface objects so you can iterate over it and access interface object fields:
In [0]:
for iface in client.ifaces:
    print(iface.ifname)
Getting the knowledge base for the client and accessing its fields:
In [0]:
client.knowledgebase
client.knowledgebase.os_release
Out[0]:
Getting the architecture of the machine that the client runs on:
In [0]:
client.arch
Out[0]:
Getting kernel version string:
In [0]:
client.kernel
Out[0]:
Getting a list of labels that are associated with this client:
In [0]:
client.labels
Out[0]:
First seen and last seen times are stored as datetime objects:
In [0]:
client.first_seen
Out[0]:
In [0]:
client.last_seen
Out[0]:
As with the magics API, you need to request an approval before running flows on a client. To do this, call the request_approval method, providing a reason for the approval and a list of approvers.
In [0]:
client.request_approval(approvers=['admin'], reason='Test reason')
This method does not wait until the approval is granted. If you need to wait, use the request_approval_and_wait method, which has the same signature.
To set the flow timeout, use the set_flow_timeout function. The default value is 30 seconds, and 0 means exit immediately after the flow starts. You can also disable the timeout or reset it to the default value of 30 seconds.
In [0]:
# Wait forever
grr_colab.set_no_flow_timeout()
# Exit immediately
grr_colab.set_flow_timeout(0)
# Wait for one minute
grr_colab.set_flow_timeout(60)
# Wait for 30 seconds
grr_colab.set_default_flow_timeout()
Below are examples of flows that you can run.
Interrogating a client:
In [0]:
summary = client.interrogate()
summary.system_info.system
Out[0]:
Listing processes on a client:
In [0]:
ps = client.ps()
ps[:1]
Out[0]:
In [0]:
ps[0]
Out[0]:
In [0]:
ps[0].exe
Out[0]:
Listing files in a directory. Here you need to provide the absolute path to the directory because the Python API keeps no working-directory state.
In [0]:
files = client.ls('/tmp/foo/baz')
files
Out[0]:
In [0]:
for f in files:
    print(f.pathspec.path)
Recursive listing of a directory is also possible. To do this specify the max depth of the recursion.
In [0]:
files = client.ls('/tmp/foo', max_depth=3)
files
Out[0]:
In [0]:
for f in files:
    print(f.pathspec.path)
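The effect of a depth limit can be tried locally with a depth-limited directory walk. The sketch below runs against a temporary directory on the local machine, not against a GRR client; list_tree is a hypothetical helper, and the exact way GRR counts depth may differ from the convention used here (depth 1 means the directory's direct entries only).

```python
import os
import tempfile


def list_tree(path, max_depth=1):
    """List entries under `path`, descending at most `max_depth` levels."""
    if max_depth < 1:
        return []
    found = []
    # Sort by name so the output order is deterministic.
    for entry in sorted(os.scandir(path), key=lambda e: e.name):
        found.append(entry.path)
        if entry.is_dir(follow_symlinks=False):
            found.extend(list_tree(entry.path, max_depth - 1))
    return found


with tempfile.TemporaryDirectory() as root:
    # Build root/dir1/dir2/file1 to have something to list.
    os.makedirs(os.path.join(root, "dir1", "dir2"))
    open(os.path.join(root, "dir1", "dir2", "file1"), "w").close()
    for depth in (1, 3):
        print(depth, [os.path.relpath(p, root) for p in list_tree(root, depth)])
```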
Globbing files:
In [0]:
files = client.glob('/tmp/foo/baz/file*')
files
Out[0]:
Grepping files with regular expressions:
In [0]:
matches = client.grep(path='/tmp/foo/baz/file*', pattern=b'line')
matches
Out[0]:
In [0]:
for match in matches:
    print(match.pathspec.path, match.offset, match.data)
In [0]:
matches = client.grep(path='/tmp/foo/baz/file*', pattern=b'\x6c\x69\x6e\x65')
matches
Out[0]:
Grepping files by exact match:
In [0]:
matches = client.fgrep(path='/tmp/foo/baz/file*', literal=b'line')
matches
Out[0]:
Downloading files:
In [0]:
client.wget('/tmp/foo/baz/file1')
Out[0]:
Osquerying a client:
In [0]:
table = client.osquery('SELECT pid, name, nice FROM processes WHERE pid < 5')
table
Out[0]:
In [0]:
header = ' '.join(str(col.name).rjust(10) for col in table.header.columns)
print(header)
print('-' * len(header))
for row in table.rows:
    print(' '.join(value.rjust(10) for value in row.values))
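If you prefer records over a printed table, the header and rows can be zipped into dictionaries. The sketch below uses SimpleNamespace stand-ins that only mimic the attribute shape used above (table.header.columns, col.name, table.rows, row.values); the sample values are invented, and a real osquery result object may carry more fields.

```python
from types import SimpleNamespace

# Stand-in for an osquery result; the values are invented for illustration.
table = SimpleNamespace(
    header=SimpleNamespace(
        columns=[SimpleNamespace(name="pid"), SimpleNamespace(name="name")]
    ),
    rows=[
        SimpleNamespace(values=["1", "init"]),
        SimpleNamespace(values=["2", "kthreadd"]),
    ],
)

# Pair each column name with the value at the same position in every row.
names = [col.name for col in table.header.columns]
records = [dict(zip(names, row.values)) for row in table.rows]
print(records)
```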
Listing artifacts:
In [0]:
artifacts = grr_colab.list_artifacts()
artifacts[0]
Out[0]:
To collect an artifact you just need to provide its name:
In [0]:
client.collect('DebianVersion')
Out[0]:
Running YARA:
In [0]:
import os
pid = os.getpid()
data = "dadasdasdasdjaskdakdaskdakjdkjadkjakjjdsgkngksfkjadsjnfandankjd"
rule = 'rule TextExample {{ strings: $text_string = "{data}" condition: $text_string }}'.format(data=data)
matches = client.yara(rule, pids=[pid])
print(matches[0].process.pid, matches[0].process.name)
You can read and seek files, interacting with them as you would with ordinary Python files.
In [0]:
with client.open('/tmp/foo/baz/file1') as f:
    print(f.readline())
In [0]:
with client.open('/tmp/foo/baz/file1') as f:
    for line in f:
        print(line)
In [0]:
with client.open('/tmp/foo/baz/file1') as f:
    print(f.read(22))
    f.seek(0)
    print(f.read(22))
    print(f.read())
To fetch server-cached data, use the cached property of a client object.
You can list files in a directory (also recursively) and read and download files as above:
In [0]:
files = client.cached.ls('/tmp/foo/baz')
files
Out[0]:
In [0]:
files = client.cached.ls('/tmp/foo/baz', max_depth=2)
files
Out[0]:
In [0]:
with client.cached.open('/tmp/foo/baz/file1') as f:
    for line in f:
        print(line)
In [0]:
client.cached.wget('/tmp/foo/baz/file1')
Out[0]:
You can also refresh the filesystem metadata cached on the server by calling the refresh method (this refreshes the contents of the directory but not its subdirectories):
In [0]:
client.cached.refresh('/tmp/foo/baz')
To refresh a directory recursively, specify the max_depth parameter:
In [0]:
client.cached.refresh('/tmp/foo/baz', max_depth=2)
In [0]:
### Path types
To specify the path type, use one of the client properties: client.os (the same as using client directly), client.tsk, or client.registry.
In [0]:
client.os.ls('/tmp/foo')
Out[0]:
In [0]:
client.os.cached.ls('/tmp/foo')
Out[0]: