GRR Colab


In [0]:
%load_ext grr_colab.ipython_extension

In [0]:
import grr_colab

Specifying GRR Colab flags:


In [0]:
grr_colab.flags.FLAGS.set_default('grr_http_api_endpoint', 'http://localhost:8000/')
grr_colab.flags.FLAGS.set_default('grr_admin_ui_url', 'http://localhost:8000/')
grr_colab.flags.FLAGS.set_default('grr_auth_api_user', 'admin')
grr_colab.flags.FLAGS.set_default('grr_auth_password', 'admin')

Magics API

GRR magics let you search for clients and then choose a single client to work with. Magics return their results as pandas dataframes unless the result is a primitive value.

Searching clients

You can search for clients by username, hostname, client labels, and so on. The results are sorted by the last seen column.


In [0]:
df = %grr_search_clients -u admin
df[['online', 'online.pretty', 'client_id', 'last_seen_ago', 'last_seen_at.pretty']]


Out[0]:
online online.pretty client_id last_seen_ago last_seen_at.pretty
0 online 🌕 C.dc3782aeab2c5b4c 0 seconds ago 2019-08-30 09:53:28.039821

There is a shortcut for searching for online clients only, so that you don't need to filter the dataframe yourself.


In [0]:
df = %grr_search_online_clients -u admin
df[['online', 'online.pretty', 'client_id', 'last_seen_ago', 'last_seen_at.pretty']]


Out[0]:
online online.pretty client_id last_seen_ago last_seen_at.pretty
0 online 🌕 C.dc3782aeab2c5b4c 0 seconds ago 2019-08-30 09:53:38.331647

Every datetime field has two representations: the original one, which is a number of microseconds, and the pretty one, which is a pandas timestamp.


In [0]:
df[['last_seen_at', 'last_seen_at.pretty']]


Out[0]:
last_seen_at last_seen_at.pretty
0 1567158818331647 2019-08-30 09:53:38.331647
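The two representations are related by a simple conversion. As a minimal sketch, assuming the raw value counts microseconds since the Unix epoch in UTC (which is consistent with the output above), the pretty value can be reproduced with the standard library alone. The helper name micros_to_datetime is made up for illustration and is not part of GRR Colab.

```python
import datetime


def micros_to_datetime(micros):
    """Converts microseconds since the Unix epoch to a naive UTC datetime."""
    epoch = datetime.datetime(1970, 1, 1)
    # Integer timedelta arithmetic keeps full microsecond precision,
    # unlike dividing by 1e6 and going through a float.
    return epoch + datetime.timedelta(microseconds=micros)


last_seen = micros_to_datetime(1567158818331647)
print(last_seen)  # 2019-08-30 09:53:38.331647
```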

Setting the current client

To work with a client, you first need to select it. This means that magic commands can work with only one client at a time (the Python API has no such restriction). To set a client you need either a hostname (this works only if exactly one client is set up for that hostname) or a client ID, which you can get from the search results dataframe.


In [0]:
client_id = df['client_id'][0]
%grr_set_client -c {client_id}

%grr_id


Out[0]:
'C.dc3782aeab2c5b4c'

An attempt to set a client with a hostname that has multiple clients will lead to an exception.

Requesting approvals

If you don't have a valid approval for the selected client, you will get an error when you attempt to run a flow on it. You can request an approval with a magic command, specifying the reason and the list of approvers.


In [0]:
%grr_request_approval -r "For testing" -a admin

This function will not wait until the approval is granted. If you need your code to wait until it's granted, use grr_request_approval_and_wait instead.

Exploring filesystem

In addition to the selected client, the working directory is also saved, so you can use relative paths instead of absolute ones. Note that the existence of directories is not checked: you will not get an error if you cd into a directory that does not exist.

Initially you are in the root directory.


In [0]:
%grr_pwd


Out[0]:
'/'

In [0]:
%grr_cd tmp/foo/bar
%grr_pwd


Out[0]:
'/tmp/foo/bar'

In [0]:
%grr_cd ../baz
%grr_pwd


Out[0]:
'/tmp/foo/baz'
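The cd examples above behave like purely lexical path resolution, which also explains why a missing directory does not raise an error. A minimal sketch of that behavior, assuming nothing about GRR Colab's internals (the resolve helper is hypothetical):

```python
import posixpath


def resolve(cwd, path):
    """Resolves `path` against `cwd` without touching any filesystem."""
    # join() discards `cwd` when `path` is absolute; normpath() collapses '..'.
    return posixpath.normpath(posixpath.join(cwd, path))


cwd = '/'
cwd = resolve(cwd, 'tmp/foo/bar')
print(cwd)  # /tmp/foo/bar
cwd = resolve(cwd, '../baz')
print(cwd)  # /tmp/foo/baz
```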

You can ls the current directory and any other directories specified by relative and absolute paths.

Note: Most file-related magics start flows and fetch live data from the client, so the client has to be online for them to work.


In [0]:
df = %grr_ls
df


Out[0]:
st_mode st_mode.pretty st_ino st_dev st_nlink st_uid st_gid st_size st_atime st_mtime st_ctime st_blocks st_blksize st_rdev pathspec.pathtype pathspec.path pathspec.path_options st_flags_osx st_flags_linux
0 16877 drwxr-xr-x 17696532 65025 2 585945 89939 4096 1567157599 1567157599 1567157599 8 4096 0 OS /tmp/foo/baz/dir1 CASE_LITERAL 0 524288
1 16877 drwxr-xr-x 17832583 65025 3 585945 89939 4096 1567157734 1567157599 1567157599 8 4096 0 OS /tmp/foo/baz/dir2 CASE_LITERAL 0 524288
2 33188 -rw-r--r-- 17696534 65025 1 585945 89939 70 1567158029 1567157649 1567157649 8 4096 0 OS /tmp/foo/baz/file1 CASE_LITERAL 0 524288
3 33188 -rw-r--r-- 17696533 65025 1 585945 89939 23 1567158209 1567157627 1567157627 8 4096 0 OS /tmp/foo/baz/file2 CASE_LITERAL 0 524288

Stat mode has two representations: number and UNIX-style:


In [0]:
df[['st_mode', 'st_mode.pretty']]


Out[0]:
st_mode st_mode.pretty
0 16877 drwxr-xr-x
1 16877 drwxr-xr-x
2 33188 -rw-r--r--
3 33188 -rw-r--r--
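The numeric and the UNIX-style values carry the same information. As an illustration that is independent of GRR Colab, Python's standard stat module can turn the numbers from the output above into the same strings:

```python
import stat

for mode in (16877, 33188):
    # oct() exposes the permission bits; filemode() renders the ls-style string.
    print(mode, oct(mode), stat.filemode(mode))

# 16877 0o40755 drwxr-xr-x
# 33188 0o100644 -rw-r--r--
```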

In [0]:
%grr_ls ../baz/dir2


Out[0]:
st_mode st_mode.pretty st_ino st_dev st_nlink st_uid st_gid st_size st_atime st_mtime st_ctime st_blocks st_blksize st_rdev pathspec.pathtype pathspec.path pathspec.path_options st_flags_osx st_flags_linux
0 16877 drwxr-xr-x 17835392 65025 2 585945 89939 4096 1567157599 1567157599 1567157599 8 4096 0 OS /tmp/foo/baz/dir2/dir3 CASE_LITERAL 0 524288

In [0]:
%grr_ls /tmp/foo


Out[0]:
st_mode st_mode.pretty st_ino st_dev st_nlink st_uid st_gid st_size st_atime st_mtime st_ctime st_blocks st_blksize st_rdev pathspec.pathtype pathspec.path pathspec.path_options st_flags_osx st_flags_linux
0 16877 drwxr-xr-x 17567410 65025 2 585945 89939 4096 1567157544 1567157544 1567157544 8 4096 0 OS /tmp/foo/bar CASE_LITERAL 0 524288
1 16877 drwxr-xr-x 17695802 65025 4 585945 89939 4096 1567157664 1567157631 1567157631 8 4096 0 OS /tmp/foo/baz CASE_LITERAL 0 524288

To see the metadata of a file, call the grr_stat magic.


In [0]:
%grr_stat file1


Out[0]:
st_mode st_mode.pretty st_ino st_dev st_nlink st_uid st_gid st_size st_atime st_mtime st_ctime st_blocks st_blksize st_rdev pathspec.pathtype pathspec.path pathspec.path_options st_flags_osx st_flags_linux
0 33188 -rw-r--r-- 17696534 65025 1 585945 89939 70 1567158029 1567157649 1567157649 8 4096 0 OS /tmp/foo/baz/file1 CASE_LITERAL 0 524288

You can use globbing for stat:


In [0]:
%grr_stat "file*"


Out[0]:
st_mode st_mode.pretty st_ino st_dev st_nlink st_uid st_gid st_size st_atime st_mtime st_ctime st_blocks st_blksize st_rdev pathspec.pathtype pathspec.path pathspec.path_options st_flags_osx st_flags_linux
0 33188 -rw-r--r-- 17696534 65025 1 585945 89939 70 1567158029 1567157649 1567157649 8 4096 0 OS /tmp/foo/baz/file1 CASE_LITERAL 0 524288
1 33188 -rw-r--r-- 17696533 65025 1 585945 89939 23 1567158209 1567157627 1567157627 8 4096 0 OS /tmp/foo/baz/file2 CASE_LITERAL 0 524288
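For readers unfamiliar with wildcards, the pattern above follows the usual shell convention in which * matches any run of characters. The sketch below only illustrates that convention locally with Python's fnmatch module; the glob syntax that GRR evaluates on the client may support more than this.

```python
import fnmatch

# Names taken from the directory listing of /tmp/foo/baz shown earlier.
names = ['dir1', 'dir2', 'file1', 'file2']
matched = fnmatch.filter(names, 'file*')
print(matched)  # ['file1', 'file2']
```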

You can print the first bytes of a file:


In [0]:
%grr_head file1 -c 30


Out[0]:
b'This is the first line\nThis is'

Although the original bash head command has no offset option, you can specify an offset in grr_head:


In [0]:
%grr_head file1 -c 30 -o 20


Out[0]:
b'ne\nThis is the second line\nThi'
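In terms of plain Python bytes, the count and offset options correspond to a slice. This is only a local illustration using the contents of file1 as shown elsewhere in this notebook; the head helper is hypothetical and no client is contacted.

```python
content = (b'This is the first line\n'
           b'This is the second line\n'
           b'This is the third LINE\n')


def head(data, count, offset=0):
    """Returns `count` bytes of `data` starting at byte `offset`."""
    return data[offset:offset + count]


print(head(content, 30))             # b'This is the first line\nThis is'
print(head(content, 30, offset=20))  # b'ne\nThis is the second line\nThi'
```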

Some magics, such as grr_head and grr_ls, have a --cached option (-C for short) that tells them not to call the client at all. In this case the data is fetched from the server's cache. The cache is updated only during calls to the client, so it is not always up to date, but accessing it is much faster.


In [0]:
%grr_ls /tmp/foo/baz -C


Out[0]:
st_mode st_mode.pretty st_ino st_dev st_nlink st_uid st_gid st_size st_atime st_mtime st_ctime st_blocks st_blksize st_rdev pathspec.pathtype pathspec.path pathspec.path_options st_flags_osx st_flags_linux
0 16877 drwxr-xr-x 17696532 65025 2 585945 89939 4096 1567157599 1567157599 1567157599 8 4096 0 OS /tmp/foo/baz/dir1 CASE_LITERAL 0 524288
1 16877 drwxr-xr-x 17832583 65025 3 585945 89939 4096 1567157734 1567157599 1567157599 8 4096 0 OS /tmp/foo/baz/dir2 CASE_LITERAL 0 524288
2 33188 -rw-r--r-- 17696534 65025 1 585945 89939 70 1567158029 1567157649 1567157649 8 4096 0 OS /tmp/foo/baz/file1 CASE_LITERAL 0 524288
3 33188 -rw-r--r-- 17696533 65025 1 585945 89939 23 1567158209 1567157627 1567157627 8 4096 0 OS /tmp/foo/baz/file2 CASE_LITERAL 0 524288

In [0]:
%grr_head file1 -C


Out[0]:
b'This is the first line\nThis is the second line\nThis is the third LINE\n'

Grepping files is also possible. The --fixed-string option (-F for short) indicates that the pattern to search for is not a regular expression. The --hex-string option (-X for short) lets you pass the pattern as a hex string.


In [0]:
%grr_grep "line" file1


Out[0]:
offset length data data.pretty pathspec.pathtype pathspec.path pathspec.path_options
0 18 4 b'line' b'line' OS /tmp/foo/baz/file1 CASE_LITERAL
1 42 4 b'line' b'line' OS /tmp/foo/baz/file1 CASE_LITERAL
2 65 4 b'LINE' b'LINE' OS /tmp/foo/baz/file1 CASE_LITERAL

In [0]:
%grr_grep -F "line" file1


Out[0]:
offset length data data.pretty pathspec.pathtype pathspec.path pathspec.path_options
0 18 4 b'line' b'line' OS /tmp/foo/baz/file1 CASE_LITERAL
1 42 4 b'line' b'line' OS /tmp/foo/baz/file1 CASE_LITERAL

In [0]:
%grr_grep -X "6c696e65" file1


Out[0]:
offset length data data.pretty pathspec.pathtype pathspec.path pathspec.path_options
0 18 4 b'line' b'line' OS /tmp/foo/baz/file1 CASE_LITERAL
1 42 4 b'line' b'line' OS /tmp/foo/baz/file1 CASE_LITERAL
2 65 4 b'LINE' b'LINE' OS /tmp/foo/baz/file1 CASE_LITERAL
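The difference between the three outputs above can be reproduced locally. The sketch below is an illustration, not GRR's implementation: it assumes the regular expression search is case-insensitive (which would explain the LINE match) and treats a fixed string as an exact byte comparison. The find_all helper is hypothetical.

```python
import re

content = (b'This is the first line\n'
           b'This is the second line\n'
           b'This is the third LINE\n')

# Regex search; IGNORECASE is an assumption made to mirror the 'LINE' match.
regex_offsets = [m.start() for m in re.finditer(rb'line', content, re.IGNORECASE)]


def find_all(data, literal):
    """Returns the offsets of every exact occurrence of `literal` in `data`."""
    offsets = []
    start = data.find(literal)
    while start != -1:
        offsets.append(start)
        start = data.find(literal, start + 1)
    return offsets


literal_offsets = find_all(content, b'line')

# A hex string is just another spelling of the same bytes.
hex_pattern = bytes.fromhex('6c696e65')

print(regex_offsets)    # [18, 42, 65]
print(literal_offsets)  # [18, 42]
print(hex_pattern)      # b'line'
```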

The grr_fgrep magic is a shortcut for the --fixed-strings option. Globbing is also available here.


In [0]:
%grr_fgrep "line" "file*"


Out[0]:
offset length data data.pretty pathspec.pathtype pathspec.path pathspec.path_options
0 18 4 b'line' b'line' OS /tmp/foo/baz/file1 CASE_LITERAL
1 42 4 b'line' b'line' OS /tmp/foo/baz/file1 CASE_LITERAL
2 18 4 b'line' b'line' OS /tmp/foo/baz/file2 CASE_LITERAL

In [0]:
%grr_fgrep -X "6c696e65" file1


Out[0]:
offset length data data.pretty pathspec.pathtype pathspec.path pathspec.path_options
0 18 4 b'line' b'line' OS /tmp/foo/baz/file1 CASE_LITERAL
1 42 4 b'line' b'line' OS /tmp/foo/baz/file1 CASE_LITERAL

If a file is too large to print and you would like to download it instead, use grr_wget, which returns a download link:


In [0]:
%grr_wget file1


Out[0]:
'http://localhost:8000//api/clients/C.dc3782aeab2c5b4c/vfs-blob/fs/os/tmp/foo/baz/file1'

You can also download a cached version:


In [0]:
%grr_wget file1 -C


Out[0]:
'http://localhost:8000//api/clients/C.dc3782aeab2c5b4c/vfs-blob/fs/os/tmp/foo/baz/file1'

You can specify the path type with the --path-type flag (-P for short) for all filesystem-related magics. The available values are os (default), tsk, and registry.


In [0]:
%grr_ls -P os -C


Out[0]:
st_mode st_mode.pretty st_ino st_dev st_nlink st_uid st_gid st_size st_atime st_mtime st_ctime st_blocks st_blksize st_rdev pathspec.pathtype pathspec.path pathspec.path_options st_flags_osx st_flags_linux
0 16877 drwxr-xr-x 17696532 65025 2 585945 89939 4096 1567157599 1567157599 1567157599 8 4096 0 OS /tmp/foo/baz/dir1 CASE_LITERAL 0 524288
1 16877 drwxr-xr-x 17832583 65025 3 585945 89939 4096 1567157734 1567157599 1567157599 8 4096 0 OS /tmp/foo/baz/dir2 CASE_LITERAL 0 524288
2 33188 -rw-r--r-- 17696534 65025 1 585945 89939 70 1567158029 1567157649 1567157649 8 4096 0 OS /tmp/foo/baz/file1 CASE_LITERAL 0 524288
3 33188 -rw-r--r-- 17696533 65025 1 585945 89939 23 1567158209 1567157627 1567157627 8 4096 0 OS /tmp/foo/baz/file2 CASE_LITERAL 0 524288

System information

For simplicity, the magics are named after the corresponding bash commands.

Printing hostname of the client:


In [0]:
%grr_hostname

Getting network interfaces info:


In [0]:
ifaces = %grr_ifconfig

MAC address fields also have two columns: the original one, which holds raw bytes and is hard to read, and the pretty one, which holds the usual string representation of the MAC address.


In [0]:
ifaces[['mac_address', 'mac_address.pretty']][1:]


Out[0]:
mac_address mac_address.pretty
1 b'\x00\x00\x00\x00\x00\x00' 00:00:00:00:00:00
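The pretty column is the conventional colon-separated hex form of the raw bytes. A minimal sketch of that formatting follows; the format_mac helper and the second sample address are made up for illustration.

```python
def format_mac(raw):
    """Formats raw MAC address bytes as colon-separated lowercase hex."""
    return ':'.join('{:02x}'.format(byte) for byte in raw)


print(format_mac(b'\x00\x00\x00\x00\x00\x00'))  # 00:00:00:00:00:00
print(format_mac(b'\xde\xad\xbe\xef\x00\x01'))  # de:ad:be:ef:00:01
```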

If a field contains a collection, the cell in the dataframe is represented as another dataframe. IP address fields also have two representations.


In [0]:
ifaces['addresses'][1]


Out[0]:
address_type packed_bytes packed_bytes.pretty
0 INET b'\x7f\x00\x00\x01' 127.0.0.1
1 INET6 b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00... ::1
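Packed address bytes like these can be decoded with Python's standard ipaddress module, independently of GRR Colab. In this sketch the IPv6 value is written out as the packed form of the loopback address ::1 (fifteen zero bytes followed by 0x01), since the dataframe output above truncates it.

```python
import ipaddress

packed_v4 = b'\x7f\x00\x00\x01'
packed_v6 = b'\x00' * 15 + b'\x01'  # packed form of the IPv6 loopback address

# ip_address() accepts 4-byte (IPv4) and 16-byte (IPv6) packed values.
pretty_v4 = str(ipaddress.ip_address(packed_v4))
pretty_v6 = str(ipaddress.ip_address(packed_v6))

print(pretty_v4)  # 127.0.0.1
print(pretty_v6)  # ::1
```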

For the uname command only two options are available: --machine, which prints the machine architecture, and --kernel-release, which prints the kernel release.


In [0]:
%grr_uname -m


Out[0]:
'x86_64'

In [0]:
%grr_uname -r


Out[0]:
'4.19.37-5rodete4-amd64'

To get the client summary, run the interrogate flow.


In [0]:
df = %grr_interrogate
df[['client_id', 'system_info.system', 'system_info.machine']]


Out[0]:
client_id system_info.system system_info.machine
0 aff4:/C.dc3782aeab2c5b4c Linux x86_64

It is also possible to get information about the processes running on the client machine:


In [0]:
ps = %grr_ps
ps[:5]


Out[0]:
pid ppid name exe cmdline ctime real_uid effective_uid saved_uid real_gid ... status nice cwd num_threads user_cpu_time system_cpu_time RSS_size VMS_size memory_percent connections
0 1 0 systemd /usr/lib/systemd/systemd 0 0 /lib/systemd/system... 1565017014530000 0 0 0 0 ... sleeping 0 / 1 78.779999 53.02000 9670656 230248448 0.014377 NaN
1 520 1 lvmetad /usr/sbin/lvmetad 0 0 /sbin/lvmetad 1 ... 1565017041170000 0 0 0 0 ... sleeping 0 / 1 0.050000 0.05000 1937408 108138496 0.002880 NaN
2 759 1 rpc.svcgssd /usr/sbin/rpc.svcgssd 0 0 /usr/sbin/rpc.svcgssd 1565017041590000 0 0 0 0 ... sleeping 0 / 1 0.000000 0.00000 3215360 31694848 0.004780 NaN
3 760 1 rpc.gssd /usr/sbin/rpc.gssd 0 0 /usr/sbin/rpc.gssd 1 ... 1565017041600000 0 0 0 0 ... sleeping 0 /run/rpc_pipefs 1 0.000000 0.00000 299008 27766784 0.000445 NaN
4 848 1 mgagentxp_script_runner.par /usr/bin/mgagentxp_script_runner.par ... 1565017042310000 65534 65534 65534 1001 ... sleeping 0 / 5 424.779999 490.51001 25403392 1131827200 0.037767 NaN

5 rows × 24 columns

To fetch some system information you can also use osquery. Osquery tables are also converted to dataframes.


In [0]:
%grr_osqueryi "SELECT pid, name, cmdline, state, nice, threads FROM processes WHERE pid >= 440 and pid < 600;"


Out[0]:
cmdline name nice pid state threads
0 kworker/4:1H-kblockd -20 500 I 1
1 rpciod -20 505 I 1
2 xprtiod -20 506 I 1
3 /sbin/lvmetad -f lvmetad 0 520 S 1

Running YARA for scanning processes is also available.


In [0]:
import os 

pid = os.getpid()
data = "dadasdasdasdjaskdakdaskdakjdkjadkjakjjdsgkngksfkjadsjnfandankjd"
rule = 'rule TextExample {{ strings: $text_string = "{data}" condition: $text_string }}'.format(data=data)

df = %grr_yara '{rule}' -p {pid}
df[['process.pid', 'process.name', 'process.exe']]


Out[0]:
process.pid process.name process.exe
0 63438 python3 /opt/python/3.7/bin/python3.7

Configuring flow timeout

The default flow timeout is 30 seconds. This is how long a function waits for a flow to complete. You can change the timeout with grr_set_flow_timeout, specifying the number of seconds to wait. For example, this sets the timeout to one minute:


In [0]:
%grr_set_flow_timeout 60

To tell functions to wait for the flows forever until they are completed:


In [0]:
%grr_set_no_flow_timeout

To set timeout to default value of 30 seconds:


In [0]:
%grr_set_default_flow_timeout

Setting timeout to 0 tells functions not to wait at all and exit immediately after the flow starts.


In [0]:
%grr_set_flow_timeout 0

If the timeout is exceeded (or you set the timeout to 0), you will see an error with a link to the Admin UI.
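The three timeout modes described in this section can be summarized as a small polling loop. This is a sketch of the semantics only, not GRR Colab's code: wait_for_flow, FlowTimeoutError, and the polling interval are hypothetical names and values chosen for illustration.

```python
import time


class FlowTimeoutError(Exception):
    """Hypothetical error type; GRR Colab's real exception may differ."""


def wait_for_flow(is_done, timeout=30.0, poll_interval=0.01):
    """Blocks until is_done() returns True or the timeout expires.

    timeout=None waits forever; timeout=0 does not wait at all.
    """
    if timeout == 0:
        raise FlowTimeoutError('flow started, not waiting for it')
    deadline = None if timeout is None else time.monotonic() + timeout
    while not is_done():
        if deadline is not None and time.monotonic() >= deadline:
            raise FlowTimeoutError('flow did not finish in %s s' % timeout)
        time.sleep(poll_interval)


# A fake flow that reports completion on the third poll.
polls = iter([False, False, True])
wait_for_flow(lambda: next(polls), timeout=None)
print('flow finished')
```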

Collecting artifacts

You can first list all the artifacts that you can collect:


In [0]:
df = %grr_list_artifacts
df[:2]


Out[0]:
artifact.name artifact.doc artifact.supported_os artifact.labels artifact.urls artifact.sources is_custom error_message dependencies artifact.provides path_dependencies processors artifact.conditions
0 APTSources APT package sources list 0 0 Linux 0 0 Configuration Files ... ... type at... False NaN NaN NaN NaN NaN
1 APTTrustKeys APT trusted keys 0 0 Linux 0 0 Configuration Files ... 0 0 https:... type at... False NaN NaN NaN NaN NaN

To collect an artifact you just need to provide its name:


In [0]:
%grr_collect "DebianVersion"


Out[0]:
st_mode st_mode.pretty st_ino st_dev st_nlink st_uid st_gid st_size st_atime st_mtime st_ctime st_blocks st_blksize st_rdev pathspec.pathtype pathspec.path pathspec.path_options st_flags_osx st_flags_linux
0 33188 -rw-r--r-- 10094787 65025 1 0 0 7 1567107891 1559242439 1559242439 8 4096 0 OS /etc/debian_version CASE_LITERAL 0 524288

Python API

Getting a client

With the Python API you can work with multiple clients simultaneously. You don't need to select a client to work with; instead, you simply get a client object.

Use the search method to search for clients. You can specify the ip, mac, host, version, user, and labels search criteria. The result is a list of client objects, from which you can pick one to work with.


In [0]:
clients = grr_colab.Client.search(user='admin')
clients


Out[0]:
🌕 C.dc3782aeab2c5b4c @ admin.example.com (0 seconds ago)

In [0]:
clients[0].id


Out[0]:
'C.dc3782aeab2c5b4c'

If you know a client ID or a hostname (provided exactly one client is installed for that hostname), you can get a client object using one of these values:


In [0]:
client = grr_colab.Client.with_id('C.dc3782aeab2c5b4c')

Client properties

There are a number of simple client properties that provide information about the client. Unlike the magics API, this API returns objects rather than dataframes for non-primitive values.

Getting the client ID:


In [0]:
client.id


Out[0]:
'C.dc3782aeab2c5b4c'

Getting the client hostname:


In [0]:
client.hostname

Getting network interfaces info:


In [0]:
client.ifaces[1:]


Out[0]:
lo (MAC: 00:00:00:00:00:00):
    inet 127.0.0.1
    inet6 ::1

In [0]:
client.ifaces[1].ifname


Out[0]:
'lo'

This is a collection of interface objects so you can iterate over it and access interface object fields:


In [0]:
for iface in client.ifaces:
  print(iface.ifname)


enp0s31f6
lo

Getting the knowledge base for the client and accessing one of its fields:


In [0]:
client.knowledgebase
client.knowledgebase.os_release


Out[0]:
'Debian GNU/Linux'

Getting the architecture of the machine that the client runs on:


In [0]:
client.arch


Out[0]:
'x86_64'

Getting kernel version string:


In [0]:
client.kernel


Out[0]:
'4.19.37-5rodete4-amd64'

Getting a list of labels that are associated with this client:


In [0]:
client.labels


Out[0]:
[]

First seen and last seen times are stored as datetime objects:


In [0]:
client.first_seen


Out[0]:
datetime.datetime(2019, 8, 15, 11, 34, 17, 656692)

In [0]:
client.last_seen


Out[0]:
datetime.datetime(2019, 8, 30, 10, 5, 49, 102492)

Requesting approvals

As with the magics API, you need to request an approval before running flows on a client. To do this, call the request_approval method, providing a reason for the approval and a list of approvers.


In [0]:
client.request_approval(approvers=['admin'], reason='Test reason')

This method does not wait until the approval is granted. If you need to wait, use request_approval_and_wait method that has the same signature.

Running flows

To set the flow timeout, use the set_flow_timeout function. The default value is 30 seconds, and 0 means exit immediately after the flow has started. You can also disable the timeout entirely or reset it to the default value of 30 seconds.


In [0]:
# Wait forever
grr_colab.set_no_flow_timeout()

# Exit immediately
grr_colab.set_flow_timeout(0)

# Wait for one minute
grr_colab.set_flow_timeout(60)

# Wait for 30 seconds (the default)
grr_colab.set_default_flow_timeout()

Below are examples of flows that you can run.

Interrogating a client:


In [0]:
summary = client.interrogate()
summary.system_info.system


Out[0]:
'Linux'

Listing processes on a client:


In [0]:
ps = client.ps()
ps[:1]


Out[0]:
   PID USER       NI  VIRT   RES S CPU% MEM% Command
     1 root        0  220M    9M S  0.0  0.0 /usr/lib/systemd/systemd

In [0]:
ps[0]


Out[0]:
     1 root        0  220M    9M S  0.0  0.0 /usr/lib/systemd/systemd

In [0]:
ps[0].exe


Out[0]:
'/usr/lib/systemd/systemd'

Listing files in a directory. Here you need to provide the absolute path to the directory because the Python API keeps no working-directory state.


In [0]:
files = client.ls('/tmp/foo/baz')
files


Out[0]:
/tmp/foo/baz
    📂 dir1 (drwxr-xr-x /tmp/foo/baz/dir1, 4.0 KiB)
    📂 dir2 (drwxr-xr-x /tmp/foo/baz/dir2, 4.0 KiB)
    📄 file1 (-rw-r--r-- /tmp/foo/baz/file1, 70 Bytes)
    📄 file2 (-rw-r--r-- /tmp/foo/baz/file2, 23 Bytes)

In [0]:
for f in files:
  print(f.pathspec.path)


/tmp/foo/baz/dir1
/tmp/foo/baz/dir2
/tmp/foo/baz/file1
/tmp/foo/baz/file2

Recursive listing of a directory is also possible. To do this specify the max depth of the recursion.


In [0]:
files = client.ls('/tmp/foo', max_depth=3)
files


Out[0]:
/tmp/foo
    📂 bar (drwxr-xr-x /tmp/foo/bar, 4.0 KiB)
    📂 baz (drwxr-xr-x /tmp/foo/baz, 4.0 KiB)
        📂 dir1 (drwxr-xr-x /tmp/foo/baz/dir1, 4.0 KiB)
        📂 dir2 (drwxr-xr-x /tmp/foo/baz/dir2, 4.0 KiB)
            📂 dir3 (drwxr-xr-x /tmp/foo/baz/dir2/dir3, 4.0 KiB)
        📄 file1 (-rw-r--r-- /tmp/foo/baz/file1, 70 Bytes)
        📄 file2 (-rw-r--r-- /tmp/foo/baz/file2, 23 Bytes)

In [0]:
for f in files:
  print(f.pathspec.path)


/tmp/foo/bar
/tmp/foo/baz
/tmp/foo/baz/dir1
/tmp/foo/baz/dir2
/tmp/foo/baz/file1
/tmp/foo/baz/file2
/tmp/foo/baz/dir2/dir3
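The order of the paths above (all entries of one level before any entry of the next) matches a breadth-first traversal limited by depth. The sketch below reproduces that order on a hypothetical in-memory tree standing in for the client's filesystem; it is an illustration under that assumption, not GRR's implementation, and the ls helper here is not the client method.

```python
import collections

# Hypothetical stand-in for the client's filesystem: directory -> child names.
TREE = {
    '/tmp/foo': ['bar', 'baz'],
    '/tmp/foo/bar': [],
    '/tmp/foo/baz': ['dir1', 'dir2', 'file1', 'file2'],
    '/tmp/foo/baz/dir1': [],
    '/tmp/foo/baz/dir2': ['dir3'],
    '/tmp/foo/baz/dir2/dir3': [],
}


def ls(path, max_depth=1):
    """Lists `path` breadth-first, descending at most `max_depth` levels."""
    result = []
    queue = collections.deque([(path, 1)])
    while queue:
        directory, depth = queue.popleft()
        for name in TREE.get(directory, []):
            child = directory + '/' + name
            result.append(child)
            # Only directories (keys of TREE) are descended into.
            if child in TREE and depth < max_depth:
                queue.append((child, depth + 1))
    return result


for entry in ls('/tmp/foo', max_depth=3):
    print(entry)
```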

Globbing files:


In [0]:
files = client.glob('/tmp/foo/baz/file*')
files


Out[0]:
/tmp/foo/baz
    📄 file1 (-rw-r--r-- /tmp/foo/baz/file1, 70 Bytes)
    📄 file2 (-rw-r--r-- /tmp/foo/baz/file2, 23 Bytes)

Grepping files with regular expressions:


In [0]:
matches = client.grep(path='/tmp/foo/baz/file*', pattern=b'line')
matches


Out[0]:
/tmp/foo/baz/file1:18-22: b'line'
/tmp/foo/baz/file1:42-46: b'line'
/tmp/foo/baz/file1:65-69: b'LINE'
/tmp/foo/baz/file2:18-22: b'line'

In [0]:
for match in matches:
  print(match.pathspec.path, match.offset, match.data)


/tmp/foo/baz/file1 18 b'line'
/tmp/foo/baz/file1 42 b'line'
/tmp/foo/baz/file1 65 b'LINE'
/tmp/foo/baz/file2 18 b'line'

In [0]:
matches = client.grep(path='/tmp/foo/baz/file*', pattern=b'\x6c\x69\x6e\x65')
matches


Out[0]:
/tmp/foo/baz/file1:18-22: b'line'
/tmp/foo/baz/file1:42-46: b'line'
/tmp/foo/baz/file1:65-69: b'LINE'
/tmp/foo/baz/file2:18-22: b'line'

Grepping files by exact match:


In [0]:
matches = client.fgrep(path='/tmp/foo/baz/file*', literal=b'line')
matches


Out[0]:
/tmp/foo/baz/file1:18-22: b'line'
/tmp/foo/baz/file1:42-46: b'line'
/tmp/foo/baz/file2:18-22: b'line'

Downloading files:


In [0]:
client.wget('/tmp/foo/baz/file1')


Out[0]:
'http://localhost:8000//api/clients/C.dc3782aeab2c5b4c/vfs-blob/fs/os/tmp/foo/baz/file1'

Osquerying a client:


In [0]:
table = client.osquery('SELECT pid, name, nice FROM processes WHERE pid < 5')
table


Out[0]:
         name nice pid
0     systemd    0   1
1    kthreadd    0   2
2      rcu_gp  -20   3
3  rcu_par_gp  -20   4

In [0]:
header = ' '.join(str(col.name).rjust(10) for col in table.header.columns)
print(header)
print('-' * len(header))
for row in table.rows:
  print(' '.join(value.rjust(10) for value in row.values))


      name       nice        pid
--------------------------------
   systemd          0          1
  kthreadd          0          2
    rcu_gp        -20          3
rcu_par_gp        -20          4

Listing artifacts:


In [0]:
artifacts = grr_colab.list_artifacts()
artifacts[0]


Out[0]:
artifact {
  name: "APTSources"
  doc: "APT package sources list"
  labels: "Configuration Files"
  labels: "System"
  supported_os: "Linux"
  urls: "http://manpages.ubuntu.com/manpages/trusty/en/man5/sources.list.5.html"
  sources {
    type: FILE
    attributes {
      dat {
        k {
          string: "paths"
        }
        v {
          list {
            content {
              string: "/etc/apt/sources.list"
            }
            content {
              string: "/etc/apt/sources.list.d/*.list"
            }
          }
        }
      }
    }
  }
}
is_custom: false
error_message: ""

To collect an artifact you just need to provide its name:


In [0]:
client.collect('DebianVersion')


Out[0]:
[📄 debian_version (-rw-r--r-- /etc/debian_version, 7 Bytes)]

Running YARA:


In [0]:
import os 

pid = os.getpid()
data = "dadasdasdasdjaskdakdaskdakjdkjadkjakjjdsgkngksfkjadsjnfandankjd"
rule = 'rule TextExample {{ strings: $text_string = "{data}" condition: $text_string }}'.format(data=data)

matches = client.yara(rule, pids=[pid])
print(matches[0].process.pid, matches[0].process.name)


63438 python3

Working with files

You can read and seek files, interacting with them as you would with ordinary Python files.


In [0]:
with client.open('/tmp/foo/baz/file1') as f:
  print(f.readline())


b'This is the first line\n'

In [0]:
with client.open('/tmp/foo/baz/file1') as f:
  for line in f:
    print(line)


b'This is the first line\n'
b'This is the second line\n'
b'This is the third LINE\n'

In [0]:
with client.open('/tmp/foo/baz/file1') as f:
  print(f.read(22))
  f.seek(0)
  print(f.read(22))
  print(f.read())


b'This is the first line'
b'This is the first line'
b'\nThis is the second line\nThis is the third LINE\n'
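If no client is at hand, the same read and seek pattern can be tried on a local in-memory binary file. The sketch below uses io.BytesIO as a stand-in for the object returned by client.open; it assumes only that both behave like ordinary binary file objects, as the examples above suggest.

```python
import io

content = (b'This is the first line\n'
           b'This is the second line\n'
           b'This is the third LINE\n')

with io.BytesIO(content) as f:
    first = f.read(22)   # read the first 22 bytes
    f.seek(0)            # rewind to the beginning
    again = f.read(22)   # the same 22 bytes again
    rest = f.read()      # everything that remains

print(first)  # b'This is the first line'
print(again)  # b'This is the first line'
print(rest)   # b'\nThis is the second line\nThis is the third LINE\n'
```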

Cached data

To fetch server cached data use cached property of a client object.

You can list files in a directory (also recursively) and read and download files as above:


In [0]:
files = client.cached.ls('/tmp/foo/baz')
files


Out[0]:
/tmp/foo/baz
    📂 dir1 (drwxr-xr-x /tmp/foo/baz/dir1, 4.0 KiB)
    📂 dir2 (drwxr-xr-x /tmp/foo/baz/dir2, 4.0 KiB)
    📄 file1 (-rw-r--r-- /tmp/foo/baz/file1, 70 Bytes)
    📄 file2 (-rw-r--r-- /tmp/foo/baz/file2, 23 Bytes)

In [0]:
files = client.cached.ls('/tmp/foo/baz', max_depth=2)
files


Out[0]:
/tmp/foo/baz
    📂 dir1 (drwxr-xr-x /tmp/foo/baz/dir1, 4.0 KiB)
    📂 dir2 (drwxr-xr-x /tmp/foo/baz/dir2, 4.0 KiB)
        📂 dir3 (drwxr-xr-x /tmp/foo/baz/dir2/dir3, 4.0 KiB)
    📄 file1 (-rw-r--r-- /tmp/foo/baz/file1, 70 Bytes)
    📄 file2 (-rw-r--r-- /tmp/foo/baz/file2, 23 Bytes)

In [0]:
with client.cached.open('/tmp/foo/baz/file1') as f:
  for line in f:
    print(line)


b'This is the first line\n'
b'This is the second line\n'
b'This is the third LINE\n'

In [0]:
client.cached.wget('/tmp/foo/baz/file1')


Out[0]:
'http://localhost:8000//api/clients/C.dc3782aeab2c5b4c/vfs-blob/fs/os/tmp/foo/baz/file1'

You can also refresh the filesystem metadata cached on the server by calling the refresh method. This refreshes the contents of the directory itself but not of its subdirectories:


In [0]:
client.cached.refresh('/tmp/foo/baz')

To refresh a directory recursively specify max_depth parameter:


In [0]:
client.cached.refresh('/tmp/foo/baz', max_depth=2)

Path types

To specify a path type, use one of the client properties: client.os (the same as using client directly), client.tsk, or client.registry.


In [0]:
client.os.ls('/tmp/foo')


Out[0]:
/tmp/foo
    📂 bar (drwxr-xr-x /tmp/foo/bar, 4.0 KiB)
    📂 baz (drwxr-xr-x /tmp/foo/baz, 4.0 KiB)

In [0]:
client.os.cached.ls('/tmp/foo')


Out[0]:
/tmp/foo
    📂 bar (drwxr-xr-x /tmp/foo/bar, 4.0 KiB)
    📂 baz (drwxr-xr-x /tmp/foo/baz, 4.0 KiB)