Context: The company's Security Information and Event Management (SIEM) system is raising DNS alerts from 'that' guy's computer. The queried domains look like they might be DGA (Domain Generation Algorithm) based, so we pull some PCAPs to quickly find out what's going on.
Note: This notebook was inspired by the data_hacking notebook called DriveBy PCAP Analysis. Here we're leveraging the workbench server and focusing on a specific case captured by ThreatGlass. The exploited website for this exercise is kitchenboss.com.au ThreatGlass_Info_for_Kitchenboss (arbitrarily chosen).
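Quick aside: if you haven't run into DGAs before, here's a toy sketch of the idea (purely illustrative, not any real malware family's algorithm). A date-seeded generator lets the malware and its operator independently compute the same throwaway rendezvous domains each day, which is what produces the gibberish hostnames you'll see later in this notebook.
# Toy DGA sketch -- purely illustrative, not any real malware's algorithm
import hashlib
import datetime

def toy_dga(seed_date, count=5, tld='.ru'):
    """Generate 'count' pseudo-random domains seeded from a date."""
    domains = []
    for i in range(count):
        seed = '{}-{}'.format(seed_date.isoformat(), i).encode('utf-8')
        domains.append(hashlib.md5(seed).hexdigest()[:16] + tld)
    return domains

toy_dga(datetime.date(2014, 4, 20))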
Tools in this Notebook:
Workbench can be set up to utilize several indexers. Neo4j, for example, incorporates Lucene-based indexing, so not only can we capture a rich set of relationships between our data entities, but searches and queries are also super quick.
Run the workbench server (from somewhere; for the demo we're just going to start a local one):
$ workbench_server
In [10]:
# Let's start interacting with workbench. Please note there is NO specific client to workbench;
# just use the ZeroRPC Python, Node.js, or CLI interfaces.
import zerorpc
c = zerorpc.Client()
c.connect("tcp://127.0.0.1:4242")
Out[10]:
In [11]:
# Load in the PCAP file
filename = '../data/pcap/kitchen_boss.pcap'
with open(filename, 'rb') as f:
    pcap_md5 = c.store_sample(f.read(), filename, 'pcap')
Workbench makes running Bro super easy: it manages the PCAPs and the resulting Bro logs. Perhaps most importantly, it allows us to pull back specific logs with super awesome client/server generators for exactly the data we want to analyze.
Note: we only have one in this case, but running sets of PCAPs through Bro is well supported; please see our Batches_and_Sets notebook for more information.
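As a quick sketch of why those generators are nice: because stream_sample yields log entries lazily over ZeroRPC, you can filter a huge Bro log on the client without ever materializing the whole thing in memory. The md5 variable here is a hypothetical placeholder for one of the 'bro_logs' md5s pulled later in this notebook.
# Sketch only: stream_sample returns a generator, so a large log can be
# filtered on the fly. 'dns_log_md5' is a hypothetical placeholder here.
dns_stream = c.stream_sample(dns_log_md5)
long_queries = (row for row in dns_stream if len(row.get('query', '')) > 30)
for row in long_queries:
    print(row['query'])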
In [13]:
# Run the Bro Network Security Monitor on the PCAP (or set of PCAPs) we just loaded.
# We could make several requests here... 'pcap_bro', 'view_pcap', or 'view_pcap_details';
# workbench is super granular, and it's easy to try them all and add your own as well.
output = c.work_request('view_pcap_details', pcap_md5)['view_pcap_details']
output
Out[13]:
In [8]:
# We'll grab the md5s for those files and do some kewl stuff with them later
file_md5s = list(set([item['md5'] for item in output['extracted_files']]))
pe_md5 = '4410133f571476f2e76e29e61767b557'
file_md5s
Out[8]:
In [9]:
# Grab the Bro logs that we want
dns_log = c.stream_sample(output['bro_logs']['dns_log'])
http_log = c.stream_sample(output['bro_logs']['http_log'])
files_log = c.stream_sample(output['bro_logs']['files_log'])
dns_log
Out[9]:
In [10]:
import pandas as pd
# Okay take the generators returned by stream_sample and efficiently create dataframes
# LIKE BUTTER I TELL YOU!
dns_df = pd.DataFrame(dns_log)
http_df = pd.DataFrame(http_log)
files_df = pd.DataFrame(files_log)
files_df.head()
Out[10]:
The image below shows the Workbench database; each worker stores data in a separate collection. The data is transparent, organized, and accessible.
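If you want to poke at that storage yourself, something like the following should work. This is a sketch assuming Workbench's default local MongoDB backend; the database and collection names below are guesses, so check your workbench server config for the real ones.
# Sketch: peek at the backing store directly with pymongo
from pymongo import MongoClient

client = MongoClient('localhost', 27017)
db = client['workbench']            # hypothetical database name
print(db.list_collection_names())   # one collection per worker
print(db['dns_log'].find_one())     # hypothetical collection name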
Given that this exercise started because the SIEM was flagging DNS traffic, we start there. Now that our Bro log data has been streamed into a Pandas DataFrame we can do all kinds of wonderful things. See Pandas to behold the awesome.
In [11]:
dns_df.head()
Out[11]:
In [61]:
dns_df[['query','answers','qtype_name']]
Out[61]:
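Those focondteavrt.ru names certainly look machine-generated. One quick (and admittedly crude) screen for DGA-ish domains is the Shannon entropy of the query string; random-looking labels score higher than dictionary words. This is just a rough heuristic we're sketching here, not a real classifier, and plenty of benign CDN hostnames score high too.
# Crude DGA screen: rank queries by character entropy (rough heuristic only)
import math
from collections import Counter

def shannon_entropy(s):
    """Bits of entropy per character of the string s."""
    counts = Counter(s)
    total = float(len(s))
    return -sum((n / total) * math.log(n / total, 2) for n in counts.values())

dns_df['entropy'] = dns_df['query'].map(shannon_entropy)
dns_df[['query', 'entropy']].sort_values('entropy', ascending=False).head()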
We want to look at the HTTP data to see what kind of data was transferred from the domains of interest.
In [27]:
# Now we group by host, uid, response mime type, and URI, summing the response body length
group_host = http_df.groupby(['host','uid','resp_mime_types','uri'])[['response_body_len']].sum()
group_host.head(10)
Out[27]:
Okay, last but certainly not least, we want to do a deep dive into the files that were downloaded to the computer, so we pull out the list of MD5s that we saved earlier in the notebook from the workbench 'view_pcap_details' output. Note that the batch request below also returns a client/server generator (see the Generator_Pipelines notebook for more information).
Note: Workbench is in desperate need of workers for PDF, SWF, and JAR files. If you'd like to contribute, please contact briford@supercowpowers.com :)
For now we'll just look at some VirusTotal results and take a quick peek at the SWF and PE files.
In [28]:
# Get Meta-data for each of the extracted files from the PCAP
file_views = c.batch_work_request('meta_deep',{'md5_list':file_md5s})
[view for view in file_views]
Out[28]:
In [29]:
# VirusTotal queries (as of 4-20-2014)
vt_output = c.batch_work_request('vt_query', {'md5_list':file_md5s})
[output for output in vt_output]
Out[29]:
In [30]:
# Well VirusTotal only found two of the files (SWF and JAR). The SWF has
# zero positives (we're going to take that with a grain of salt). The PDF
# and PE files don't even show up. So we'll take a closer look at the SWF
# and PE file with some of the workers in workbench.
swf_view = c.work_request('swf_meta','16cf037b8c8caad6759afc8c309de0f9')
swf_view
Out[30]:
In [31]:
# Use the PE md5 we stashed earlier in the notebook
pe_view = c.work_request('pe_indicators', pe_md5)
pe_view
Out[31]:
Okay, we're now pretty sure that at least two of the four files are bad, but where exactly did the files come from within the context of the network information we can extract from the PCAP file? The image on the right shows all of the relevant information gathered by the 'pcap_graph' worker called below. The blue nodes are the four files (a close-up image is given below). All graph images were captured by simply going to the Neo4j graphical interface at http://localhost:7474/browser/.
In [37]:
graph = c.work_request('pcap_graph', pcap_md5)
graph
Out[37]:
The graph image below was generated by going to http://localhost:7474/browser and executing this query:
match (n)-[r]-() return n,r
The graph image below, which focuses on the files themselves and the path they took to reach our infected host (the orange node in the middle), was generated by going to http://localhost:7474/browser and executing this query:
match (s:file),(t{name:'192.168.22.10'}), p=shortestPath((s)--(t)) return p
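You can also run those Cypher queries programmatically instead of through the browser. Here's a sketch using the official neo4j Python driver, which is not part of workbench; the bolt URI and credentials are assumptions, so adjust them for your own Neo4j instance.
# Sketch: run the shortest-path query from Python (pip install neo4j).
# URI and credentials below are assumptions for a default local install.
from neo4j import GraphDatabase

driver = GraphDatabase.driver('bolt://localhost:7687', auth=('neo4j', 'neo4j'))
with driver.session() as session:
    result = session.run("match (s:file),(t{name:'192.168.22.10'}), "
                         "p=shortestPath((s)--(t)) return p")
    for record in result:
        print(record['p'])
driver.close()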
Looking at the file graph above, the SWF looks even more suspicious (even though VT shows 0 hits out of 51 as of 4-20-2014). So we want to take a look at the timing of the file downloads and DNS requests, which is admittedly a bit more circumstantial.
In [22]:
# Let's look at the timing of the DNS requests and the file downloads.
# Make a new column in both dataframes with a proper datetime stamp
dns_df['time'] = pd.to_datetime(dns_df['ts'], unit='s')
files_df['time'] = pd.to_datetime(files_df['ts'], unit='s')
# Now make time the new index for both dataframes
dns_df.set_index(['time'], inplace=True)
files_df.set_index(['time'], inplace=True)
In [54]:
# Filter the files log down to just the files extracted from the PCAP
interesting_files = files_df[files_df['md5'].isin(file_md5s)]
In [62]:
# The domains of interest from the DNS analysis above
domains = ['kitchenboss.com.au','www.kitchenboss.com.au','p22x62n0yr63872e-qh6.focondteavrt.ru',
           '2496128308-6.focondteavrt.ru','92.194.4.142.in-addr.arpa']
interesting_dns = dns_df[dns_df['query'].isin(domains)]
In [64]:
# Build a combined timeline of the interesting DNS queries and file downloads
all_time = pd.concat([interesting_dns[['query','answers','qtype_name']],
                      interesting_files[['md5','mime_type','tx_hosts']]])
all_time.sort_index(inplace=True)
all_time
Out[64]:
Well, that's it for this notebook. We hope this exercise showed some neato functionality using Workbench. We encourage you to check out the GitHub repository and our other notebooks: