PCAP to Pandas Dataframe

This notebook demonstrates a particularily kewl feature of workbench. Quickly and efficiently going from raw data to a Pandas Dataframe.

Here we're using the workbench server to look at a specific case captured by ThreatGlass. The exploited website for this exercise is gold-xxx.net ThreatGlass_Info.

Tools in this Notebook:

Workbench: Open Source Security Framework Workbench GitHub
Bro Network Security Monitor (http://www.bro.org)
Pandas: Python Data Analysis Library (http://pandas.pydata.org)

More Info:

See PCAP_to_Graph for a short notebook on turning this PCAP into a Neo4j graph.
See Workbench Demo Notebook for a lot more info on using workbench.

Lets start up the workbench server...

Run the workbench server (from somewhere, for the demo we're just going to start a local one)

$ workbench_server



In [5]:

    
# Lets start to interact with workbench, please note there is NO specific client to workbench,
# Just use the ZeroRPC Python, Node.js, or CLI interfaces.
import zerorpc
c = zerorpc.Client(timeout=120)
c.connect("tcp://127.0.0.1:4242")









    Out[5]:





[None]

Read in the Data

The data is pulled from [ThreatGlass](http://www.threatglass.com/), the exploited website for this exercise is gold-xxx.net [ThreatGlass_Info](http://www.threatglass.com/malicious_urls/141deabbc8741175d9f51559cf4ef3dd?process_date=2014-05-29).



In [6]:

    
# Load in the PCAP file
with open('../data/pcap/gold_xxx.pcap','rb') as f:
    pcap_md5 = c.store_sample(f.read(), 'gold_xxx', 'pcap')



In [10]:

    
# We can also ask workbench for a python dictionary of all the info from this PCAP,
# because sometimes visualization are useful and sometimes organized data is useful.
output = c.work_request('view_pcap_details', pcap_md5)['view_pcap_details']
output









    Out[10]:





{'bro_logs': {'conn_log': 'e6b210abca2299821a31e4448260f5da',
  'dhcp_log': 'cf081f397ae93aaeada91cb68ac86168',
  'dns_log': '3af86fbc1bd125c83160d1f5d0cafa39',
  'files_log': '468dca88929d2ed54b26ff81fe1b1700',
  'http_log': 'b76384cf1c5179bccf64f8320882af94',
  'packet_filter_log': '87676f840e783c2dc537efe51acfc075',
  'weird_log': 'a8d45ed8b0ddb0b9e115d05fe9f65dea'},
 'extracted_files': [{'entropy': 6.845206689967475,
   'file_size': 219136,
   'file_type': 'PE32 executable (GUI) Intel 80386, for MS Windows',
   'md5': '17d786f9a3ac2b54cf29122cd58bdabe',
   'sha256': 'cc5bd99f15d2b2c3153ca132245b2780fac08e66c8ed0dc096919b81beb886b5',
   'ssdeep': '3072:ddZuptT5MSMLp30xUiteu55Cva5xmSnaCOnQe+kAiE7jtMH4jIT9m26zD2FzXunl:dneTSjaxjeu50va5xm2jtcUQR2'},
  {'entropy': 6.845206689967475,
   'file_size': 219136,
   'file_type': 'PE32 executable (GUI) Intel 80386, for MS Windows',
   'md5': '17d786f9a3ac2b54cf29122cd58bdabe',
   'sha256': 'cc5bd99f15d2b2c3153ca132245b2780fac08e66c8ed0dc096919b81beb886b5',
   'ssdeep': '3072:ddZuptT5MSMLp30xUiteu55Cva5xmSnaCOnQe+kAiE7jtMH4jIT9m26zD2FzXunl:dneTSjaxjeu50va5xm2jtcUQR2'},
  {'entropy': 6.672123050634468,
   'file_size': 149,
   'file_type': 'data',
   'md5': '016d482b6cf4fda57240b539e1468794',
   'sha256': 'b5b1c65f25374362658c6c8ffbdf15fa61f7c09890eb30bd9e835906b2fdd9c6',
   'ssdeep': '3:WH9TISb/x9doI+YBU4eqezcW73YOJKVsv/dXN5bgZMlR/73bU4ZEMGOB9:6ISbvV+2bHYn/d95bg2zbDEI7'},
  {'entropy': 3.7725899061892,
   'file_size': 1150,
   'file_type': 'MS Windows icon resource - 1 icon',
   'md5': '8f8e6f2edc6d89b9632d7fa73ee4f5ea',
   'sha256': 'af0db6b81e131069926379a610d770973d24b2be92a99cb5c1a30b5fc4b13e5e',
   'ssdeep': '12:jG28R+X1Zxr94pUfHudkLaJaJaJaJaaJaJaJgEhKqEYax5adqwtmUUlR55n:i28AxrVfHsZUUUXUUVhz1l8UUj'},
  {'entropy': 6.672123050634468,
   'file_size': 149,
   'file_type': 'data',
   'md5': '016d482b6cf4fda57240b539e1468794',
   'sha256': 'b5b1c65f25374362658c6c8ffbdf15fa61f7c09890eb30bd9e835906b2fdd9c6',
   'ssdeep': '3:WH9TISb/x9doI+YBU4eqezcW73YOJKVsv/dXN5bgZMlR/73bU4ZEMGOB9:6ISbvV+2bHYn/d95bg2zbDEI7'},
  {'entropy': 6.845206689967475,
   'file_size': 219136,
   'file_type': 'PE32 executable (GUI) Intel 80386, for MS Windows',
   'md5': '17d786f9a3ac2b54cf29122cd58bdabe',
   'sha256': 'cc5bd99f15d2b2c3153ca132245b2780fac08e66c8ed0dc096919b81beb886b5',
   'ssdeep': '3072:ddZuptT5MSMLp30xUiteu55Cva5xmSnaCOnQe+kAiE7jtMH4jIT9m26zD2FzXunl:dneTSjaxjeu50va5xm2jtcUQR2'},
  {'entropy': 6.845206689967475,
   'file_size': 219136,
   'file_type': 'PE32 executable (GUI) Intel 80386, for MS Windows',
   'md5': '17d786f9a3ac2b54cf29122cd58bdabe',
   'sha256': 'cc5bd99f15d2b2c3153ca132245b2780fac08e66c8ed0dc096919b81beb886b5',
   'ssdeep': '3072:ddZuptT5MSMLp30xUiteu55Cva5xmSnaCOnQe+kAiE7jtMH4jIT9m26zD2FzXunl:dneTSjaxjeu50va5xm2jtcUQR2'},
  {'entropy': 6.845206689967475,
   'file_size': 219136,
   'file_type': 'PE32 executable (GUI) Intel 80386, for MS Windows',
   'md5': '17d786f9a3ac2b54cf29122cd58bdabe',
   'sha256': 'cc5bd99f15d2b2c3153ca132245b2780fac08e66c8ed0dc096919b81beb886b5',
   'ssdeep': '3072:ddZuptT5MSMLp30xUiteu55Cva5xmSnaCOnQe+kAiE7jtMH4jIT9m26zD2FzXunl:dneTSjaxjeu50va5xm2jtcUQR2'},
  {'entropy': 7.306977991950132,
   'file_size': 12761,
   'file_type': 'PDF document, version 1.6',
   'md5': 'b85cb7cee9e145ac4dfb7e8f1870e360',
   'sha256': 'af1f3785ea4a2be08cb13c6b32a3cf71bbe24f50530b591ddb3be9d363d2e6a3',
   'ssdeep': '384:SW3+jGeEBnZgazaw0eUqfgij6aUi0f+xOfESUxDnDKT:fW+h0eho5YxOIFKT'},
  {'entropy': 7.855433934299751,
   'file_size': 10629,
   'file_type': 'Java Jar file data (zip)',
   'md5': 'd4c56b4d0ba9fbbd1028b83401eec133',
   'sha256': '155527017e6dbeec01c1b0a99f9db9dcf7a0b4ea902fb6586b463b2836e481bc',
   'ssdeep': '192:rh3cJzBpxUybGZXwnYzG2/HCtLVl+Wg/+czIPKGgGmnzmIlbA:t3KLFbGdqU6tCWdc8PKUwmf'},
  {'entropy': 6.845206689967475,
   'file_size': 219136,
   'file_type': 'PE32 executable (GUI) Intel 80386, for MS Windows',
   'md5': '17d786f9a3ac2b54cf29122cd58bdabe',
   'sha256': 'cc5bd99f15d2b2c3153ca132245b2780fac08e66c8ed0dc096919b81beb886b5',
   'ssdeep': '3072:ddZuptT5MSMLp30xUiteu55Cva5xmSnaCOnQe+kAiE7jtMH4jIT9m26zD2FzXunl:dneTSjaxjeu50va5xm2jtcUQR2'},
  {'entropy': 4.869185904780487,
   'file_size': 7692,
   'file_type': 'assembler source, ASCII text',
   'md5': '739993fe99fcb74d283f7faa1617984b',
   'sha256': '2882a10573b80d8c6cfe2137160f993523082dd20948cbbfcb2458128d0a3043',
   'ssdeep': '192:/2TFrmh1FVFlrfVWIXXoJAaF7Fe7cFJeS1NM8M8D2f:uTFryFVFlr9WcaF7FeoF0mHm'},
  {'entropy': 6.570728034441538,
   'file_size': 156160,
   'file_type': 'PE32 executable (GUI) Intel 80386, for MS Windows',
   'md5': 'c37f0a7b0249c91a43e749ad3660fb55',
   'sha256': 'e0124e97a99c40b6f48760ab90d2ed009f9b4c661f3e98c86c848b876588c47d',
   'ssdeep': '3072:Dd/1aVzIF8P3h3cJUitGoDp+XSB5mKAKYOnQe+FSDHslun81L:DraVzTP3GJjGoDYXSB5mkHet'},
  {'entropy': 7.855433934299751,
   'file_size': 10629,
   'file_type': 'Java Jar file data (zip)',
   'md5': 'd4c56b4d0ba9fbbd1028b83401eec133',
   'sha256': '155527017e6dbeec01c1b0a99f9db9dcf7a0b4ea902fb6586b463b2836e481bc',
   'ssdeep': '192:rh3cJzBpxUybGZXwnYzG2/HCtLVl+Wg/+czIPKGgGmnzmIlbA:t3KLFbGdqU6tCWdc8PKUwmf'},
  {'entropy': 6.570728034441538,
   'file_size': 156160,
   'file_type': 'PE32 executable (GUI) Intel 80386, for MS Windows',
   'md5': 'c37f0a7b0249c91a43e749ad3660fb55',
   'sha256': 'e0124e97a99c40b6f48760ab90d2ed009f9b4c661f3e98c86c848b876588c47d',
   'ssdeep': '3072:Dd/1aVzIF8P3h3cJUitGoDp+XSB5mKAKYOnQe+FSDHslun81L:DraVzTP3GJjGoDYXSB5mkHet'},
  {'entropy': 7.855433934299751,
   'file_size': 10629,
   'file_type': 'Java Jar file data (zip)',
   'md5': 'd4c56b4d0ba9fbbd1028b83401eec133',
   'sha256': '155527017e6dbeec01c1b0a99f9db9dcf7a0b4ea902fb6586b463b2836e481bc',
   'ssdeep': '192:rh3cJzBpxUybGZXwnYzG2/HCtLVl+Wg/+czIPKGgGmnzmIlbA:t3KLFbGdqU6tCWdc8PKUwmf'}],
 'md5': 'c8e58ff22b9a8e48838373fbb1692bdd'}

If the next line of code doesn't blow your mind, you aren't paying attention!

Thanks to ZeroRPC all of the bro logs are streamed from server to client with NETWORK STREAMING GENERATORS, those highly efficient generators are zero-copy and stream data directly into Pandas Dataframes.

For more on client/server generators and client-contructed/server-executed generator pipelines see our super spiffy Generator Pipelines notebook.



In [11]:

    
# Critical Code: Transition from Bro logs to Pandas Dataframes
# This one line of code populates dataframes from the Bro logs, 
# streaming client/server generators, zero-copy, efficient, awesome...
import pandas as pd
dataframes = {name:pd.DataFrame(c.stream_sample(bro_log, None)) for name, bro_log in output['bro_logs'].iteritems()}

Lets look at the Data

We're going to use some nice functionality in the Pandas dataframe to look at our network data, specifically we're going to group by host, host-ip, mime_type and uri. The last column represents the aggregated sum of response_body_len.

This type of operation is really just scratching the surface when it comes to dataframes, so quickly and efficiently populating a dataframe is super awesome.



In [20]:

    
# Look at DNS logs
dataframes['dns_log'][['query','answers','qtype_name']].head(10)









    Out[20]:






  
    
      
      query
      answers
      qtype_name
    
  
  
    
      0
                    gold-xxx.net
                                           178.208.85.60
       A
    
    
      1
                counter.yadro.ru
       88.212.196.105,88.212.196.122,88.212.196.123,8...
       A
    
    
      2
                      picsee.net
                                            62.75.207.72
       A
    
    
      3
              counter.rambler.ru
       81.19.88.95,81.19.88.96,81.19.88.102,81.19.88....
       A
    
    
      4
             freeroomhostelz.com
                                          91.194.254.195
       A
    
    
      5
       57d9bf9co3qbc.paroonic.ru
                                            80.72.37.112
       A
    
    
      6
         323210841-1.paroonic.ru
                                            80.72.37.112
       A
    
    
      7
            domainsfullkolls.biz
                                           94.242.216.61
       A
    
    
      8
            update.microsoft.com
       update.microsoft.com.nsatc.net,157.56.96.60,15...
       A
    
    
      9
            update.microsoft.com
       update.microsoft.com.nsatc.net,65.55.138.114,1...
       A
    
  

10 rows × 3 columns



In [21]:

    
# Look at Conn logs
dataframes['conn_log'].head(10)









    Out[21]:






  
    
      
      conn_state
      duration
      history
      id.orig_h
      id.orig_p
      id.resp_h
      id.resp_p
      local_orig
      missed_bytes
      orig_bytes
      orig_ip_bytes
      orig_pkts
      proto
      resp_bytes
      resp_ip_bytes
      resp_pkts
      service
      ts
      tunnel_parents
      uid
    
  
  
    
      0
       RSTO
       0.631127
        ShADadR
       192.168.39.10
       1036
        178.208.85.60
       80
       -
       0
       587
        955
        9
       tcp
       6459
       6743
       7
       http
       1.401401e+09
       (empty)
       CBdyLA2FydtnrvzeBi
    
    
      1
       RSTO
       0.150052
            ShR
       192.168.39.10
       1037
        178.208.85.60
       80
       -
       0
         0
         88
        2
       tcp
          0
         44
       1
          -
       1.401401e+09
       (empty)
       C0gD932Pau36gzSy93
    
    
      2
       RSTO
       0.151529
            ShR
       192.168.39.10
       1040
        178.208.85.60
       80
       -
       0
         0
         88
        2
       tcp
          0
         44
       1
          -
       1.401401e+09
       (empty)
       CXTGqY2XNUSXRynhFg
    
    
      3
       RSTO
       0.152817
            ShR
       192.168.39.10
       1039
        178.208.85.60
       80
       -
       0
         0
         88
        2
       tcp
          0
         44
       1
          -
       1.401401e+09
       (empty)
       CNLjfJ1CSChTSRPRO1
    
    
      4
       RSTO
       0.154018
            ShR
       192.168.39.10
       1038
        178.208.85.60
       80
       -
       0
         0
         88
        2
       tcp
          0
         44
       1
          -
       1.401401e+09
       (empty)
       CWUpVH2eZ0g3HgXe1k
    
    
      5
       RSTO
        0.15395
            ShR
       192.168.39.10
       1041
        178.208.85.60
       80
       -
       0
         0
         88
        2
       tcp
          0
         44
       1
          -
       1.401401e+09
       (empty)
       C7WF992bGceJEID9k5
    
    
      6
         SF
       0.802815
       ShADdfFa
       192.168.39.10
       1051
       88.212.196.105
       80
       -
       0
       653
        901
        6
       tcp
       1621
       1829
       5
       http
       1.401401e+09
       (empty)
       C05Lh44T4Od4nxKuPc
    
    
      7
         SF
       0.009575
             Dd
       192.168.39.10
       1048
              4.2.2.3
       53
       -
       0
        28
         56
        1
       udp
         44
         72
       1
        dns
       1.401401e+09
       (empty)
       CoK5EL2wWO4GXS26D7
    
    
      8
       RSTO
       6.797592
       ShADadfR
       192.168.39.10
       1055
       91.194.254.195
       80
       -
       0
       326
        574
        6
       tcp
        801
        969
       4
       http
       1.401401e+09
       (empty)
       CiDYwq4jpcz4IGmX0g
    
    
      9
       RSTO
       8.367047
        ShADdfR
       192.168.39.10
       1054
          81.19.88.95
       80
       -
       0
       799
       1287
       12
       tcp
       7618
       7946
       8
       http
       1.401401e+09
       (empty)
       CDR6w61LKf7XpsHC36
    
  

10 rows × 20 columns



In [24]:

    
# Simple Stats with Pandas Dataframe
dataframes['conn_log'][['missed_bytes','orig_ip_bytes','resp_ip_bytes','resp_pkts']].describe()









    Out[24]:






  
    
      
      missed_bytes
      orig_ip_bytes
      resp_ip_bytes
      resp_pkts
    
  
  
    
      count
       65
         65.000000
           65.000000
        65.000000
    
    
      mean
        0
       1221.769231
        39791.953846
        17.107692
    
    
      std
        0
       1685.608669
        74050.375527
        27.989464
    
    
      min
        0
         48.000000
            0.000000
         0.000000
    
    
      25%
        0
         63.000000
           79.000000
         1.000000
    
    
      50%
        0
        464.000000
          365.000000
         4.000000
    
    
      75%
        0
       1600.000000
        51482.000000
        18.000000
    
    
      max
        0
       5312.000000
       223664.000000
       106.000000
    
  

8 rows × 4 columns



In [64]:

    
# Simple Filtering with Pandas Dataframe
not_80_df = dataframes['conn_log'][dataframes['conn_log']['id.resp_p'] != 80]
not_80_df.head(10)









    Out[64]:






  
    
      
      conn_state
      duration
      history
      id.orig_h
      id.orig_p
      id.resp_h
      id.resp_p
      local_orig
      missed_bytes
      orig_bytes
      orig_ip_bytes
      orig_pkts
      proto
      resp_bytes
      resp_ip_bytes
      resp_pkts
      service
      ts
      tunnel_parents
      uid
      
    
  
  
    
      7 
       SF
       0.009575
       Dd
       192.168.39.10
       1048
               4.2.2.3
         53
       -
       0
        28
        56
       1
       udp
        44
        72
       1
        dns
       1.401401e+09
       (empty)
       CoK5EL2wWO4GXS26D7
      ...
    
    
      11
       SF
       11.55683
       Dd
       192.168.39.10
       1035
               4.2.2.3
         53
       -
       0
       221
       389
       6
       udp
       669
       837
       6
        dns
       1.401401e+09
       (empty)
       CBWgV42cYiaGqu9Z25
      ...
    
    
      12
       S0
       0.000911
        D
             0.0.0.0
         68
       255.255.255.255
         67
       -
       0
       626
       682
       2
       udp
         0
         0
       0
       dhcp
       1.401401e+09
       (empty)
        CKYJGa15i6s1Rpq0Y
      ...
    
    
      13
       SF
       2.110374
       dD
       192.168.39.10
         68
          192.168.39.1
         67
       -
       0
       314
       342
       1
       udp
       900
       984
       3
       dhcp
       1.401401e+09
       (empty)
       C6zKAr3iKNDLKufq4h
      ...
    
    
      15
       SF
       0.027165
       Dd
       192.168.39.10
       1035
               4.2.2.3
         53
       -
       0
        38
        66
       1
       udp
        54
        82
       1
        dns
       1.401401e+09
       (empty)
        C50uAqkg2gpAF0eH5
      ...
    
    
      26
       S0
       6.024315
        D
       192.168.39.10
       1058
       239.255.255.250
       1900
       -
       0
       399
       483
       3
       udp
         0
         0
       0
          -
       1.401401e+09
       (empty)
       CAirC04eZatVhqLFvc
      ...
    
    
      41
       SF
       0.008703
       Dd
       192.168.39.10
       1035
               4.2.2.3
         53
       -
       0
        38
        66
       1
       udp
       114
       142
       1
        dns
       1.401401e+09
       (empty)
       Cjnp9B3tRdimStImlc
      ...
    
    
      42
       SF
       0.009521
       Dd
       192.168.39.10
       1048
               4.2.2.3
         53
       -
       0
        38
        66
       1
       udp
       114
       142
       1
        dns
       1.401401e+09
       (empty)
       Cl5qe21lfeB8wXEZg4
      ...
    
    
      43
       SF
       0.021553
       Dd
       192.168.39.10
       1066
               8.8.4.4
         53
       -
       0
        35
        63
       1
       udp
        51
        79
       1
        dns
       1.401401e+09
       (empty)
       Cn9DTl3yp91dxD8hFl
      ...
    
    
      44
       SF
       0.001862
       Dd
       192.168.39.10
       1068
               8.8.4.4
         53
       -
       0
        35
        63
       1
       udp
        51
        79
       1
        dns
       1.401401e+09
       (empty)
       Cy7sm63NXzJ1XuxTpj
      ...
    
  

10 rows × 21 columns



In [66]:

    
# Now we group by host and show the different response mime types for each host
group_host = dataframes['http_log'].groupby(['host','id.resp_h','resp_mime_types','uri'])[['response_body_len']].sum()
group_host









    Out[66]:






  
    
      
      
      
      
      response_body_len
    
    
      host
      id.resp_h
      resp_mime_types
      uri
      
    
  
  
    
      323210841-1.paroonic.ru
      80.72.37.112
      application/jar
      /1401379560.jar
        31887
    
    
      application/pdf
      /1401379560.pdf
        12761
    
    
      application/x-dosexec
      /f/1401379560/5/x00cf6b534e520804090407000700080150050f0304045106565601;1;5
       156160
    
    
      /f/1401379560/6
       156160
    
    
      text/html
      /1401379560.htm
        21138
    
    
      57d9bf9co3qbc.paroonic.ru
      80.72.37.112
      text/html
      /
        89366
    
    
      counter.rambler.ru
      81.19.88.95
      image/gif
      /top100.scn?2148353&rn=2061653628&v=0.3i&bs=780x427&ce=1&rf&en=windows-1251&pt=Download porn free clips, sex porn video&cd=24-bit&sr=800x600&la=en-us&ja=1&acn=Mozilla&an=Microsoft Internet Explorer&pl=Win32&tz=240&fv=10.0 r32&sv&le=1
           49
    
    
      text/plain
      /top100.jcn?2148353
         6853
    
    
      counter.yadro.ru
      88.212.196.105
      image/gif
      /hit?q;t28.6;r;s800*600*24;uhttp://gold-xxx.net/;0.6712898022427673
          763
    
    
      text/html
      /hit?t28.6;r;s800*600*24;uhttp://gold-xxx.net/;0.6712898022427673
           32
    
    
      freeroomhostelz.com
      91.194.254.195
      text/html
      /
          413
    
    
      gold-xxx.net
      178.208.85.60
      -
      /css/engine.css
            0
    
    
      image/gif
      /templates/xxibeka/images/all_bg.png
         2247
    
    
      /templates/xxibeka/images/col-mid-r.gif
          107
    
    
      /templates/xxibeka/images/col-top-r.gif
         1122
    
    
      /templates/xxibeka/images/news.gif
         1416
    
    
      /templates/xxibeka/images/search.gif
          254
    
    
      image/jpeg
      /templates/xxibeka/images/content.jpg
         1631
    
    
      /templates/xxibeka/images/header.jpg
         3630
    
    
      /templates/xxibeka/images/logo.jpg
        47378
    
    
      /uploads/posts/2014-05/1401395470_eavlolprh2nywzf.jpeg
        34600
    
    
      /uploads/posts/2014-05/1401395923_8z4umchk5zwstxw.jpeg
        34866
    
    
      /uploads/posts/2014-05/1401396326_clybb9irt17pygm.jpeg
        31199
    
    
      /uploads/posts/2014-05/1401396909_n8xfse70ugli1cz.jpeg
        51466
    
    
      /uploads/posts/2014-05/1401397123_avgnhrqeylibgyk.jpeg
        49333
    
    
      /uploads/posts/2014-05/1401397541_qbdkzupvm18lgre.jpeg
        36246
    
    
      /uploads/posts/2014-05/1401397822_03ifpwcqtqmz65r.jpeg
        37594
    
    
      /uploads/posts/2014-05/1401398230_zkihngphpwvsuxf.jpeg
        36792
    
    
      /uploads/posts/2014-05/1401400046_jpfy4mfyskolthi.jpeg
        38434
    
    
      /uploads/posts/2014-05/1401400094_aiax6go3xlpkr8d.jpeg
        44885
    
    
      image/png
      /templates/xxibeka/images/nav.png
          184
    
    
      /templates/xxibeka/images/send.png
         3826
    
    
      image/x-icon
      /favicon.ico
         1150
    
    
      text/html
      /
        30992
    
    
      /css/engine.css
          290
    
    
      /css/site.css
          288
    
    
      /css/style.css
          289
    
    
      /engine/classes/js/jquery.js
        94840
    
    
      /engine/classes/js/js_edit.js
        11012
    
    
      /templates/xxibeka/css/template_css.css
          314
    
    
      /templates/xxibeka/images/h1bg.gif
          309
    
    
      /templates/xxibeka/images/m-bottom.gife
          314
    
    
      /templates/xxibeka/images/m-center.gife
          314
    
    
      /templates/xxibeka/images/m-top.gife
          311
    
    
      text/plain
      /engine/classes/js/dialog.js
        47054
    
    
      /engine/classes/js/dle_ajax.js
         4931
    
    
      /engine/classes/js/effects.js
        13628
    
    
      /engine/classes/js/menu.js
         2992
    
    
      /templates/xxibeka/css/site.css
          595
    
    
      /templates/xxibeka/css/style.css
        13241
    
    
      text/x-asm
      /templates/xxibeka/css/engine.css
         7692
    
    
      newsbrontima.com
      192.64.115.91
      -
      /5jeno9e6lbsffl
            0
    
    
      text/plain
      /epb4y7viha3
           14
    
    
      online-serial.net
      178.208.83.15
      application/x-dosexec
      /lok2.exe
       438272
    
    
      picsee.net
      62.75.207.72
      image/jpeg
      /upload/2014-05-30/258f383e1712.jpeg
        61906
    
    
      /upload/2014-05-30/552a444e0929.jpeg
        25267
    
    
      /upload/2014-05-30/8db84be028a0.jpeg
        62012
    
    
      /upload/2014-05-30/a906f803f653.jpeg
        71508
    
    
      /upload/2014-05-30/ffb0f9075f26.jpeg
        65423
    
    
      t2bot.ru
      178.208.83.55
      application/x-dosexec
      /lok2.exe
       876544
    
    
      
      
      
      
      ...
    
  

62 rows × 1 columns



In [75]:

    
# Plotting defaults
import matplotlib.pyplot as plt
%matplotlib inline
plt.rcParams['font.size'] = 12.0
plt.rcParams['figure.figsize'] = 15.0, 8.0



In [82]:

    
# Plot hosts and mime-types
plot_df = dataframes['http_log'].groupby(['host','resp_mime_types'])[['response_body_len']].sum().unstack()
plot_df['response_body_len'].plot(kind='bar', stacked=True)
plt.xlabel('Domain')
plt.ylabel('Response Bytes')
plt.xticks(rotation=45, ha='right')









    Out[82]:





(array([  0.5,   1.5,   2.5,   3.5,   4.5,   5.5,   6.5,   7.5,   8.5,
          9.5,  10.5]), <a list of 11 Text xticklabel objects>)

Wrap Up

Well for this short notebook we used 6-7 lines of python to go from PCAP file to a Pandas Dataframe and then we did a bunch of kewl stuff using the Dataframe. We hope this exercise showed some neato functionality using Workbench, we encourage you to check out the GitHub repository and our other notebooks:

PCAP_to_Graph for a short notebook on turning this PCAP into a Neo4j graph.
Workbench Demo general introduction to Workbench.
PCAP_DriveBy a detail look at a Web DriveBy from the ThreatGlass repository.
PE File Sim Graph using Neo4j to generate a similarity graph using PE File features.
Generator Pipelines using the client/server streaming generators to demonstrate 'chaining' generators.

	query	answers	qtype_name
0	gold-xxx.net	178.208.85.60	A
1	counter.yadro.ru	88.212.196.105,88.212.196.122,88.212.196.123,8...	A
2	picsee.net	62.75.207.72	A
3	counter.rambler.ru	81.19.88.95,81.19.88.96,81.19.88.102,81.19.88....	A
4	freeroomhostelz.com	91.194.254.195	A
5	57d9bf9co3qbc.paroonic.ru	80.72.37.112	A
6	323210841-1.paroonic.ru	80.72.37.112	A
7	domainsfullkolls.biz	94.242.216.61	A
8	update.microsoft.com	update.microsoft.com.nsatc.net,157.56.96.60,15...	A
9	update.microsoft.com	update.microsoft.com.nsatc.net,65.55.138.114,1...	A

	conn_state	duration	history	id.orig_h	id.orig_p	id.resp_h	id.resp_p	local_orig	orig_bytes	orig_ip_bytes	orig_pkts	proto	resp_bytes	resp_ip_bytes	resp_pkts	service	ts	tunnel_parents	uid
0	RSTO	0.631127	ShADadR	192.168.39.10	1036	178.208.85.60	80	-	587	955	9	tcp	6459	6743	7	http	1.401401e+09	(empty)	CBdyLA2FydtnrvzeBi
1	RSTO	0.150052	ShR	192.168.39.10	1037	178.208.85.60	80	-	0	88	2	tcp	0	44	1	-	1.401401e+09	(empty)	C0gD932Pau36gzSy93
2	RSTO	0.151529	ShR	192.168.39.10	1040	178.208.85.60	80	-	0	88	2	tcp	0	44	1	-	1.401401e+09	(empty)	CXTGqY2XNUSXRynhFg
3	RSTO	0.152817	ShR	192.168.39.10	1039	178.208.85.60	80	-	0	88	2	tcp	0	44	1	-	1.401401e+09	(empty)	CNLjfJ1CSChTSRPRO1
4	RSTO	0.154018	ShR	192.168.39.10	1038	178.208.85.60	80	-	0	88	2	tcp	0	44	1	-	1.401401e+09	(empty)	CWUpVH2eZ0g3HgXe1k
5	RSTO	0.15395	ShR	192.168.39.10	1041	178.208.85.60	80	-	0	88	2	tcp	0	44	1	-	1.401401e+09	(empty)	C7WF992bGceJEID9k5
6	SF	0.802815	ShADdfFa	192.168.39.10	1051	88.212.196.105	80	-	653	901	6	tcp	1621	1829	5	http	1.401401e+09	(empty)	C05Lh44T4Od4nxKuPc
7	SF	0.009575	Dd	192.168.39.10	1048	4.2.2.3	53	-	28	56	1	udp	44	72	1	dns	1.401401e+09	(empty)	CoK5EL2wWO4GXS26D7
8	RSTO	6.797592	ShADadfR	192.168.39.10	1055	91.194.254.195	80	-	326	574	6	tcp	801	969	4	http	1.401401e+09	(empty)	CiDYwq4jpcz4IGmX0g
9	RSTO	8.367047	ShADdfR	192.168.39.10	1054	81.19.88.95	80	-	799	1287	12	tcp	7618	7946	8	http	1.401401e+09	(empty)	CDR6w61LKf7XpsHC36

	missed_bytes	orig_ip_bytes	resp_ip_bytes	resp_pkts
count	65	65.000000	65.000000	65.000000
mean	0	1221.769231	39791.953846	17.107692
std	0	1685.608669	74050.375527	27.989464
min	0	48.000000	0.000000	0.000000
25%	0	63.000000	79.000000	1.000000
50%	0	464.000000	365.000000	4.000000
75%	0	1600.000000	51482.000000	18.000000
max	0	5312.000000	223664.000000	106.000000

				response_body_len
host	id.resp_h	resp_mime_types	uri
323210841-1.paroonic.ru	80.72.37.112	application/jar	/1401379560.jar	31887
		application/pdf	/1401379560.pdf	12761
		application/x-dosexec	/f/1401379560/5/x00cf6b534e520804090407000700080150050f0304045106565601;1;5	156160
		application/x-dosexec	/f/1401379560/6	156160
		text/html	/1401379560.htm	21138
57d9bf9co3qbc.paroonic.ru	80.72.37.112	text/html	/	89366
counter.rambler.ru	81.19.88.95	image/gif	/top100.scn?2148353&rn=2061653628&v=0.3i&bs=780x427&ce=1&rf&en=windows-1251&pt=Download porn free clips, sex porn video&cd=24-bit&sr=800x600&la=en-us&ja=1&acn=Mozilla&an=Microsoft Internet Explorer&pl=Win32&tz=240&fv=10.0 r32&sv&le=1	49
counter.rambler.ru	81.19.88.95	text/plain	/top100.jcn?2148353	6853
counter.yadro.ru	88.212.196.105	image/gif	/hit?q;t28.6;r;s80060024;uhttp://gold-xxx.net/;0.6712898022427673	763
counter.yadro.ru	88.212.196.105	text/html	/hit?t28.6;r;s80060024;uhttp://gold-xxx.net/;0.6712898022427673	32
freeroomhostelz.com	91.194.254.195	text/html	/	413
gold-xxx.net	178.208.85.60	-	/css/engine.css	0
		image/gif	/templates/xxibeka/images/all_bg.png	2247
			/templates/xxibeka/images/col-mid-r.gif	107
			/templates/xxibeka/images/col-top-r.gif	1122
			/templates/xxibeka/images/news.gif	1416
			/templates/xxibeka/images/search.gif	254
		image/jpeg	/templates/xxibeka/images/content.jpg	1631
			/templates/xxibeka/images/header.jpg	3630
			/templates/xxibeka/images/logo.jpg	47378
			/uploads/posts/2014-05/1401395470_eavlolprh2nywzf.jpeg	34600
			/uploads/posts/2014-05/1401395923_8z4umchk5zwstxw.jpeg	34866
			/uploads/posts/2014-05/1401396326_clybb9irt17pygm.jpeg	31199
			/uploads/posts/2014-05/1401396909_n8xfse70ugli1cz.jpeg	51466
			/uploads/posts/2014-05/1401397123_avgnhrqeylibgyk.jpeg	49333
			/uploads/posts/2014-05/1401397541_qbdkzupvm18lgre.jpeg	36246
			/uploads/posts/2014-05/1401397822_03ifpwcqtqmz65r.jpeg	37594
			/uploads/posts/2014-05/1401398230_zkihngphpwvsuxf.jpeg	36792
			/uploads/posts/2014-05/1401400046_jpfy4mfyskolthi.jpeg	38434
			/uploads/posts/2014-05/1401400094_aiax6go3xlpkr8d.jpeg	44885
		image/png	/templates/xxibeka/images/nav.png	184
		image/png	/templates/xxibeka/images/send.png	3826
		image/x-icon	/favicon.ico	1150
		text/html	/	30992
			/css/engine.css	290
			/css/site.css	288
			/css/style.css	289
			/engine/classes/js/jquery.js	94840
			/engine/classes/js/js_edit.js	11012
			/templates/xxibeka/css/template_css.css	314
			/templates/xxibeka/images/h1bg.gif	309
			/templates/xxibeka/images/m-bottom.gife	314
			/templates/xxibeka/images/m-center.gife	314
			/templates/xxibeka/images/m-top.gife	311
		text/plain	/engine/classes/js/dialog.js	47054
			/engine/classes/js/dle_ajax.js	4931
			/engine/classes/js/effects.js	13628
			/engine/classes/js/menu.js	2992
			/templates/xxibeka/css/site.css	595
			/templates/xxibeka/css/style.css	13241
		text/x-asm	/templates/xxibeka/css/engine.css	7692
newsbrontima.com	192.64.115.91	-	/5jeno9e6lbsffl	0
newsbrontima.com	192.64.115.91	text/plain	/epb4y7viha3	14
online-serial.net	178.208.83.15	application/x-dosexec	/lok2.exe	438272
picsee.net	62.75.207.72	image/jpeg	/upload/2014-05-30/258f383e1712.jpeg	61906
			/upload/2014-05-30/552a444e0929.jpeg	25267
			/upload/2014-05-30/8db84be028a0.jpeg	62012
			/upload/2014-05-30/a906f803f653.jpeg	71508
			/upload/2014-05-30/ffb0f9075f26.jpeg	65423
t2bot.ru	178.208.83.55	application/x-dosexec	/lok2.exe	876544
				...