There are four main tables in the XML: Views, Categories, Weaknesses, and Compound_Elements. In the XML, the contents of these tables are not neatly organized; on the website, however, they mostly follow the hierarchy shown below. Each of the four tables contains the entries related to that category. How the hierarchy works is explained in detail in CWE Introduction.
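As a rough illustration of that layout, the sketch below parses a tiny hand-written stand-in for the CWE XML and lists its top-level tables in document order. The sample is an assumption made for illustration (it is not real CWE data and is heavily reduced); the point is only that each table can be reached by its position under the root, which is how the code later in this notebook addresses them.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily reduced stand-in for a CWE catalog file (not real data).
SAMPLE_XML = """
<Weakness_Catalog>
  <Views><View ID="1000" Name="Research Concepts"/></Views>
  <Categories><Category ID="2" Name="Example category"/></Categories>
  <Weaknesses><Weakness ID="5" Name="Example weakness" Status="Draft"/></Weaknesses>
  <Compound_Elements><Compound_Element ID="61" Name="Example compound" Status="Draft"/></Compound_Elements>
</Weakness_Catalog>
"""

root = ET.fromstring(SAMPLE_XML)

# (position under the root, tag name, number of entries in that table)
tables = [(index, child.tag, len(child)) for index, child in enumerate(root)]
for index, tag, count in tables:
    print(f"root[{index}] -> {tag} ({count} entries)")
```

With this layout, `root[2]` is the Weaknesses table and `root[3]` is the Compound_Elements table.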
Originally we believed that it was not possible to visualize the deeper levels of CWE, since FoamTree support indicated that the 'maxGroupLevelsDrawn' attribute should not be forcibly changed. Instead, we decreased 'groupBorderWidth' to 1.5 and 'groupMinDiameter' to 0. This trades away some width in the deeper-level cells in order to visualize more nodes.
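A minimal sketch of how those two settings could be passed along with the data is shown here. It only builds the options object in Python and prints the JavaScript constructor call; the `dataObject` content is a made-up single-node example, and the way the options are spliced into the page is an assumption, not necessarily how the script below does it.

```python
import json

# Option names are the ones discussed above; 'maxGroupLevelsDrawn' is left untouched.
foamtree_options = {
    "id": "visualization",
    "groupBorderWidth": 1.5,   # thinner borders leave more room for nested cells
    "groupMinDiameter": 0,     # allow very small groups to be drawn
    # Made-up example data in the label/weight/groups shape.
    "dataObject": {
        "groups": [{"label": "1000 - Research Concepts", "weight": 1, "groups": []}]
    },
}

options_js = json.dumps(foamtree_options, indent=2)
print("var foamtree = new CarrotSearchFoamTree(" + options_js + ");")
```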
The trial version of Liquid XML does not outline the code until the following setting is changed: Tools --> Options --> Environment --> Performance --> change 'Turn off Document Outlining for files larger than' 25 MB.
In [1]:
from IPython.display import Image
Image(filename='img/types_of_nodes_and_their_relationships_in_CWE.PNG')
Out[1]:
The following code creates a FoamTree visualization for CWE versions 1.0 through 2.12, except for CWE version 2.11.
Currently --gephi is not functional because of multiple parents for its nodes (unlike CAPEC).
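To see what "multiple parents" means in practice, the following sketch scans a list of `[CWE ID, name, parent ID]` entries (the shape the script builds) and reports every ID that appears under more than one parent. The entries are made up for illustration and are not real CWE data.

```python
from collections import defaultdict

# Made-up entries in the [CWE ID, name, parent ID] shape; not real CWE data.
entries = [
    ["10", "Top", "1000"],
    ["20", "Mid", "10"],
    ["30", "Leaf", "20"],
    ["30", "Leaf", "10"],  # the same node listed under a second parent
]

# Collect the distinct parents seen for each CWE ID.
parents_by_id = defaultdict(set)
for cwe_id, _name, parent_id in entries:
    parents_by_id[cwe_id].add(parent_id)

# Keep only the IDs that have more than one parent.
multi_parent = {
    cwe_id: sorted(parents)
    for cwe_id, parents in parents_by_id.items()
    if len(parents) > 1
}
print(multi_parent)
```

Any ID reported here would show up as more than one row in a node table that is written once per entry.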
In [2]:
import csv
import json
import xml.etree.ElementTree as ET
import sys
import os
import argparse
parser = argparse.ArgumentParser(description = 'CWE XML visualisation program.')
parser.add_argument('xml_filepath', help = 'path of the cwe xml file')
parser.add_argument('javascript_filepath', help = 'path of the carrotsearch.foamtree.js file')
parser.add_argument("--json", help = "export .json file", action = "store_true")
parser.add_argument("--gephi", help = "export node and edge table for gephi visualisation", action = "store_true")
args = parser.parse_args()
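For reference, here is a small sketch of how the parser above behaves. Passing an explicit argument list to `parse_args` avoids reading `sys.argv`, which can be convenient when trying the parser out inside a notebook. The two file names are hypothetical placeholders.

```python
import argparse

parser = argparse.ArgumentParser(description="CWE XML visualisation program.")
parser.add_argument("xml_filepath", help="path of the cwe xml file")
parser.add_argument("javascript_filepath", help="path of the carrotsearch.foamtree.js file")
parser.add_argument("--json", help="export .json file", action="store_true")
parser.add_argument("--gephi", help="export node and edge table for gephi visualisation", action="store_true")

# Hypothetical file names, passed explicitly instead of via the command line.
args = parser.parse_args(["example_cwe.xml", "carrotsearch.foamtree.js", "--json"])

print(args.xml_filepath, args.javascript_filepath, args.json, args.gephi)
```

Both flags default to `False` when omitted, so in this example only the JSON export is requested.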
2. Read Input and Prepare Output Files
In [9]:
filepath = args.xml_filepath
javascript_path = args.javascript_filepath
# Obtain the CWE XML filename from the path.
CWE_xml = os.path.basename(filepath)
# Drop the last four characters (the ".xml" extension); the result is the
# base name shared by all output files.
CWE_version = CWE_xml[:-4]
CWE_json = CWE_version + ".json"
CWE_html = CWE_version + ".html"
CWE_csv_edgelist = CWE_version + " [Edges].csv"
CWE_csv_nodelist = CWE_version + " [Nodes].csv"
# This is the inline data model.
# Each entry contains CWE_ID, CWE_NAME and CHILD_OF.
# This can be extended to have more details.
CWE_entries = list()
CWE_tree = ET.parse(filepath)
CWE_root = CWE_tree.getroot()
"""The purpose of this function is to remove the commas in the string that is passed as input.
The need is that some of the CWE names contain commas and when eventually added to a CSV file, they occupy multiple
cells instead of just one. In order to overcome that we pass the CWE name and obtain the "comma-less" version of it.
"""
def remove_commas(string):
    # Joining the comma-separated pieces back together drops every comma.
    comma_free_string = ''
    temp_string = string.split(',')
    for i in temp_string:
        comma_free_string+=i
    return comma_free_string
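The comma problem described above can be illustrated with a short sketch. It shows a compact equivalent of `remove_commas` and, as an alternative that is not what this script does, how Python's `csv` module quotes a field so that a name containing a comma still occupies a single cell. The sample name is made up.

```python
import csv
import io

def remove_commas(string):
    # Same result as splitting on ',' and concatenating the pieces.
    return string.replace(",", "")

# Made-up name containing a comma.
name = "Example Weakness, With Comma"
stripped = remove_commas(name)

# Alternative: let csv.writer quote the field instead of altering the name.
buffer = io.StringIO()
csv.writer(buffer).writerow(["22", name])

# Reading the line back yields exactly two cells, with the name intact.
rows = list(csv.reader(io.StringIO(buffer.getvalue())))
print(stripped)
print(rows)
```

Stripping commas is simpler when the CSV lines are assembled by hand with string formatting; quoting keeps the original names unchanged.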
3. CWE Views Tables Parser
CWE_entries is a list of three-element lists, each containing the CWE ID, the CWE name, and the ID of the parent taken from the ChildOf XML node.
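To make the shape of this data model concrete, here is a minimal sketch with made-up IDs and names (not real CWE entries). It also shows how the parent ID in the third position allows the entries to be nested recursively into a `label`/`weight`/`groups` structure. `build_groups` is a hypothetical stand-alone helper that takes the entries as a parameter, and it assumes the parent links contain no cycles.

```python
def build_groups(entries, parent_id):
    """Return the nested groups for every entry whose parent is parent_id."""
    parent_id = str(parent_id)
    groups = []
    for cwe_id, cwe_name, child_of in entries:
        if child_of == parent_id:
            groups.append({
                "label": f"{cwe_id} - {cwe_name}",
                "weight": 1,
                # Recurse: the children of this entry become its sub-groups.
                "groups": build_groups(entries, cwe_id),
            })
    return groups

# Made-up entries: [CWE ID, CWE name, parent ID].
entries = [
    ["10", "Top", "1000"],
    ["20", "Mid", "10"],
    ["30", "Leaf", "20"],
    ["30", "Leaf", "10"],  # an entry with a second parent is drawn twice
]

tree = {"groups": build_groups(entries, 1000)}
print(tree)
```

An entry that has two parents simply appears once under each of them, which is harmless for a nested tree view.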
In [10]:
"""There are currently 2 for loops which write to the data model.
The first one is used to write the weaknesses with parent as 1000 - Research Concepts
CWE_root[0][0] is used to obtain cwe:Views and then under it the first tag that contains Research Concepts and its Weakness Classes
After obtaining the cwe_id, we search for the name in the cwe:Weakness which is CWE_root[2]
"""
for views in CWE_root[0][0]:
    for members in views:
        if members.tag=="Relationship":
            cwe_id = members[3].text
            child_of = '1000'
            # Look up the name of this weakness in the Weaknesses table.
            for weakness in CWE_root[2]:
                if weakness.attrib['ID']==cwe_id:
                    cwe_name = remove_commas(weakness.attrib['Name'])
                    temp_list = list()
                    temp_list.extend((cwe_id,cwe_name,child_of))
                    CWE_entries.append(temp_list)
4. CWE Weakness Table Parser
In [2]:
"""This is used to write the Weaknesses onto the data model
CWE_root[2] is used to obtain the cwe:Weakness tag.
related_weakness_parameter[2] is cwe:Relationship_Nature
"""
#CWE_root[2] is Weakness table
for weakness in CWE_root[2]:
    if weakness.attrib['Status']!="Deprecated":
        for parameter in weakness:
            if parameter.tag=="Relationships":
                for related_weakness_parameter in parameter:
                    if related_weakness_parameter[2].text == "ChildOf":
                        cwe_id= weakness.attrib['ID']
                        cwe_name= remove_commas(weakness.attrib['Name'])
                        child_of= related_weakness_parameter[3].text
                        temp_list = list()
                        temp_list.extend((cwe_id,cwe_name,child_of))
                        CWE_entries.append(temp_list)
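The positional lookups above (`[2]` for the relationship nature, `[3]` for the target ID) depend on the order of the child elements. As a hedged alternative, the sketch below does the same filtering by element name on a tiny hand-written sample. The element names other than `Relationship_Nature` are assumptions about the XML layout and should be checked against the actual file; `child_of_entries` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Hand-written sample; element names other than Relationship_Nature are assumed.
SAMPLE = """
<Weaknesses>
  <Weakness ID="5" Name="Example child" Status="Draft">
    <Relationships>
      <Relationship>
        <Relationship_Views><Relationship_View_ID>1000</Relationship_View_ID></Relationship_Views>
        <Relationship_Target_Form>Weakness</Relationship_Target_Form>
        <Relationship_Nature>ChildOf</Relationship_Nature>
        <Relationship_Target_ID>4</Relationship_Target_ID>
      </Relationship>
      <Relationship>
        <Relationship_Views><Relationship_View_ID>1000</Relationship_View_ID></Relationship_Views>
        <Relationship_Target_Form>Weakness</Relationship_Target_Form>
        <Relationship_Nature>PeerOf</Relationship_Nature>
        <Relationship_Target_ID>7</Relationship_Target_ID>
      </Relationship>
    </Relationships>
  </Weakness>
  <Weakness ID="6" Name="Old entry" Status="Deprecated"/>
</Weaknesses>
"""

def child_of_entries(weaknesses):
    """Collect [id, name, parent id] for every non-deprecated ChildOf relationship."""
    entries = []
    for weakness in weaknesses:
        if weakness.get("Status") == "Deprecated":
            continue
        for relationship in weakness.iterfind("Relationships/Relationship"):
            if relationship.findtext("Relationship_Nature") == "ChildOf":
                entries.append([
                    weakness.get("ID"),
                    weakness.get("Name"),
                    relationship.findtext("Relationship_Target_ID"),
                ])
    return entries

entries = child_of_entries(ET.fromstring(SAMPLE))
print(entries)
```

Looking elements up by name keeps working if optional children are added or reordered, at the cost of depending on the exact tag names.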
5. CWE Compound Elements Table Parser
In [3]:
"""This is used to write the Compound Elements onto the data model
CWE_root[3] is used to obtain the cwe:Compound_Elements tag.
"""
for comp_element in CWE_root[3]:
    if comp_element.attrib['Status']!="Deprecated":
        for parameter in comp_element:
            if parameter.tag=="Relationships":
                for relationship_parameter in parameter:
                    if relationship_parameter[2].text == "ChildOf":
                        cwe_id= comp_element.attrib['ID']
                        cwe_name= remove_commas(comp_element.attrib['Name'])
                        child_of= relationship_parameter[3].text
                        temp_list = list()
                        temp_list.extend((cwe_id,cwe_name,child_of))
                        CWE_entries.append(temp_list)
1.1. Input Output Functions
In [1]:
""" The Foamtree Visualization """
def jsonify(number):
    # Recursively collect every entry whose parent is `number`.
    main_list = list()
    for row in CWE_entries:
        parent_id = str(number)
        if row[2]==parent_id:
            member_dict = dict()
            member_dict["label"] = str(row[0]) + ' - ' + str(row[1])
            member_dict["weight"] = 1
            member_dict["groups"] = jsonify(row[0])
            main_list.append(member_dict)
        else:
            continue
    return main_list
def gephi_export(number):
    fe = open(CWE_csv_edgelist, 'w')
    HEADe = "Source,Target,Type,id,label,timeset,weight\n"
    fe.write(HEADe)
    fn = open(CWE_csv_nodelist, 'w')
    HEADn = "Id,Label,timeset\n"
    fn.write(HEADn)
    # Maps each CWE ID to its numeric node id; the root view gets id 0.
    cweid_id_dict = dict()
    cweid_id_dict[str(number)]='0'
    node = '{o1},{o2},{o3}\n'.format(o1='0',o2=number,o3='')
    fn.write(node)
    id_counter = 1
    for row in CWE_entries:
        node = '{o1},{o2},{o3}\n'.format(o1=id_counter,o2=row[0],o3='')
        fn.write(node)
        cweid_id_dict[row[0]]=id_counter
        id_counter+=1
    type_value = "Directed"
    edge_id_counter = 0
    for row in CWE_entries:
        if row[2] in cweid_id_dict:
            source_value = int(cweid_id_dict[row[2]])
            target_value = int(cweid_id_dict[row[0]])
            edge = '{o1},{o2},{o3},{o4},{o5},{o6},{o7}\n'.format(o1=source_value,o2=target_value,o3=type_value,o4=edge_id_counter,o5='',o6='',o7=1)
            fe.write(edge)
            edge_id_counter+=1
    # Close both files so that the tables are flushed to disk.
    fn.close()
    fe.close()
def createJSON():
    # The with-block closes the file, so no explicit close() is needed.
    with open(CWE_json,'w') as jsonfile:
        json.dump(CWE_dict, jsonfile)
def createHTML():
    foamtreetemplate1 = '''<!DOCTYPE html>
<html>
<head>
<title>FoamTree Quick Start</title>
<meta charset="utf-8" />
</head>
<body>
<div id="visualization" style="width: 800px; height: 600px"></div>
<script src="'''+ javascript_path +'''"></script>
<script>
window.addEventListener("load", function() {
var foamtree = new CarrotSearchFoamTree({
id: "visualization",
dataObject:'''
    foamtreetemplate2 = '''});
});
</script>
</body>
</html>'''
    with open(CWE_html,'w') as outputfile:
        outputfile.write(foamtreetemplate1)
        # Embed the JSON written by createJSON() as the FoamTree dataObject.
        with open(CWE_json) as jsonfile:
            for line in jsonfile:
                outputfile.write(line)
        outputfile.write(foamtreetemplate2)
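The HTML page is essentially a fixed template with two holes: the path of the FoamTree script and the JSON data object. As a sketch of the same idea (an alternative formulation, not the code above), the page can also be rendered in memory with `str.format`, which avoids reading the JSON file back in. `render_page` and `PAGE_TEMPLATE` are hypothetical names.

```python
import json

# Literal braces of the JavaScript are doubled so that str.format keeps them.
PAGE_TEMPLATE = """<!DOCTYPE html>
<html>
  <head>
    <title>FoamTree Quick Start</title>
    <meta charset="utf-8" />
  </head>
  <body>
    <div id="visualization" style="width: 800px; height: 600px"></div>
    <script src="{script_src}"></script>
    <script>
      window.addEventListener("load", function() {{
        var foamtree = new CarrotSearchFoamTree({{
          id: "visualization",
          dataObject: {data}
        }});
      }});
    </script>
  </body>
</html>
"""

def render_page(script_src, data):
    """Return the page as a string with the data embedded as JSON."""
    return PAGE_TEMPLATE.format(script_src=script_src, data=json.dumps(data))

html = render_page("carrotsearch.foamtree.js", {"groups": []})
print(html)
```

Writing the returned string to a file then produces the same kind of page without an intermediate `.json` file.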
6. Main
In [5]:
CWE_dict = dict()
CWE_dict["groups"]=jsonify(1000)
createJSON()
createHTML()
if args.gephi:
    gephi_export(1000)
# The JSON file is only an intermediate step unless --json was requested.
if not args.json:
    os.remove(CWE_json)
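The `--gephi` branch writes one node row per entry, so a CWE that has several parents ends up with several node rows. One possible way around this, sketched below under that assumption, is to assign each distinct CWE ID a single numeric id and emit one edge per parent relationship. `gephi_tables` is a hypothetical helper, the entries are made up, and whether the resulting tables import cleanly into Gephi has not been verified here.

```python
import csv
import io

def gephi_tables(entries, root_id):
    """Return (nodes_csv, edges_csv) with exactly one node row per distinct CWE ID."""
    node_ids = {str(root_id): 0}
    for cwe_id, _name, _parent in entries:
        # setdefault leaves the id of an already-seen CWE ID unchanged.
        node_ids.setdefault(cwe_id, len(node_ids))

    nodes = io.StringIO()
    node_writer = csv.writer(nodes, lineterminator="\n")
    node_writer.writerow(["Id", "Label", "timeset"])
    for cwe_id, numeric_id in node_ids.items():
        node_writer.writerow([numeric_id, cwe_id, ""])

    edges = io.StringIO()
    edge_writer = csv.writer(edges, lineterminator="\n")
    edge_writer.writerow(["Source", "Target", "Type", "id", "label", "timeset", "weight"])
    edge_id = 0
    for cwe_id, _name, parent in entries:
        if parent in node_ids:
            edge_writer.writerow([node_ids[parent], node_ids[cwe_id], "Directed", edge_id, "", "", 1])
            edge_id += 1
    return nodes.getvalue(), edges.getvalue()

# Made-up entries: [CWE ID, CWE name, parent ID]; "30" has two parents.
entries = [["10", "Top", "1000"], ["20", "Mid", "10"], ["30", "Leaf", "20"], ["30", "Leaf", "10"]]
nodes_csv, edges_csv = gephi_tables(entries, 1000)
print(nodes_csv)
print(edges_csv)
```

In this sketch the node with two parents appears once in the node table and twice as an edge target.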
In [3]:
from IPython.display import Image
Image(filename='cwe2.12_snapshot1.PNG')
Out[3]:
In [2]:
from IPython.display import Image
Image(filename='cwe1.0_snapshot1.PNG')
Out[2]:
The images below show samples of the visualization created by the code for the CWE versions mentioned above:
In [9]:
from IPython.display import Image
Image(filename='cwewebsite_reference.PNG')
Out[9]:
In [5]:
from IPython.display import Image
Image(filename='snapshot2_cwe 693.PNG')
Out[5]:
In [6]:
from IPython.display import Image
Image(filename='snapshot3_cwe 284.PNG')
Out[6]:
In [7]:
from IPython.display import Image
Image(filename='snapshot4_cwe 286.PNG')
Out[7]: