Strategic Redesign of the web application "PetClinic"

Prioritizing improvements along the real software usage

Import of usage data


In [1]:
import pandas as pd

coverage = pd.read_csv("../dataset/jacoco.csv")
coverage = coverage[
                ['PACKAGE',
                 'CLASS',
                 'LINE_COVERED',
                 'LINE_MISSED']]
coverage.head()


Out[1]:
PACKAGE CLASS LINE_COVERED LINE_MISSED
0 org.springframework.samples.petclinic PetclinicInitializer 24 0
1 org.springframework.samples.petclinic.model NamedEntity 4 1
2 org.springframework.samples.petclinic.model Specialty 1 0
3 org.springframework.samples.petclinic.model PetType 1 0
4 org.springframework.samples.petclinic.model Vets 0 4

Calculation of additional measures and keys


In [2]:
coverage['lines'] = coverage.LINE_COVERED + coverage.LINE_MISSED
coverage['ratio'] = coverage.LINE_COVERED / coverage.lines
coverage['fqn'] = coverage.PACKAGE + "." + coverage.CLASS
coverage[['fqn', 'ratio']].head()


Out[2]:
fqn ratio
0 org.springframework.samples.petclinic.Petclini... 1.0
1 org.springframework.samples.petclinic.model.Na... 0.8
2 org.springframework.samples.petclinic.model.Sp... 1.0
3 org.springframework.samples.petclinic.model.Pe... 1.0
4 org.springframework.samples.petclinic.model.Vets 0.0

Loading of the data into the graph database


In [3]:
import py2neo
graph = py2neo.Graph()

query="""
    UNWIND {coverage_data} as coverage
    MATCH (t:Type {fqn : coverage.fqn})
    MERGE (t)-[:HAS_MEASURE]->(m)
    SET 
        m:Measure:Coverage, 
        m.ratio = coverage.ratio
    RETURN t.fqn as fqn, m.ratio as ratio
"""
coverage_dict = coverage.to_dict(orient='records')
result = graph.run(query, coverage_data=coverage_dict).data()
pd.DataFrame(result).head()


Out[3]:
fqn ratio
0 org.springframework.samples.petclinic.Petclini... 1.0
1 org.springframework.samples.petclinic.model.Na... 0.8
2 org.springframework.samples.petclinic.model.Sp... 1.0
3 org.springframework.samples.petclinic.model.Pe... 1.0
4 org.springframework.samples.petclinic.model.Vets 0.0

Aggregation of measures along business subdomains


In [4]:
query = """
MATCH 
  (t:Type)-[:BELONGS_TO]->(s:Subdomain),
  (t)-[:HAS_CHANGE]->(ch:Change),
  (t)-[:HAS_MEASURE]->(co:Coverage)
OPTIONAL MATCH
  (t)-[:HAS_BUG]->(b:BugInstance)  
RETURN 
  s.name as ASubdomain,
  COUNT(DISTINCT t) as Types,
  COUNT(DISTINCT ch) as Changes,
  AVG(co.ratio) as Coverage,
  COUNT(DISTINCT b) as Bugs,
  SUM(DISTINCT t.lastMethodLineNumber) as Lines
ORDER BY Coverage ASC, Bugs DESC
"""

Results by business subdomains


In [5]:
re  

sult = pd.DataFrame(graph.data(query))
result


Out[5]:
ASubdomain Bugs Changes Coverage Lines Types
0 Vet 0 75 0.170000 313 5
1 Visit 0 90 0.368056 472 6
2 Pet 1 169 0.490187 746 11
3 Owner 3 96 0.510062 531 4
4 crossfunctional 2 57 0.569953 268 5
5 Clinic 0 26 0.888889 110 1
6 Person 0 5 1.000000 53 1
7 Specialty 0 5 1.000000 28 1

Renaming to business terminology


In [6]:
plot_data = result.copy()
plot_data = plot_data.rename(
    columns= {
        "Changes" : "Investment",
        "Coverage" : "Utilization",
        "Lines" : "Size"})
plot_data = plot_data.set_index('ASubdomain')
plot_data


Out[6]:
Bugs Investment Utilization Size Types
ASubdomain
Vet 0 75 0.170000 313 5
Visit 0 90 0.368056 472 6
Pet 1 169 0.490187 746 11
Owner 3 96 0.510062 531 4
crossfunctional 2 57 0.569953 268 5
Clinic 0 26 0.888889 110 1
Person 0 5 1.000000 53 1
Specialty 0 5 1.000000 28 1

SWOT analysis for prioritizing improvements along subdomains


In [7]:
%matplotlib inline
from ausi import portfolio
portfolio.plot_diagram(plot_data, 'Investment', 'Utilization', 'Size');


Aggregation of measures along technical aspects


In [8]:
query = """
MATCH 
  (t:Type)-[:IS_A]->(ta:TechnicalAspect),
  (t)-[:HAS_CHANGE]->(ch:Change),
  (t)-[:HAS_MEASURE]->(co:Coverage)
OPTIONAL MATCH
  (t)-[:HAS_BUG]->(b:BugInstance)   
RETURN 
  ta.name as ATechnicalAspect,
  COUNT(DISTINCT t) as Types,
  COUNT(DISTINCT ch) as Investment,
  AVG(co.ratio) as Utilization,
  COUNT(DISTINCT b) as Bugs,
  SUM(DISTINCT t.lastMethodLineNumber) as Size
ORDER BY Utilization ASC, Bugs DESC
"""

Results by technical aspects


In [9]:
result = pd.DataFrame(graph.data(query)).set_index('ATechnicalAspect')
result


Out[9]:
Bugs Investment Size Types Utilization
ATechnicalAspect
jdbc 1 153 644 8 0.000000
util 2 27 144 2 0.369919
web 2 157 576 7 0.618831
jpa 0 54 188 4 0.702501
model 1 93 696 10 0.733257
service 0 26 110 1 0.888889
petclinic 0 6 110 1 1.000000

SWOT analysis for prioritizing improvements along technical aspects


In [10]:
portfolio.plot_diagram(result, 'Investment', 'Utilization', 'Size');


End of demo