Analysis of the "PetClinic" Web Application

Prioritizing Refactoring Work by Degree of Utilization

Reading the Usage Statistics


In [1]:
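import pandas as pd

# Load the JaCoCo line-coverage report (CSV export).
coverage = pd.read_csv("../../notebooks/datasets/jacoco_production_coverage_spring_petclinic.csv")
# Keep only the columns needed to compute the line-coverage ratio per class.
coverage = coverage[['PACKAGE', 'CLASS', 'LINE_COVERED', 'LINE_MISSED']]
coverage.head()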


Out[1]:
   PACKAGE                                      CLASS                 LINE_COVERED  LINE_MISSED
0  org.springframework.samples.petclinic        PetclinicInitializer            24            0
1  org.springframework.samples.petclinic.model  NamedEntity                      4            1
2  org.springframework.samples.petclinic.model  Specialty                        1            0
3  org.springframework.samples.petclinic.model  PetType                          1            0
4  org.springframework.samples.petclinic.model  Vets                             0            4

Computing Additional Measures and Keys


In [2]:
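# Total number of lines per class.
coverage['lines'] = coverage['LINE_COVERED'] + coverage['LINE_MISSED']
# Share of lines that were executed (0.0 = never used, 1.0 = fully used).
coverage['ratio'] = coverage['LINE_COVERED'] / coverage['lines']
# Fully qualified class name, used as the key for matching nodes in the graph.
coverage['fqn'] = coverage['PACKAGE'] + "." + coverage['CLASS']
coverage[['fqn', 'ratio']].head()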


Out[2]:
   fqn                                                ratio
0  org.springframework.samples.petclinic.Petclini...    1.0
1  org.springframework.samples.petclinic.model.Na...    0.8
2  org.springframework.samples.petclinic.model.Sp...    1.0
3  org.springframework.samples.petclinic.model.Pe...    1.0
4  org.springframework.samples.petclinic.model.Vets     0.0
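
The ratio is undefined for a class that has no recorded lines at all (0 covered, 0 missed); the pandas division yields NaN in that case. The following is a minimal sketch of the same calculation in plain Python. The helper name `coverage_ratio` and the choice to return `None` for empty classes are assumptions made for illustration and are not part of the notebook.

```python
def coverage_ratio(covered, missed):
    """Return the share of executed lines, or None if the class has no lines."""
    lines = covered + missed
    if lines == 0:
        # Assumption: a class without any lines is treated as "no data".
        return None
    return covered / lines


# (class name, LINE_COVERED, LINE_MISSED), taken from the first rows shown above.
rows = [
    ("PetclinicInitializer", 24, 0),
    ("NamedEntity", 4, 1),
    ("Vets", 0, 4),
]

ratios = {name: coverage_ratio(covered, missed) for name, covered, missed in rows}
for name, ratio in ratios.items():
    print(name, ratio)
```

Whether empty classes should be dropped or counted as unused is a judgment call that depends on how the measure is aggregated later.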

Loading the Data into the Graph Database


In [3]:
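import py2neo

# Connect to the Neo4j instance using py2neo's default connection settings.
graph = py2neo.Graph()

# Attach a coverage measure to every Type node whose fqn matches a coverage row.
# Query parameters are referenced with the "$name" syntax.
query = """
    UNWIND $coverage_data AS coverage
    MATCH (t:Type {fqn: coverage.fqn})
    MERGE (t)-[:HAS_MEASURE]->(m)
    SET
        m:Measure:Coverage,
        m.ratio = coverage.ratio
    RETURN t.fqn AS fqn, m.ratio AS ratio
"""
# One dict per class, passed to the query as a list parameter.
coverage_dict = coverage.to_dict(orient='records')
result = graph.run(query, coverage_data=coverage_dict).data()
pd.DataFrame(result).head()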


Out[3]:
   fqn                                                ratio
0  org.springframework.samples.petclinic.Petclini...    1.0
1  org.springframework.samples.petclinic.model.Na...    0.8
2  org.springframework.samples.petclinic.model.Sp...    1.0
3  org.springframework.samples.petclinic.model.Vets     0.0
4  org.springframework.samples.petclinic.model.Visit    1.0
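
The `MATCH` clause only returns rows for classes that already exist as `Type` nodes in the graph, so coverage rows without a matching node are dropped silently. In the output above, `PetType` no longer appears among the first rows, which may be such a case. The sketch below lists the dropped names. It assumes both inputs are lists of dicts with an `fqn` key, as produced by `to_dict(orient='records')` and `.data()`. The helper `unmatched_fqns` is hypothetical, and the sample records are illustrative rather than the notebook's full data.

```python
def unmatched_fqns(coverage_records, result_records):
    """Return the fqns present in the coverage data but absent from the query result."""
    matched = {record["fqn"] for record in result_records}
    candidates = {record["fqn"] for record in coverage_records}
    return sorted(candidates - matched)


# Illustrative sample records (assumption: not the complete data set).
coverage_records = [
    {"fqn": "org.springframework.samples.petclinic.model.PetType", "ratio": 1.0},
    {"fqn": "org.springframework.samples.petclinic.model.Vets", "ratio": 0.0},
]
result_records = [
    {"fqn": "org.springframework.samples.petclinic.model.Vets", "ratio": 0.0},
]

missing = unmatched_fqns(coverage_records, result_records)
print(missing)
```

In the notebook itself, the call would presumably be `unmatched_fqns(coverage_dict, result)`.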

Aggregating the Measures by Subdomain


In [4]:
query = """
MATCH 
  (t:Type)-[:BELONGS_TO]->(s:Subdomain),
  (t)-[:HAS_CHANGE]->(ch:Change),
  (t)-[:HAS_MEASURE]->(co:Coverage)
OPTIONAL MATCH
  (t)-[:HAS_BUG]->(b:BugInstance)
RETURN 
  s.name as ASubdomain,
  COUNT(DISTINCT t) as Types,
  COUNT(DISTINCT ch) as Changes,
  AVG(co.ratio) as Coverage,
  COUNT(DISTINCT b) as Bugs,
  SUM(DISTINCT t.lastMethodLineNumber) as Lines
ORDER BY Coverage ASC, Bugs DESC
"""

Results by Subdomain


In [5]:
result = pd.DataFrame(graph.run(query).data())
result


Out[5]:
   ASubdomain       Bugs  Changes  Coverage  Lines  Types
0  Vet                 0       80  0.170417    324      5
1  Visit               0       96  0.369141    484      6
2  Pet                 1      172  0.462776    736     10
3  crossfunctional     4       62  0.518803    309      7
4  Owner               4      103  0.537558    539      4
5  Clinic              0       19  0.888889    115      1
6  Person              0        5  1.000000     53      1
7  Specialty           0        7  1.000000     32      1

Renaming to More Familiar Terms


In [6]:
plot_data = result.copy().set_index('ASubdomain')
plot_data = plot_data.rename(
    columns= {
        "Changes" : "Investment",
        "Coverage" : "Utilization",
        "Lines" : "Size"})
plot_data


Out[6]:
                 Bugs  Investment  Utilization  Size  Types
ASubdomain
Vet                 0          80     0.170417   324      5
Visit               0          96     0.369141   484      6
Pet                 1         172     0.462776   736     10
crossfunctional     4          62     0.518803   309      7
Owner               4         103     0.537558   539      4
Clinic              0          19     0.888889   115      1
Person              0           5     1.000000    53      1
Specialty           0           7     1.000000    32      1

In [7]:
%matplotlib inline
from ausi.portfolio import plot_diagram

Four-Quadrant Matrix for Prioritization by Subdomain


In [8]:
plot_diagram(plot_data, 'Investment', 'Utilization', 'Size');
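
A four-quadrant matrix splits the items along two axes. The sketch below assigns the subdomains from the table above to quadrants by Investment and Utilization. The thresholds (midpoint of the observed Investment range, 0.5 for Utilization) and the quadrant labels are assumptions made for illustration; how `plot_diagram` draws its boundaries is not shown in this notebook.

```python
def quadrant(investment, utilization, investment_split, utilization_split=0.5):
    """Label an item by which side of each threshold it falls on."""
    inv = "high" if investment >= investment_split else "low"
    util = "high" if utilization >= utilization_split else "low"
    return f"{inv} investment, {util} utilization"


# (Investment, Utilization) per subdomain, copied from the renamed table above.
data = {
    "Vet": (80, 0.170417),
    "Visit": (96, 0.369141),
    "Pet": (172, 0.462776),
    "crossfunctional": (62, 0.518803),
    "Owner": (103, 0.537558),
    "Clinic": (19, 0.888889),
    "Person": (5, 1.000000),
    "Specialty": (7, 1.000000),
}

# Assumption: split the Investment axis at the midpoint of its observed range.
investments = [investment for investment, _ in data.values()]
investment_split = (min(investments) + max(investments)) / 2

quadrants = {
    name: quadrant(investment, utilization, investment_split)
    for name, (investment, utilization) in data.items()
}
for name, label in quadrants.items():
    print(f"{name}: {label}")
```

Under these assumed thresholds, items with high investment and low utilization are the ones where past effort appears to have gone into code that is rarely used. A different split, such as the median, may place borderline items like Vet or Visit in another quadrant.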


Aggregating the Measures by Technical Aspect


In [9]:
query = """
MATCH 
  (t:Type)-[:IS_A]->(ta:TechnicalAspect),
  (t)-[:HAS_CHANGE]->(ch:Change),
  (t)-[:HAS_MEASURE]->(co:Coverage)
OPTIONAL MATCH
  (t)-[:HAS_BUG]->(b:BugInstance)   
RETURN 
  ta.name as ATechnicalAspect,
  COUNT(DISTINCT t) as Types,
  COUNT(DISTINCT ch) as Investment,
  AVG(co.ratio) as Utilization,
  COUNT(DISTINCT b) as Bugs,
  SUM(DISTINCT t.lastMethodLineNumber) as Size
ORDER BY Utilization ASC, Bugs DESC
"""

Results by Technical Aspect


In [10]:
result = pd.DataFrame(graph.run(query).data()).set_index('ATechnicalAspect')
result


Out[10]:
                  Bugs  Investment  Size  Types  Utilization
ATechnicalAspect
jdbc                 1         164   661      8     0.000000
util                 4          32   185      4     0.315972
web                  3         162   588      7     0.624752
jpa                  0          59   194      4     0.704380
model                1          94   684      9     0.705218
service              0          19   115      1     0.888889
petclinic            0           6   110      1     1.000000

Four-Quadrant Matrix for Prioritization by Technical Aspect


In [11]:
plot_diagram(result, 'Investment', 'Utilization', 'Size');


End of Demo