In Carola Lilienthal's talk about architecture and technical debt at Herbstcampus 2017, I was reminded that I wanted to implement some of the examples from her book "Long-lived Software Systems" (available only in German) with jQAssistant. Especially the visualization of dependencies between different business domains seems like a great starting point to try out some stuff:
The green connections between the modules show the downward dependencies to other modules and the red ones the upward dependencies. This visualization can help you if you want to further modularize your system along your business subdomains or to identify unwanted dependencies between modules.
At the same time, I started the Java Type Dependency Analysis and realized that from there it's only a small step to analyzing dependencies between business domains. What's missing is the information which type belongs to which business domain. We'll find out now!
Once, I developed a party planning application called DropOver (it didn't go live, but that's another story). We wrote that web application in Java and paid special attention to structuring the code along the business' subdomains of "partying". This led to a package structure that resembles the main parts of the application:
The application's main entry point is a site for a party including location, time, the site's creator and so on. A user can comment on a site as well as add some specific widgets like todo lists, scheduling or files upload, and also gets notified by the mail feature. And there is a special package framework where all the cross-cutting concerns are placed, like the dependency injection configuration or common, technical software elements.
The main point to take away here is that thanks to the alignment of the package structure along the business' subdomains, it's easy to determine the business domain for a software entity. It's the 3rd position in the Java package name:

at.dropover.<subdomain>.
This naming convention can easily be used to retrieve the subdomain for each type.
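Expressed as a minimal Python sketch (the type name `at.dropover.comment.CommentService` is a hypothetical example, used only for illustration):

```python
# Derive the subdomain from a fully qualified Java type name:
# the subdomain sits at the 3rd position (index 2) of the package name.
def subdomain_of(fqn):
    return fqn.split(".")[2]

# hypothetical type name used for illustration
print(subdomain_of("at.dropover.comment.CommentService"))
# comment
```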
I've built the web application, scanned the software artifact (a standard JAR file that we export for integration testing purposes) with the jQAssistant command line tool (with jqassistant.sh scan -f dropover-classes.jar in this case) and started the server (with jqassistant.sh server). Taking a look at the accompanying Neo4j Browser, we can see the graph that jQAssistant stored in Neo4j. E.g. we can display the relationship between the JAR file and the contained Java types:
In the following, I set up the connection between my Python glue code and the Neo4j database. The executed query simply lists all Java types of the application (respectively the JAR artifact). As mentioned above, we can also derive the subdomain from the package name:
In [153]:
import py2neo
import pandas as pd
query="""
MATCH
(:Jar:Archive)-[:CONTAINS]->(type:Type)
RETURN
type.fqn AS type, SPLIT(type.fqn, ".")[2] AS subdomain
"""
graph = py2neo.Graph()
subdomaininfo = pd.DataFrame(graph.run(query).data())
subdomaininfo.head()
Out[153]:
The query returns the corresponding subdomain for each type. Combined with the approach in Java Type Dependency Analysis, we can now visualize the dependencies between the various subdomains:
In [161]:
import json
query="""
MATCH
(:Jar:Archive)-[:CONTAINS]->
(type:Type)-[:DEPENDS_ON]->(directDependency:Type)
<-[:CONTAINS]-(:Jar:Archive)
RETURN
SPLIT(type.fqn, ".")[2] AS name,
COLLECT(DISTINCT SPLIT(directDependency.fqn, ".")[2]) AS imports
"""
graph = py2neo.Graph()
json_data = graph.run(query).data()
with open("vis/flare-imports.json", mode='w') as json_file:
    json_file.write(json.dumps(json_data, indent=3))
json_data[:2]
Out[161]:
In the output, we see the dependencies between the various subdomains.
I've altered the visualization just a little bit so that we can see bidirectional dependencies as well. Those are green and red at the same time and appear more dominant than unidirectional dependencies.
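As a quick cross-check, such bidirectional dependencies can also be computed directly in Python from the name/imports structure that the query above exports. This is just a sketch with made-up sample data in place of the real query result:

```python
# json_data is a list of {"name": ..., "imports": [...]} dictionaries,
# as returned by the Cypher query above (sample data shown here).
json_data = [
    {"name": "site", "imports": ["creator", "framework"]},
    {"name": "creator", "imports": ["framework"]},
    {"name": "framework", "imports": ["creator"]},
]

# build a set of directed edges between subdomains
edges = {(row["name"], imp) for row in json_data for imp in row["imports"]}

# a dependency is bidirectional if the reversed edge also exists;
# sorting each pair deduplicates the two directions
bidirectional = {tuple(sorted(edge)) for edge in edges if edge[::-1] in edges}
print(bidirectional)
# {('creator', 'framework')}
```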
From the visualization above, we can see that the creator subdomain is used by Java source code from the subdomains comment, site, scheduling, mail and framework. The first four make perfect sense: if you create one of those content types in the application, they are created by some person (they are "personalized" content). Whereas todo and files are user-agnostic content types and thus don't have any dependencies on creator (a tricky situation in retrospect). What could look like a mess are the dependencies from and to framework. The pseudo subdomain framework contains some base classes for all the data objects that get persisted in a data store. That explains the outbound dependency of creator. The inbound dependencies from framework to creator are needed for the central dependency injection configuration of the application.
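The dependents of creator listed above can be double-checked in Python against the exported name/imports data. Again a sketch with made-up sample entries, not the full query result:

```python
# Which subdomains depend on "creator"? Filter the exported data
# (sample entries shown; the real data comes from the query above).
json_data = [
    {"name": "comment", "imports": ["creator", "framework"]},
    {"name": "todo", "imports": ["framework"]},
    {"name": "framework", "imports": ["creator", "site"]},
]

dependents = sorted(row["name"] for row in json_data
                    if "creator" in row["imports"])
print(dependents)
# ['comment', 'framework']
```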
Where it gets interesting is the following visualization of the dependencies of the subdomain site:
In [152]:
query="""
MATCH
(type:Type)
WHERE
type.fqn STARTS WITH "at.dropover"
WITH DISTINCT type
MATCH
(d1:Domain:Business)<-[:BELONGS_TO]-(type:Type),
(type)-[:DEPENDS_ON*0..1]->(directDependency:Type),
(directDependency)-[:BELONGS_TO]->(d2:Business:Domain)
RETURN d1.name as name, COLLECT(DISTINCT d2.name) as imports
"""
json_data = graph.run(query).data()
import json
with open("vis/flare-imports.json", mode='w') as json_file:
    json_file.write(json.dumps(json_data, indent=3))
json_data[:2]
Out[152]:
In [111]:
query="""
MATCH
(type:Type)
WHERE
type.fqn STARTS WITH "at.dropover"
WITH DISTINCT type
MATCH
(d1:Domain:Business)<-[:BELONGS_TO]-(type:Type),
(type)-[r:DEPENDS_ON*0..1]->(directDependency:Type),
(directDependency)-[:BELONGS_TO]->(d2:Business:Domain)
RETURN d1.name as name, d2.name, COUNT(r) as number
"""
json_data = graph.run(query).data()
df = pd.DataFrame(json_data)
data = df.to_dict(orient='split')['data']
with open("vis/chord_data.json", mode='w') as json_file:
    json_file.write(json.dumps(data, indent=3))
data[:5]
Out[111]:
Even if there aren't any package naming conventions, you can identify some structure, for example in class names or in your inheritance hierarchy, that points you towards your subdomains in the code. (If that isn't possible either: I wrote my Master's thesis about mining cohesive concepts from source code via text mining, so you could use that as well :-D. And as a last resort, you have to do the mapping manually...)
Let's see how this could work by mapping business subdomains to the class names of the Spring PetClinic project.
We also have a list of all types in our application:
In [37]:
import py2neo
import pandas as pd
query="""
MATCH
(:Project)-[:CONTAINS]->(artifact:Artifact)-[:CONTAINS]->(type:Type)
RETURN type.fqn as fqn, type.name as name
"""
graph = py2neo.Graph()
subdomaininfo = pd.DataFrame(graph.run(query).data())
subdomaininfo.head()
Out[37]:
First, let's assume that we have some subdomains of our business domain we know about:
In [38]:
subdomains = ['Owner', 'Pet', 'Visit', 'Vet', 'Specialty', 'Clinic']
In [52]:
def determine_subdomain(name):
    for feature in subdomains:
        if feature in name:
            return feature
    return "Framework"
In [53]:
subdomaininfo['subdomain'] = subdomaininfo['name'].apply(determine_subdomain)
subdomaininfo.head()
Out[53]:
In [59]:
query="""
UNWIND {subdomaininfo} as info
MERGE (subdomain:Domain:Business { name: info.subdomain })
WITH info, subdomain
MATCH (n:Type { fqn: info.fqn})
MERGE (n)-[:BELONGS_TO]->(subdomain)
RETURN n.fqn as type_fqn, subdomain.name as subdomain
"""
result = graph.run(query, subdomaininfo=subdomaininfo.to_dict(orient='records')).data()
pd.DataFrame(result).head()
Out[59]:
In [98]:
query="""
MATCH
(:Project)-[:CONTAINS]->(artifact:Artifact)-[:CONTAINS]->(type:Type)
WHERE
// we don't want to analyze test artifacts
NOT artifact.type = "test-jar"
WITH DISTINCT type, artifact
MATCH
(d1:Domain:Business)<-[:BELONGS_TO]-(type:Type),
(type)-[r:DEPENDS_ON*0..1]->(directDependency:Type),
(directDependency)-[:BELONGS_TO]->(d2:Business:Domain),
(directDependency)<-[:CONTAINS]-(artifact)
RETURN d1.name as name, d2.name, COUNT(r) as number
"""
json_data = graph.run(query).data()
df = pd.DataFrame(json_data)
df.to_dict(orient='split')
Out[98]:
Like in the simple example, the graph now looks like this:
In [14]:
import pandas as pd
pd.DataFrame(json_data).head()
Out[14]:
In [81]:
query="""
MATCH
(:Project)-[:CONTAINS]->(artifact:Artifact)-[:CONTAINS]->(type:Type)
WHERE
// we don't want to analyze test artifacts
NOT artifact.type = "test-jar"
WITH DISTINCT type, artifact
MATCH
(d1:Domain:Business)<-[:BELONGS_TO]-(type:Type),
(type)-[:DEPENDS_ON*0..1]->(directDependency:Type),
(directDependency)-[:BELONGS_TO]->(d2:Business:Domain),
(directDependency)<-[:CONTAINS]-(artifact)
RETURN d1.name as name, COLLECT(DISTINCT d2.name) as imports
"""
json_data = graph.run(query).data()
import json
with open("vis/flare-imports.json", mode='w') as json_file:
    json_file.write(json.dumps(json_data, indent=3))
json_data
Out[81]:
In [113]:
query="""
MATCH
(:Project)-[:CONTAINS]->(artifact:Artifact)-[:CONTAINS]->(type:Type)
WHERE
// we don't want to analyze test artifacts
NOT artifact.type = "test-jar"
WITH DISTINCT type, artifact
MATCH
(d1:Domain:Business)<-[:BELONGS_TO]-(type:Type),
(type)-[r:DEPENDS_ON*0..1]->(directDependency:Type),
(directDependency)-[:BELONGS_TO]->(d2:Business:Domain),
(directDependency)<-[:CONTAINS]-(artifact)
RETURN d1.name as name, d2.name, COUNT(r) as number
"""
json_data = graph.run(query).data()
df = pd.DataFrame(json_data)
data = df.to_dict(orient='split')['data']
with open("vis/chord_data.json", mode='w') as json_file:
    json_file.write(json.dumps(data, indent=3))
data[:5]
Out[113]:
In [2]:
query="""
MATCH
(t1:Type)-[:BELONGS_TO]->(s1:Subdomain),
(t2:Type)-[:BELONGS_TO]->(s2:Subdomain),
(t1)-[:DEPENDS_ON]->(t2)
WHERE s1.name <> s2.name
MERGE (s1)-[:DEPENDS_ON]->(s2)
RETURN s1.name, s2.name
"""
pd.DataFrame(graph.run(query).data()).head()
Additionally, we get the dependencies between the various business subdomains, which can also be visualized with D3 as described in Analyze Dependencies between Business Subdomains.