In this blog post, I want to show you a graph-based way to split up a class into several independent ones. We take a small example class from Michael Feathers' book "Working Effectively with Legacy Code" and use a community detection procedure from Neo4j's Awesome Procedures On Cypher (APOC).
Hint: To run the notebook version of this blog post, you need to install the ipython-cypher extension.
class Reservation {
    private int duration;
    private int dailyRate;
    private Date date;
    private Customer customer;
    private List fees = new ArrayList();

    public Reservation(Customer customer, int duration, int dailyRate, Date date) {
        this.customer = customer;
        this.duration = duration;
        this.dailyRate = dailyRate;
        this.date = date;
    }

    public void extend(int additionalDays) {
        duration += additionalDays;
    }

    public void extendForWeek() {
        int weekRemainder = RentalCalendar.weekRemainderFor(date);
        final int DAYS_PER_WEEK = 7;
        extend(weekRemainder);
        dailyRate = RateCalculator.computeWeekly(
            customer.getRateCode()) / DAYS_PER_WEEK;
    }

    public void addFee(FeeRider rider) {
        fees.add(rider);
    }

    int getAdditionalFees() {
        int total = 0;
        for (Iterator it = fees.iterator(); it.hasNext(); ) {
            total += ((FeeRider) (it.next())).getAmount();
        }
        return total;
    }

    int getPrincipalFee() {
        return dailyRate * RateCalculator.rateBase(customer) * duration;
    }

    public int getTotalFee() {
        return getPrincipalFee() + getAdditionalFees();
    }
}
In [1]:
%load_ext cypher
In [2]:
%%cypher
MATCH
()-[u:USES]->(),
(n:NewClass)-[s:SHOULD_DECLARE]->()
DELETE u,s,n
Out[2]:
We want to look at the usage dependencies between methods and fields. The predefined schema distinguishes between read and write access to a field by a method. Because we only care whether a method uses a field of the class at all, we add a new relationship type USES that covers both kinds of access.
In [3]:
%%cypher
MATCH
(c:Class {name : "Reservation"}),
(c)-[:DECLARES]->(m:Method),
(c)-[:DECLARES]->(f:Field),
(m)-[:READS|WRITES]->(f)
WHERE NOT (m:Constructor)
MERGE (m)-[u:USES]->(f)
RETURN m.name as method, type(u) as relType, f.name as field
Out[3]:
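To make the effect of this query concrete, here is a minimal Python sketch that collapses read and write accesses into one set of USES pairs. The access list is derived by hand from the Reservation listing above, not read from the graph database. The names `accesses` and `collapse_to_uses` and the constructor placeholder `<init>` are assumptions made for illustration.

```python
# Field accesses per method, derived by hand from the Reservation listing
# (an assumption: the real data lives in the graph, not in this list).
accesses = [
    ("extend", "READS", "duration"),
    ("extend", "WRITES", "duration"),
    ("extendForWeek", "READS", "date"),
    ("extendForWeek", "READS", "customer"),
    ("extendForWeek", "WRITES", "dailyRate"),
    ("addFee", "READS", "fees"),
    ("getAdditionalFees", "READS", "fees"),
    ("getPrincipalFee", "READS", "dailyRate"),
    ("getPrincipalFee", "READS", "customer"),
    ("getPrincipalFee", "READS", "duration"),
    ("<init>", "WRITES", "customer"),  # constructor placeholder, filtered out below
]


def collapse_to_uses(accesses, excluded=("<init>",)):
    """Merge READS and WRITES into one USES pair per (method, field)."""
    return {(method, field) for method, _, field in accesses
            if method not in excluded}


uses = collapse_to_uses(accesses)
print(sorted(uses))
```

Using a set mirrors what MERGE does in the query: a method that both reads and writes a field still ends up with a single USES relationship to it.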
We do the same for the dependencies between methods. Here, we can derive the USES relationship directly from the existing INVOKES relationship type.
In [4]:
%%cypher
MATCH
(c:Class {name : "Reservation"}),
(c)-[:DECLARES]->(m:Method),
(c)-[:DECLARES]->(m2:Method),
(m)-[:INVOKES]->(m2:Method)
WHERE NOT (m:Constructor)
MERGE (m)-[u:USES]->(m2)
RETURN m.name as caller, type(u) as relType, m2.name as callee
Out[4]:
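The two DECLARES patterns in this query require that both the caller and the callee belong to Reservation, so calls into other classes drop out. A small Python sketch of that filter follows; the invocation list is read off the listing by hand and the "Class.method" spelling for external targets is purely illustrative.

```python
# Methods declared by Reservation (the constructor is left out on purpose).
declared = {"extend", "extendForWeek", "addFee",
            "getAdditionalFees", "getPrincipalFee", "getTotalFee"}

# Some invocations taken by hand from the listing (an assumption, not
# database output). Targets outside Reservation are written "Class.method".
invokes = [
    ("extendForWeek", "RentalCalendar.weekRemainderFor"),
    ("extendForWeek", "extend"),
    ("extendForWeek", "RateCalculator.computeWeekly"),
    ("getPrincipalFee", "RateCalculator.rateBase"),
    ("getTotalFee", "getPrincipalFee"),
    ("getTotalFee", "getAdditionalFees"),
]

# Keep a call only if caller and callee are both declared by the class.
method_uses = [(caller, callee) for caller, callee in invokes
               if caller in declared and callee in declared]
print(method_uses)
```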
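Next, we calculate a weight for each method from its USES relationships. The first query counts the outgoing USES relationships of every method. The second query counts the incoming USES relationships of every called method and overwrites that method's weight with the result.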
In [5]:
%%cypher
MATCH (m:Method)-[u:USES]->()
WITH m, COUNT(u) as weight
SET m.weight = weight
RETURN m.name as method, weight
Out[5]:
In [6]:
%%cypher
MATCH (m)-[u:USES]->(m2:Method)
WITH m2, COUNT(u) as weight
SET m2.weight = weight
RETURN m2.name as callee, weight
Out[6]:
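Now we copy the weight of each used item onto the USES relationship that points to it, so that the community detection in the next step can read the weight as a relationship property.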
In [7]:
%%cypher
MATCH (caller)-[r:USES]->(callee)
SET r.weight = callee.weight
RETURN count(r)
Out[7]:
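The three weight queries can be traced in a few lines of Python. The edge list below is a hand-derived USES graph for Reservation (an assumption, not database output). Note that the second count overwrites the first for every called method, and that the queries above never assign a weight to fields.

```python
from collections import Counter

# (user, used) pairs: hand-derived USES edges for Reservation (illustrative).
uses = [
    ("extend", "duration"),
    ("extendForWeek", "date"), ("extendForWeek", "customer"),
    ("extendForWeek", "dailyRate"), ("extendForWeek", "extend"),
    ("addFee", "fees"), ("getAdditionalFees", "fees"),
    ("getPrincipalFee", "dailyRate"), ("getPrincipalFee", "customer"),
    ("getPrincipalFee", "duration"),
    ("getTotalFee", "getPrincipalFee"), ("getTotalFee", "getAdditionalFees"),
]
methods = {user for user, _ in uses}

# Step 1: weight = number of outgoing USES relationships per method.
weight = dict(Counter(user for user, _ in uses))

# Step 2: for called methods, weight = number of incoming USES relationships.
# dict.update replaces the value from step 1 instead of adding to it.
weight.update(Counter(used for _, used in uses if used in methods))

# Step 3: copy the weight of the used item onto the relationship.
# Fields have no weight, so None stands in for a missing property here.
rel_weight = {(user, used): weight.get(used) for user, used in uses}
print(weight)
```

In this sketch getPrincipalFee first gets a weight of 3 for its three field usages and then ends up with 1, because it is called once.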
In [8]:
%%cypher
CALL apoc.algo.community(25,null,'group','USES','OUTGOING','weight',10000)
Out[8]:
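apoc.algo.community is described in the APOC documentation as a simple label propagation kernel. As far as that documentation goes, the arguments are the number of iterations, the node labels to consider, the property that receives the group id, the relationship type, the direction, the weight property and a batch size. The idea behind label propagation is that every node starts in its own group and repeatedly adopts the group that dominates among its neighbours. The following is a generic, simplified sketch of that idea and not APOC's implementation: it ignores relationship direction, breaks ties by picking the smallest label, and runs on a hypothetical toy graph.

```python
from collections import defaultdict


def propagate_labels(edges, times=25):
    """Weighted label propagation on an undirected view of `edges`.

    edges: iterable of (node_a, node_b, weight) triples.
    Returns a dict mapping each node to its final group label.
    """
    neighbours = defaultdict(list)
    for a, b, w in edges:
        neighbours[a].append((b, w))
        neighbours[b].append((a, w))

    # Every node starts in its own group.
    group = {node: node for node in neighbours}

    for _ in range(times):
        changed = False
        for node in sorted(neighbours):
            # Sum the relationship weights per neighbouring group.
            score = defaultdict(float)
            for other, w in neighbours[node]:
                score[group[other]] += w
            # Adopt the strongest group; the smallest label wins a tie.
            best = min(score, key=lambda g: (-score[g], g))
            if best != group[node]:
                group[node] = best
                changed = True
        if not changed:
            break  # stable assignment reached before `times` rounds
    return group


# Hypothetical toy graph: one triangle and one separate pair.
toy = [("a", "b", 1), ("b", "c", 1), ("a", "c", 1), ("x", "y", 1)]
groups = propagate_labels(toy)
print(groups)
```

On the toy graph the triangle and the pair end up in two different groups, which is the kind of separation we hope to see between the responsibilities of a class.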
In [9]:
%%cypher
MATCH (m:Method)-[:USES]->(f:Field)<-[:USES]-(m2:Method)
WHERE m.group <> m2.group
WITH m.group as newGroupId, m2.group as oldGroupId
MATCH (n:Method) WHERE n.group = oldGroupId
SET n.group = [newGroupId, oldGroupId]
SET n.merged = true
RETURN DISTINCT(n.name), n.group;
Out[9]:
In [10]:
%%cypher
MATCH (m:Method)-[:USES]->(:Field)
WHERE NOT EXISTS(m.merged)
WITH m, m.group as groupId
SET m.merged = false
RETURN m.name, m.group;
Out[10]:
In [11]:
%%cypher
MATCH (m:Method)-[:USES]->(f)
MERGE (c:NewClass { name: m.group})
MERGE (c)-[:SHOULD_DECLARE]->(m)
MERGE (c)-[:SHOULD_DECLARE]->(f)
RETURN c.name as newClass, m.name as method, f.name as field
Out[11]:
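The last query creates one NewClass node per group and attaches every method together with whatever it uses. A minimal Python sketch of that bookkeeping is shown here. The group ids are made up for illustration, since the real ids come from the community detection and the merge step, and the USES pairs are a hand-picked subset.

```python
from collections import defaultdict

# Hypothetical group ids per method (assumption: real ids will differ).
group = {"extend": 1, "extendForWeek": 1, "getPrincipalFee": 1,
         "addFee": 2, "getAdditionalFees": 2}

# A few hand-derived USES pairs (method, used item).
uses = [("extend", "duration"), ("extendForWeek", "extend"),
        ("getPrincipalFee", "dailyRate"),
        ("addFee", "fees"), ("getAdditionalFees", "fees")]

# One proposed class per group; it should declare the method and what it uses.
new_classes = defaultdict(set)
for method, used in uses:
    new_classes[group[method]].update({method, used})

for name, members in sorted(new_classes.items()):
    print(name, sorted(members))
```

As in the query, the used item is assigned to the class of the using method, so an item that is used from two groups would be proposed for both classes.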