Rule-based Classification

Rules are mutually exclusive
Rules are exhaustive, meaning that a data point must trigger one of the rules to reach the leaf.

Simpify the rules

Take out some of the attributes from a rule, and check if the rule is still accurate or not

After simplification of rules, rules may not be exhaustive anymore, then we need to create a default class.

Sequential Covering Algorithm

Initial rule set is empty
Repeat
- Grow a single rule
- Elminate the instance that are triggered by that rule
- Prune the rule if necessary
Add the new rule to the set of rules

Rule growing

the least frequent class

Rule-based classification softwares

JRIPPER (available in Weka)

Example

Lets say we have five classes: mammalas, reptiles, birds, fish, and amphibians, with the following class frequencies

class	freq
mammals	50
reptiles	100
birds	20
fish	30
amphibians	10

So, we should start with amphibians since ther are the least frequent class, (the hardest class to detetct). We keep the default class to be the most frequent class, i.e. reptiles.

amphibians --> birds --> fish --> mammals (default: reptiles)

At each step, we deal with a binary classification, positive means ??? and negative means evrything else.

Rule growing

Start with {} -> +
Iteratively choose the best conjunct to be added to LHS

Foil Information Gain

Original rule: $R_0: A \rightarrow + $
Extended rule: $R_1: AB \rightarrow + $

$Gain(R_0, R_1) = p_1 \left[\log \frac{p_1}{p_1+n_1} - \log\frac{p_0}{p_0+n_0}\right]$

$p_0$: number of positive instances covered by rule $R_0$
$p_1$: number of positive instances covered by rule $R_1$
$n_0$: number of negative instances covered by rule $R_0$
$n_1$: number of negative instances covered by rule $R_1$

Rationale Gain is higher when the extended rule has high coverage and accuracy.

Why should we remove the instances covered by $R_1$?
Should we also remove the neative classes that are missclasisfied by $R_1$?

Yes

Example

$R_1$: $p_1=12, n_1=3 $
$R_2$: without removing any instance: $p_2 = 8, n_2 = 4 $
$R_2'$: removing only the positive instances $p2' = 6 , n_2' = 4 $
$R_2''$: remove both positive and negative instances $p_2'' = 6 n_2''=2 $