Rule-based Classification

  • Rules are mutually exclusive
  • Rules are exhaustive, meaning that a data point must trigger one of the rules to reach the leaf.

Simpify the rules

  • Take out some of the attributes from a rule, and check if the rule is still accurate or not
  • After simplification of rules, rules may not be exhaustive anymore, then we need to create a default class.

Sequential Covering Algorithm

  • Initial rule set is empty
  • Repeat
    • Grow a single rule
    • Elminate the instance that are triggered by that rule
    • Prune the rule if necessary
  • Add the new rule to the set of rules

Rule growing

  • the least frequent class

Rule-based classification softwares

  • JRIPPER (available in Weka)

Example

Lets say we have five classes: mammalas, reptiles, birds, fish, and amphibians, with the following class frequencies

class freq
mammals 50
reptiles 100
birds 20
fish 30
amphibians 10

So, we should start with amphibians since ther are the least frequent class, (the hardest class to detetct). We keep the default class to be the most frequent class, i.e. reptiles.

amphibians --> birds --> fish --> mammals (default: reptiles)

At each step, we deal with a binary classification, positive means ??? and negative means evrything else.

Rule growing
  • Start with {} -> +
  • Iteratively choose the best conjunct to be added to LHS

Foil Information Gain

  • Original rule: $R_0: A \rightarrow + $
  • Extended rule: $R_1: AB \rightarrow + $

    $Gain(R_0, R_1) = p_1 \left[\log \frac{p_1}{p_1+n_1} - \log\frac{p_0}{p_0+n_0}\right]$

    $p_0$: number of positive instances covered by rule $R_0$
    $p_1$: number of positive instances covered by rule $R_1$
    $n_0$: number of negative instances covered by rule $R_0$
    $n_1$: number of negative instances covered by rule $R_1$

Rationale Gain is higher when the extended rule has high coverage and accuracy.

  • Why should we remove the instances covered by $R_1$?

  • Should we also remove the neative classes that are missclasisfied by $R_1$?

    Yes

Example

$R_1$: $p_1=12, n_1=3 $
$R_2$: without removing any instance: $p_2 = 8, n_2 = 4 $
$R_2'$: removing only the positive instances $p2' = 6 , n_2' = 4 $
$R_2''$: remove both positive and negative instances $p_2'' = 6 n_2''=2 $