Demo on Association Rules with R and Jupyter

First, install the arules and arulesViz packages. Installation may take a while because arulesViz pulls in a number of dependencies.


In [10]:
install.packages("arules", repos="http://lib.stat.cmu.edu/R/CRAN/")
install.packages("arulesViz", repos="http://lib.stat.cmu.edu/R/CRAN/")


The downloaded source packages are in
	‘/private/var/folders/zm/79bb4c_j6n9_kg23gyhb89hnyhx5dc/T/RtmpxwLbIi/downloaded_packages’
also installing the dependencies ‘gdata’, ‘pkgmaker’, ‘rngtools’, ‘gridBase’, ‘doParallel’, ‘lmtest’, ‘TSP’, ‘gclus’, ‘gplots’, ‘registry’, ‘NMF’, ‘irlba’, ‘scatterplot3d’, ‘vcd’, ‘seriation’, ‘igraph’

The downloaded source packages are in
	‘/private/var/folders/zm/79bb4c_j6n9_kg23gyhb89hnyhx5dc/T/RtmpxwLbIi/downloaded_packages’

In [15]:
library('arules')
library('arulesViz')

In [12]:
data(Groceries)
Groceries


Out[12]:
transactions in sparse format with
 9835 transactions (rows) and
 169 items (columns)

The Groceries dataset contains 9835 transactions over 169 grocery items. The summary below reports its density, the most frequent items, and the distribution of transaction sizes.


In [13]:
summary(Groceries)


Out[13]:
transactions as itemMatrix in sparse format with
 9835 rows (elements/itemsets/transactions) and
 169 columns (items) and a density of 0.02609146 

most frequent items:
      whole milk other vegetables       rolls/buns             soda 
            2513             1903             1809             1715 
          yogurt          (Other) 
            1372            34055 

element (itemset/transaction) length distribution:
sizes
   1    2    3    4    5    6    7    8    9   10   11   12   13   14   15   16 
2159 1643 1299 1005  855  645  545  438  350  246  182  117   78   77   55   46 
  17   18   19   20   21   22   23   24   26   27   28   29   32 
  29   14   14    9   11    4    6    1    1    1    1    3    1 

   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
  1.000   2.000   3.000   4.409   6.000  32.000 

includes extended item information - examples:
       labels  level2           level1
1 frankfurter sausage meet and sausage
2     sausage sausage meet and sausage
3  liver loaf sausage meet and sausage
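
As a quick sanity check on this summary, the density is the fraction of non-empty cells in the 9835 x 169 transaction-by-item matrix, so multiplying it back out should recover the total number of purchased items and the mean transaction size. The sketch below uses only base R and figures copied from the output above; the variable names are illustrative, not part of arules.

```r
# Figures copied from the summary(Groceries) output above
n_transactions <- 9835
n_items        <- 169
density        <- 0.02609146

# Density = (number of non-empty cells) / (rows * columns),
# so the number of non-empty cells is density * rows * columns
total_purchases <- density * n_transactions * n_items

# Cross-check against the "most frequent items" counts, including (Other)
item_counts <- c(2513, 1903, 1809, 1715, 1372, 34055)

# Mean basket size = total purchased items / number of transactions
mean_basket_size <- sum(item_counts) / n_transactions

print(round(total_purchases))      # total item occurrences from the density
print(sum(item_counts))            # total item occurrences from the counts
print(round(mean_basket_size, 3))  # compare with the Mean in the summary
```

Both totals come to 43367, and the mean basket size rounds to 4.409, which agrees with the Mean reported for the transaction length distribution.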

In [14]:
class(Groceries)


Out[14]:
'transactions'

In [16]:
# display the first 20 grocery labels
Groceries@itemInfo[1:20,]


Out[16]:
   labels             level2      level1
1  frankfurter        sausage     meet and sausage
2  sausage            sausage     meet and sausage
3  liver loaf         sausage     meet and sausage
4  ham                sausage     meet and sausage
5  meat               sausage     meet and sausage
6  finished products  sausage     meet and sausage
7  organic sausage    sausage     meet and sausage
8  chicken            poultry     meet and sausage
9  turkey             poultry     meet and sausage
10 pork               pork        meet and sausage
11 beef               beef        meet and sausage
12 hamburger meat     beef        meet and sausage
13 fish               fish        meet and sausage
14 citrus fruit       fruit       fruit and vegetables
15 tropical fruit     fruit       fruit and vegetables
16 pip fruit          fruit       fruit and vegetables
17 grapes             fruit       fruit and vegetables
18 berries            fruit       fruit and vegetables
19 nuts/prunes        fruit       fruit and vegetables
20 root vegetables    vegetables  fruit and vegetables

In [17]:
# display the 10th to 20th transactions
# (in Groceries@data, rows are items and columns are transactions)
apply(Groceries@data[,10:20], 2, 
      function(r) paste(Groceries@itemInfo[r,"labels"], collapse=", ")
)


Out[17]:
  1. 'whole milk, cereals'
  2. 'tropical fruit, other vegetables, white bread, bottled water, chocolate'
  3. 'citrus fruit, tropical fruit, whole milk, butter, curd, yogurt, flour, bottled water, dishes'
  4. 'beef'
  5. 'frankfurter, rolls/buns, soda'
  6. 'chicken, tropical fruit'
  7. 'butter, sugar, fruit/vegetable juice, newspapers'
  8. 'fruit/vegetable juice'
  9. 'packaged fruit/vegetables'
  10. 'chocolate'
  11. 'specialty bar'
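
The call above works because the underlying matrix stores one transaction per column, with a TRUE entry for every item in the basket. A minimal base R sketch of the same idea follows, using a small dense logical matrix in place of the sparse one; the item labels and the three toy transactions are made up for illustration.

```r
# Hypothetical item labels (a tiny stand-in for Groceries@itemInfo$labels)
item_labels <- c("whole milk", "cereals", "beef", "soda")

# Items in rows, transactions in columns; matrix() fills column by column,
# so each group of four values below is one transaction
incidence <- matrix(
  c(TRUE,  TRUE,  FALSE, FALSE,   # t1: whole milk, cereals
    FALSE, FALSE, TRUE,  FALSE,   # t2: beef
    TRUE,  FALSE, FALSE, TRUE),   # t3: whole milk, soda
  nrow = length(item_labels),
  dimnames = list(item_labels, paste0("t", 1:3))
)

# For each column (MARGIN = 2), keep the labels whose entry is TRUE
# and join them into one comma-separated string
baskets <- apply(incidence, 2, function(col) {
  paste(item_labels[col], collapse = ", ")
})

print(baskets)
```

Indexing the label vector with a logical column is what turns the incidence matrix back into readable item names.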

Next, mine association rules from the Groceries dataset with apriori, requiring a minimum support of 0.001 and a minimum confidence of 0.6.


In [18]:
rules <- apriori(Groceries, parameter=list(support=0.001,
                                           confidence=0.6, target = "rules"))


Parameter specification:
 confidence minval smax arem  aval originalSupport support minlen maxlen target
        0.6    0.1    1 none FALSE            TRUE   0.001      1     10  rules
   ext
 FALSE

Algorithmic control:
 filter tree heap memopt load sort verbose
    0.1 TRUE TRUE  FALSE TRUE    2    TRUE

apriori - find association rules with the apriori algorithm
version 4.21 (2004.05.09)        (c) 1996-2004   Christian Borgelt
set item appearances ...[0 item(s)] done [0.00s].
set transactions ...[169 item(s), 9835 transaction(s)] done [0.00s].
sorting and recoding items ... [157 item(s)] done [0.00s].
creating transaction tree ... done [0.01s].
checking subsets of size 1 2 3 4 5 6 done [0.02s].
writing ... [2918 rule(s)] done [0.00s].
creating S4 object  ... done [0.01s].

In [19]:
summary(rules)


Out[19]:
set of 2918 rules

rule length distribution (lhs + rhs):sizes
   2    3    4    5    6 
   3  490 1765  626   34 

   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
  2.000   4.000   4.000   4.068   4.000   6.000 

summary of quality measures:
    support           confidence          lift       
 Min.   :0.001017   Min.   :0.6000   Min.   : 2.348  
 1st Qu.:0.001118   1st Qu.:0.6316   1st Qu.: 2.668  
 Median :0.001220   Median :0.6818   Median : 3.168  
 Mean   :0.001480   Mean   :0.7028   Mean   : 3.450  
 3rd Qu.:0.001525   3rd Qu.:0.7500   3rd Qu.: 3.692  
 Max.   :0.009354   Max.   :1.0000   Max.   :18.996  

mining info:
      data ntransactions support confidence
 Groceries          9835   0.001        0.6
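
To make the three quality measures in this summary concrete, here is a minimal base R sketch that computes support, confidence, and lift by hand for a single rule. The five toy baskets and the helper function `support()` are hypothetical and written only for this illustration; they are not taken from Groceries or from the arules API.

```r
# Hypothetical toy baskets, not taken from Groceries
baskets <- list(
  c("milk", "bread"),
  c("milk", "butter"),
  c("milk", "bread", "butter"),
  c("bread"),
  c("milk", "bread", "butter")
)

# Support: fraction of baskets that contain every item in `items`
support <- function(items) {
  mean(vapply(baskets, function(b) all(items %in% b), logical(1)))
}

# Rule under study: {bread, butter} => {milk}
lhs <- c("bread", "butter")
rhs <- "milk"

rule_support    <- support(c(lhs, rhs))            # share of baskets with lhs and rhs
rule_confidence <- rule_support / support(lhs)     # share of lhs baskets that also have rhs
rule_lift       <- rule_confidence / support(rhs)  # confidence relative to rhs frequency

print(c(support = rule_support, confidence = rule_confidence, lift = rule_lift))

# A relative support threshold corresponds to an absolute transaction count
min_count <- ceiling(0.001 * 9835)
print(min_count)
```

In this toy data the rule has support 0.4, confidence 1, and lift 1.25; a lift above 1 means the right-hand side appears more often with the left-hand side than its overall frequency would suggest. With 9835 transactions, a support threshold of 0.001 corresponds to at least 10 transactions, which is consistent with the minimum support of 0.001017 (about 10/9835) in the summary.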

In [20]:
plot(rules)



In [21]:
plot(rules@quality)



In [22]:
confidentRules <- rules[quality(rules)$confidence > 0.9]
confidentRules


Out[22]:
set of 127 rules 

In [23]:
plot(confidentRules, method="matrix", measure=c("lift", "confidence"),
     control=list(reorder=TRUE))


Itemsets in Antecedent (LHS)
  [1] "{citrus fruit,other vegetables,soda,fruit/vegetable juice}"          
  [2] "{tropical fruit,other vegetables,whole milk,yogurt,oil}"             
  [3] "{tropical fruit,whipped/sour cream,fruit/vegetable juice}"           
  [4] "{tropical fruit,whole milk,whipped/sour cream,fruit/vegetable juice}"
  [5] "{whole milk,butter,whipped/sour cream,soda}"                         
  [6] "{root vegetables,onions,napkins}"                                    
  [7] "{hamburger meat,tropical fruit,whipped/sour cream}"                  
  [8] "{root vegetables,whole milk,butter,white bread}"                     
  [9] "{tropical fruit,butter,yogurt,white bread}"                          
 [10] "{yogurt,oil,coffee}"                                                 
 [11] "{citrus fruit,root vegetables,whole milk,yogurt,whipped/sour cream}" 
 [12] "{pork,whole milk,butter milk}"                                       
 [13] "{tropical fruit,whipped/sour cream,hard cheese}"                     
 [14] "{herbs,whole milk,fruit/vegetable juice}"                            
 [15] "{tropical fruit,root vegetables,whole milk,yogurt,oil}"              
 [16] "{pip fruit,butter milk,fruit/vegetable juice}"                       
 [17] "{citrus fruit,whole milk,whipped/sour cream,cream cheese }"          
 [18] "{grapes,onions}"                                                     
 [19] "{hard cheese,oil}"                                                   
 [20] "{tropical fruit,yogurt,whipped/sour cream,fruit/vegetable juice}"    
 [21] "{tropical fruit,dessert,whipped/sour cream}"                         
 [22] "{tropical fruit,whipped/sour cream,soft cheese}"                     
 [23] "{citrus fruit,whole milk,whipped/sour cream,domestic eggs}"          
 [24] "{citrus fruit,root vegetables,cream cheese }"                        
 [25] "{pip fruit,butter,pastry}"                                           
 [26] "{butter,whipped/sour cream,soda}"                                    
 [27] "{root vegetables,whole milk,yogurt,rice}"                            
 [28] "{citrus fruit,tropical fruit,root vegetables,whole milk,yogurt}"     
 [29] "{root vegetables,whole milk,yogurt,oil}"                             
 [30] "{ham,tropical fruit,pip fruit,whole milk}"                           
 [31] "{whole milk,rolls/buns,soda,newspapers}"                             
 [32] "{tropical fruit,butter,whipped/sour cream,fruit/vegetable juice}"    
 [33] "{tropical fruit,grapes,whole milk,yogurt}"                           
 [34] "{citrus fruit,tropical fruit,root vegetables,whipped/sour cream}"    
 [35] "{ham,tropical fruit,pip fruit,yogurt}"                               
 [36] "{citrus fruit,root vegetables,soft cheese}"                          
 [37] "{pip fruit,whipped/sour cream,brown bread}"                          
 [38] "{frankfurter,tropical fruit,frozen meals}"                           
 [39] "{tropical fruit,root vegetables,yogurt,oil}"                         
 [40] "{cream cheese ,domestic eggs,napkins}"                               
 [41] "{root vegetables,butter,rice}"                                       
 [42] "{tropical fruit,root vegetables,other vegetables,yogurt,oil}"        
 [43] "{other vegetables,butter,whipped/sour cream,domestic eggs}"          
 [44] "{root vegetables,other vegetables,yogurt,oil}"                       
 [45] "{pip fruit,root vegetables,hygiene articles}"                        
 [46] "{pip fruit,butter,hygiene articles}"                                 
 [47] "{rice,sugar}"                                                        
 [48] "{root vegetables,whipped/sour cream,flour}"                          
 [49] "{citrus fruit,whipped/sour cream,rolls/buns,pastry}"                 
 [50] "{sausage,tropical fruit,root vegetables,rolls/buns}"                 
 [51] "{butter,soft cheese,domestic eggs}"                                  
 [52] "{pip fruit,root vegetables,other vegetables,bottled water}"          
 [53] "{canned fish,hygiene articles}"                                      
 [54] "{root vegetables,other vegetables,butter,white bread}"               
 [55] "{curd,domestic eggs,sugar}"                                          
 [56] "{cream cheese ,domestic eggs,sugar}"                                 
 [57] "{pork,other vegetables,butter,whipped/sour cream}"                   
 [58] "{root vegetables,whipped/sour cream,hygiene articles}"               
 [59] "{other vegetables,cream cheese ,sugar}"                              
 [60] "{sausage,tropical fruit,root vegetables,yogurt}"                     
 [61] "{yogurt,domestic eggs,sugar}"                                        
 [62] "{citrus fruit,domestic eggs,sugar}"                                  
 [63] "{beef,tropical fruit,yogurt,rolls/buns}"                             
 [64] "{pip fruit,whipped/sour cream,cream cheese }"                        
 [65] "{root vegetables,other vegetables,yogurt,rice}"                      
 [66] "{whipped/sour cream,house keeping products}"                         
 [67] "{pip fruit,root vegetables,other vegetables,brown bread}"            
 [68] "{root vegetables,whipped/sour cream,sugar}"                          
 [69] "{tropical fruit,butter,yogurt,domestic eggs}"                        
 [70] "{rice,bottled water}"                                                
 [71] "{butter,whipped/sour cream,coffee}"                                  
 [72] "{tropical fruit,domestic eggs,hygiene articles}"                     
 [73] "{tropical fruit,long life bakery product,napkins}"                   
 [74] "{tropical fruit,butter,hygiene articles}"                            
 [75] "{pip fruit,other vegetables,whipped/sour cream,domestic eggs}"       
 [76] "{butter,whipped/sour cream,sliced cheese}"                           
 [77] "{root vegetables,whipped/sour cream,soft cheese}"                    
 [78] "{frankfurter,tropical fruit,root vegetables,yogurt}"                 
 [79] "{root vegetables,other vegetables,yogurt,hard cheese}"               
 [80] "{tropical fruit,curd,yogurt,domestic eggs}"                          
 [81] "{domestic eggs,margarine,fruit/vegetable juice}"                     
 [82] "{citrus fruit,butter,curd}"                                          
 [83] "{butter,curd,domestic eggs}"                                         
 [84] "{root vegetables,butter,white bread}"                                
 [85] "{soups,bottled beer}"                                                
 [86] "{tropical fruit,domestic eggs,sugar}"                                
 [87] "{tropical fruit,other vegetables,whipped/sour cream,domestic eggs}"  
 [88] "{citrus fruit,tropical fruit,herbs}"                                 
 [89] "{tropical fruit,yogurt,whipped/sour cream,domestic eggs}"            
 [90] "{pip fruit,other vegetables,yogurt,cream cheese }"                   
 [91] "{tropical fruit,root vegetables,rolls/buns,bottled water}"           
 [92] "{pip fruit,root vegetables,yogurt,fruit/vegetable juice}"            
 [93] "{root vegetables,butter,yogurt,domestic eggs}"                       
 [94] "{tropical fruit,root vegetables,yogurt,pastry}"                      
 [95] "{tropical fruit,butter,yogurt,sliced cheese}"                        
 [96] "{tropical fruit,root vegetables,herbs,other vegetables}"             
 [97] "{butter,hygiene articles,napkins}"                                   
 [98] "{sausage,pip fruit,cream cheese }"                                   
 [99] "{sausage,butter,long life bakery product}"                           
[100] "{root vegetables,other vegetables,yogurt,waffles}"                   
[101] "{citrus fruit,other vegetables,yogurt,frozen vegetables}"            
[102] "{curd,cereals}"                                                      
[103] "{root vegetables,other vegetables,rolls/buns,brown bread}"           
[104] "{frankfurter,root vegetables,sliced cheese}"                         
[105] "{pork,rolls/buns,waffles}"                                           
[106] "{tropical fruit,butter,frozen meals}"                                
[107] "{citrus fruit,other vegetables,butter,bottled water}"                
[108] "{pork,root vegetables,other vegetables,butter}"                      
[109] "{sausage,berries,butter}"                                            
[110] "{tropical fruit,other vegetables,butter,yogurt,domestic eggs}"       
[111] "{tropical fruit,pip fruit,yogurt,frozen meals}"                      
[112] "{pip fruit,root vegetables,other vegetables,cream cheese }"          
[113] "{other vegetables,butter,whipped/sour cream,napkins}"                
[114] "{pastry,sweet spreads}"                                              
[115] "{root vegetables,domestic eggs,coffee}"                              
[116] "{citrus fruit,tropical fruit,other vegetables,domestic eggs}"        
[117] "{citrus fruit,tropical fruit,curd,yogurt}"                           
[118] "{domestic eggs,margarine,bottled beer}"                              
[119] "{whipped/sour cream,long life bakery product,napkins}"               
[120] "{root vegetables,butter,cream cheese }"                              
[121] "{tropical fruit,other vegetables,butter,white bread}"                
[122] "{tropical fruit,whole milk,butter,sliced cheese}"                    
[123] "{other vegetables,curd,whipped/sour cream,cream cheese }"            
[124] "{liquor,red/blush wine}"                                             
Itemsets in Consequent (RHS)
[1] "{other vegetables}" "{root vegetables}"  "{bottled beer}"    
[4] "{yogurt}"           "{whole milk}"      

In [24]:
# select the 5 rules with the highest lift
highLiftRules <- head(sort(rules, by="lift"), 5)

In [26]:
highLiftRules


Out[26]:
set of 5 rules 

In [27]:
plot(highLiftRules, method="graph", control=list(type="items"))


