First, install the arules and arulesViz packages. Installation may take a while because arulesViz pulls in a number of dependencies.
In [10]:
# spell out the full argument name "repos" rather than relying on partial matching of "rep"
install.packages("arules", repos="http://lib.stat.cmu.edu/R/CRAN/")
install.packages("arulesViz", repos="http://lib.stat.cmu.edu/R/CRAN/")
The downloaded source packages are in
‘/private/var/folders/zm/79bb4c_j6n9_kg23gyhb89hnyhx5dc/T/RtmpxwLbIi/downloaded_packages’
also installing the dependencies ‘gdata’, ‘pkgmaker’, ‘rngtools’, ‘gridBase’, ‘doParallel’, ‘lmtest’, ‘TSP’, ‘gclus’, ‘gplots’, ‘registry’, ‘NMF’, ‘irlba’, ‘scatterplot3d’, ‘vcd’, ‘seriation’, ‘igraph’
The downloaded source packages are in
‘/private/var/folders/zm/79bb4c_j6n9_kg23gyhb89hnyhx5dc/T/RtmpxwLbIi/downloaded_packages’
In [15]:
library('arules')
library('arulesViz')
In [12]:
data(Groceries)
Groceries
Out[12]:
transactions in sparse format with
9835 transactions (rows) and
169 items (columns)
The Groceries dataset contains 9835 transactions over 169 grocery items. The summary below reports the most frequent items and the distribution of transaction sizes.
In [13]:
summary(Groceries)
Out[13]:
transactions as itemMatrix in sparse format with
9835 rows (elements/itemsets/transactions) and
169 columns (items) and a density of 0.02609146
most frequent items:
      whole milk other vegetables       rolls/buns             soda
            2513             1903             1809             1715
          yogurt          (Other)
            1372            34055
element (itemset/transaction) length distribution:
sizes
   1    2    3    4    5    6    7    8    9   10   11   12   13   14   15   16
2159 1643 1299 1005  855  645  545  438  350  246  182  117   78   77   55   46
  17   18   19   20   21   22   23   24   26   27   28   29   32
  29   14   14    9   11    4    6    1    1    1    1    3    1
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
  1.000   2.000   3.000   4.409   6.000  32.000
includes extended item information - examples:
       labels  level2           level1
1 frankfurter sausage meet and sausage
2     sausage sausage meet and sausage
3  liver loaf sausage meet and sausage
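As a sanity check on the summary above, the density and the mean transaction size can be reproduced from the printed item counts. The following is a small base-R sketch of that arithmetic; the variable names are illustrative and are not part of arules.

```r
# Item counts copied from the "most frequent items" output (including "(Other)")
item_counts <- c(2513, 1903, 1809, 1715, 1372, 34055)
n_transactions <- 9835
n_items <- 169

# Total number of item occurrences across all transactions
total_items <- sum(item_counts)

# Density: fraction of non-empty cells in the transaction-by-item matrix
density <- total_items / (n_transactions * n_items)

# Mean transaction size: average number of items per transaction
mean_size <- total_items / n_transactions

print(round(density, 8))
print(round(mean_size, 3))
```

Both values should agree with the density and the Mean reported by summary(Groceries).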
In [14]:
class(Groceries)
Out[14]:
'transactions'
In [16]:
# display the first 20 grocery labels
# (itemInfo() is the accessor for the itemInfo slot)
itemInfo(Groceries)[1:20,]
Out[16]:
              labels     level2               level1
1        frankfurter    sausage     meet and sausage
2            sausage    sausage     meet and sausage
3         liver loaf    sausage     meet and sausage
4                ham    sausage     meet and sausage
5               meat    sausage     meet and sausage
6  finished products    sausage     meet and sausage
7    organic sausage    sausage     meet and sausage
8            chicken    poultry     meet and sausage
9             turkey    poultry     meet and sausage
10              pork       pork     meet and sausage
11              beef       beef     meet and sausage
12    hamburger meat       beef     meet and sausage
13              fish       fish     meet and sausage
14      citrus fruit      fruit fruit and vegetables
15    tropical fruit      fruit fruit and vegetables
16         pip fruit      fruit fruit and vegetables
17            grapes      fruit fruit and vegetables
18           berries      fruit fruit and vegetables
19       nuts/prunes      fruit fruit and vegetables
20   root vegetables vegetables fruit and vegetables
In [17]:
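# display the 10th to 20th transactions
# Groceries@data is a sparse item-by-transaction matrix, so columns index
# transactions and each column marks the items that transaction contains
apply(Groceries@data[,10:20], 2,
      function(r) paste(Groceries@itemInfo[r,"labels"], collapse=", ")
)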
Out[17]:
- 'whole milk, cereals'
- 'tropical fruit, other vegetables, white bread, bottled water, chocolate'
- 'citrus fruit, tropical fruit, whole milk, butter, curd, yogurt, flour, bottled water, dishes'
- 'beef'
- 'frankfurter, rolls/buns, soda'
- 'chicken, tropical fruit'
- 'butter, sugar, fruit/vegetable juice, newspapers'
- 'fruit/vegetable juice'
- 'packaged fruit/vegetables'
- 'chocolate'
- 'specialty bar'
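The apply() call above relies on the item-by-transaction orientation of the underlying matrix. Here is a minimal base-R sketch of the same idea on a made-up dense logical matrix; the labels and transaction names are hypothetical and chosen only for illustration.

```r
# Hypothetical item labels (not taken from Groceries)
labels <- c("milk", "bread", "butter")

# Rows are items, columns are transactions; TRUE means "item is in the basket".
# matrix() fills column by column, so each group of three values is one transaction.
m <- matrix(c(TRUE, FALSE, TRUE,
              FALSE, TRUE, FALSE),
            nrow = 3,
            dimnames = list(labels, c("t1", "t2")))

# For each column (transaction), keep the labels whose row is TRUE
baskets <- apply(m, 2, function(col) paste(labels[col], collapse = ", "))
print(baskets)
```

Logical indexing with each column plays the same role as `Groceries@itemInfo[r, "labels"]` in the cell above.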
Next, let's mine association rules from the Groceries dataset, keeping only rules with a support of at least 0.001 and a confidence of at least 0.6.
In [18]:
rules <- apriori(Groceries, parameter=list(support=0.001,
confidence=0.6, target = "rules"))
Parameter specification:
 confidence minval smax arem  aval originalSupport support minlen maxlen target
        0.6    0.1    1 none FALSE            TRUE   0.001      1     10  rules
   ext
 FALSE
Algorithmic control:
 filter tree heap memopt load sort verbose
    0.1 TRUE TRUE  FALSE TRUE    2    TRUE
apriori - find association rules with the apriori algorithm
version 4.21 (2004.05.09) (c) 1996-2004 Christian Borgelt
set item appearances ...[0 item(s)] done [0.00s].
set transactions ...[169 item(s), 9835 transaction(s)] done [0.00s].
sorting and recoding items ... [157 item(s)] done [0.00s].
creating transaction tree ... done [0.01s].
checking subsets of size 1 2 3 4 5 6 done [0.02s].
writing ... [2918 rule(s)] done [0.00s].
creating S4 object ... done [0.01s].
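To make the support and confidence thresholds passed to apriori() more concrete, here is a base-R sketch that computes support, confidence, and lift by hand for a single rule. The five toy transactions and the helper `support()` are hypothetical and only illustrate the standard definitions; they do not reproduce how arules computes these measures internally.

```r
# Hypothetical toy transactions (not taken from Groceries)
transactions <- list(
  c("milk", "bread"),
  c("milk", "butter"),
  c("milk", "bread", "butter"),
  c("bread"),
  c("milk", "bread")
)

# Support: fraction of transactions that contain every item in `items`
support <- function(items) {
  mean(vapply(transactions,
              function(basket) all(items %in% basket),
              logical(1)))
}

# Measures for the rule {bread} => {milk}
lhs <- "bread"
rhs <- "milk"
supp_rule <- support(c(lhs, rhs))       # share of baskets with both items
conf_rule <- supp_rule / support(lhs)   # share of bread baskets that also have milk
lift_rule <- conf_rule / support(rhs)   # confidence relative to milk's base rate

# With support = 0.001 and 9835 transactions, an itemset has to occur
# in at least this many transactions to be considered
min_count <- ceiling(0.001 * 9835)

print(c(support = supp_rule, confidence = conf_rule, lift = lift_rule))
print(min_count)
```

A lift above 1 indicates that the left-hand side makes the right-hand side more likely than its base rate; in this toy data the lift is below 1.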
In [19]:
summary(rules)
Out[19]:
set of 2918 rules
rule length distribution (lhs + rhs):sizes
   2    3    4    5    6
   3  490 1765  626   34
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
  2.000   4.000   4.000   4.068   4.000   6.000
summary of quality measures:
    support           confidence          lift
 Min.   :0.001017   Min.   :0.6000   Min.   : 2.348
 1st Qu.:0.001118   1st Qu.:0.6316   1st Qu.: 2.668
 Median :0.001220   Median :0.6818   Median : 3.168
 Mean   :0.001480   Mean   :0.7028   Mean   : 3.450
 3rd Qu.:0.001525   3rd Qu.:0.7500   3rd Qu.: 3.692
 Max.   :0.009354   Max.   :1.0000   Max.   :18.996
mining info:
      data ntransactions support confidence
 Groceries          9835   0.001        0.6
In [20]:
plot(rules)
In [21]:
# scatterplot matrix of the quality measures (support, confidence, lift)
plot(quality(rules))
In [22]:
confidentRules <- rules[quality(rules)$confidence > 0.9]
confidentRules
Out[22]:
set of 127 rules
In [23]:
plot(confidentRules, method="matrix", measure=c("lift", "confidence"),
control=list(reorder=TRUE))
Itemsets in Antecedent (LHS)
[1] "{citrus fruit,other vegetables,soda,fruit/vegetable juice}"
[2] "{tropical fruit,other vegetables,whole milk,yogurt,oil}"
[3] "{tropical fruit,whipped/sour cream,fruit/vegetable juice}"
[4] "{tropical fruit,whole milk,whipped/sour cream,fruit/vegetable juice}"
[5] "{whole milk,butter,whipped/sour cream,soda}"
[6] "{root vegetables,onions,napkins}"
[7] "{hamburger meat,tropical fruit,whipped/sour cream}"
[8] "{root vegetables,whole milk,butter,white bread}"
[9] "{tropical fruit,butter,yogurt,white bread}"
[10] "{yogurt,oil,coffee}"
[11] "{citrus fruit,root vegetables,whole milk,yogurt,whipped/sour cream}"
[12] "{pork,whole milk,butter milk}"
[13] "{tropical fruit,whipped/sour cream,hard cheese}"
[14] "{herbs,whole milk,fruit/vegetable juice}"
[15] "{tropical fruit,root vegetables,whole milk,yogurt,oil}"
[16] "{pip fruit,butter milk,fruit/vegetable juice}"
[17] "{citrus fruit,whole milk,whipped/sour cream,cream cheese }"
[18] "{grapes,onions}"
[19] "{hard cheese,oil}"
[20] "{tropical fruit,yogurt,whipped/sour cream,fruit/vegetable juice}"
[21] "{tropical fruit,dessert,whipped/sour cream}"
[22] "{tropical fruit,whipped/sour cream,soft cheese}"
[23] "{citrus fruit,whole milk,whipped/sour cream,domestic eggs}"
[24] "{citrus fruit,root vegetables,cream cheese }"
[25] "{pip fruit,butter,pastry}"
[26] "{butter,whipped/sour cream,soda}"
[27] "{root vegetables,whole milk,yogurt,rice}"
[28] "{citrus fruit,tropical fruit,root vegetables,whole milk,yogurt}"
[29] "{root vegetables,whole milk,yogurt,oil}"
[30] "{ham,tropical fruit,pip fruit,whole milk}"
[31] "{whole milk,rolls/buns,soda,newspapers}"
[32] "{tropical fruit,butter,whipped/sour cream,fruit/vegetable juice}"
[33] "{tropical fruit,grapes,whole milk,yogurt}"
[34] "{citrus fruit,tropical fruit,root vegetables,whipped/sour cream}"
[35] "{ham,tropical fruit,pip fruit,yogurt}"
[36] "{citrus fruit,root vegetables,soft cheese}"
[37] "{pip fruit,whipped/sour cream,brown bread}"
[38] "{frankfurter,tropical fruit,frozen meals}"
[39] "{tropical fruit,root vegetables,yogurt,oil}"
[40] "{cream cheese ,domestic eggs,napkins}"
[41] "{root vegetables,butter,rice}"
[42] "{tropical fruit,root vegetables,other vegetables,yogurt,oil}"
[43] "{other vegetables,butter,whipped/sour cream,domestic eggs}"
[44] "{root vegetables,other vegetables,yogurt,oil}"
[45] "{pip fruit,root vegetables,hygiene articles}"
[46] "{pip fruit,butter,hygiene articles}"
[47] "{rice,sugar}"
[48] "{root vegetables,whipped/sour cream,flour}"
[49] "{citrus fruit,whipped/sour cream,rolls/buns,pastry}"
[50] "{sausage,tropical fruit,root vegetables,rolls/buns}"
[51] "{butter,soft cheese,domestic eggs}"
[52] "{pip fruit,root vegetables,other vegetables,bottled water}"
[53] "{canned fish,hygiene articles}"
[54] "{root vegetables,other vegetables,butter,white bread}"
[55] "{curd,domestic eggs,sugar}"
[56] "{cream cheese ,domestic eggs,sugar}"
[57] "{pork,other vegetables,butter,whipped/sour cream}"
[58] "{root vegetables,whipped/sour cream,hygiene articles}"
[59] "{other vegetables,cream cheese ,sugar}"
[60] "{sausage,tropical fruit,root vegetables,yogurt}"
[61] "{yogurt,domestic eggs,sugar}"
[62] "{citrus fruit,domestic eggs,sugar}"
[63] "{beef,tropical fruit,yogurt,rolls/buns}"
[64] "{pip fruit,whipped/sour cream,cream cheese }"
[65] "{root vegetables,other vegetables,yogurt,rice}"
[66] "{whipped/sour cream,house keeping products}"
[67] "{pip fruit,root vegetables,other vegetables,brown bread}"
[68] "{root vegetables,whipped/sour cream,sugar}"
[69] "{tropical fruit,butter,yogurt,domestic eggs}"
[70] "{rice,bottled water}"
[71] "{butter,whipped/sour cream,coffee}"
[72] "{tropical fruit,domestic eggs,hygiene articles}"
[73] "{tropical fruit,long life bakery product,napkins}"
[74] "{tropical fruit,butter,hygiene articles}"
[75] "{pip fruit,other vegetables,whipped/sour cream,domestic eggs}"
[76] "{butter,whipped/sour cream,sliced cheese}"
[77] "{root vegetables,whipped/sour cream,soft cheese}"
[78] "{frankfurter,tropical fruit,root vegetables,yogurt}"
[79] "{root vegetables,other vegetables,yogurt,hard cheese}"
[80] "{tropical fruit,curd,yogurt,domestic eggs}"
[81] "{domestic eggs,margarine,fruit/vegetable juice}"
[82] "{citrus fruit,butter,curd}"
[83] "{butter,curd,domestic eggs}"
[84] "{root vegetables,butter,white bread}"
[85] "{soups,bottled beer}"
[86] "{tropical fruit,domestic eggs,sugar}"
[87] "{tropical fruit,other vegetables,whipped/sour cream,domestic eggs}"
[88] "{citrus fruit,tropical fruit,herbs}"
[89] "{tropical fruit,yogurt,whipped/sour cream,domestic eggs}"
[90] "{pip fruit,other vegetables,yogurt,cream cheese }"
[91] "{tropical fruit,root vegetables,rolls/buns,bottled water}"
[92] "{pip fruit,root vegetables,yogurt,fruit/vegetable juice}"
[93] "{root vegetables,butter,yogurt,domestic eggs}"
[94] "{tropical fruit,root vegetables,yogurt,pastry}"
[95] "{tropical fruit,butter,yogurt,sliced cheese}"
[96] "{tropical fruit,root vegetables,herbs,other vegetables}"
[97] "{butter,hygiene articles,napkins}"
[98] "{sausage,pip fruit,cream cheese }"
[99] "{sausage,butter,long life bakery product}"
[100] "{root vegetables,other vegetables,yogurt,waffles}"
[101] "{citrus fruit,other vegetables,yogurt,frozen vegetables}"
[102] "{curd,cereals}"
[103] "{root vegetables,other vegetables,rolls/buns,brown bread}"
[104] "{frankfurter,root vegetables,sliced cheese}"
[105] "{pork,rolls/buns,waffles}"
[106] "{tropical fruit,butter,frozen meals}"
[107] "{citrus fruit,other vegetables,butter,bottled water}"
[108] "{pork,root vegetables,other vegetables,butter}"
[109] "{sausage,berries,butter}"
[110] "{tropical fruit,other vegetables,butter,yogurt,domestic eggs}"
[111] "{tropical fruit,pip fruit,yogurt,frozen meals}"
[112] "{pip fruit,root vegetables,other vegetables,cream cheese }"
[113] "{other vegetables,butter,whipped/sour cream,napkins}"
[114] "{pastry,sweet spreads}"
[115] "{root vegetables,domestic eggs,coffee}"
[116] "{citrus fruit,tropical fruit,other vegetables,domestic eggs}"
[117] "{citrus fruit,tropical fruit,curd,yogurt}"
[118] "{domestic eggs,margarine,bottled beer}"
[119] "{whipped/sour cream,long life bakery product,napkins}"
[120] "{root vegetables,butter,cream cheese }"
[121] "{tropical fruit,other vegetables,butter,white bread}"
[122] "{tropical fruit,whole milk,butter,sliced cheese}"
[123] "{other vegetables,curd,whipped/sour cream,cream cheese }"
[124] "{liquor,red/blush wine}"
Itemsets in Consequent (RHS)
[1] "{other vegetables}" "{root vegetables}" "{bottled beer}"
[4] "{yogurt}" "{whole milk}"
In [24]:
# select the 5 rules with the highest lift
highLiftRules <- head(sort(rules, by="lift"), 5)
In [26]:
highLiftRules
Out[26]:
set of 5 rules
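The filtering and sorting steps used for confidentRules and highLiftRules follow the usual subset-then-order pattern. The base-R sketch below shows the same pattern on a plain data frame; the rule names and quality values are made up for illustration and are not results from Groceries.

```r
# Hypothetical quality table, standing in for quality(rules)
quality_df <- data.frame(
  rule = c("A => B", "C => D", "E => F", "G => H"),
  confidence = c(0.95, 0.62, 0.91, 0.70),
  lift = c(3.1, 11.2, 2.5, 4.8),
  stringsAsFactors = FALSE
)

# Keep rows whose confidence exceeds 0.9 (analogous to confidentRules)
confident <- quality_df[quality_df$confidence > 0.9, ]

# Order by lift, largest first, and keep the top 2 (analogous to highLiftRules)
top2 <- head(quality_df[order(quality_df$lift, decreasing = TRUE), ], 2)

print(confident)
print(top2)
```

Printing a rules object only reports its size; with arules loaded, inspect(highLiftRules) lists the rules themselves along with their quality measures.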
In [27]:
plot(highLiftRules, method="graph", control=list(type="items"))
Content source: beibeiyang/ipynbdemo