This is Part 3 of the series, which shows how to perform association rule mining with the R packages arules and arulesViz. To run the scripts here, you must first complete Part 1 and Part 2.


The Basket Data

In Part 2 (Read Transaction Data), we read the following five shopping baskets into R as an object of the transactions class.

f,a,c,d,g,l,m,p

a,b,c,f,l,m,o

b,f,h,j,o

b,c,k,s,p

a,f,c,e,l,p,m,n


Generate the Frequent k-Itemsets

The apriori function implements the Apriori algorithm to create frequent itemsets. By default, the apriori function iterates over every value of k, up to the maxlen limit. Here, k is the length of an itemset, that is, the number of items it contains.

To find the frequent 1-itemsets, we can set the minimum support to 0.5, minlen to 1 and maxlen to 1. The parameter target is set to "frequent itemsets".

The following script stores in itemsets all the 1-itemsets whose support is at least 0.5.

#all the 1-itemsets having at least a support of 0.5
itemsets <- apriori(
  transactions, 
  parameter = list(minlen=1, maxlen=1, support=0.5, target="frequent itemsets")
)
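
As a cross-check on what support means here, the same 1-itemset supports can be computed by hand in base R. This is only a sketch of the counting step for k = 1, not how arules works internally; the names baskets, item_counts, item_support and frequent are illustrative and do not come from Part 2.

```r
# The five baskets from this section, as a plain list of character vectors.
baskets <- list(
  c("f", "a", "c", "d", "g", "l", "m", "p"),
  c("a", "b", "c", "f", "l", "m", "o"),
  c("b", "f", "h", "j", "o"),
  c("b", "c", "k", "s", "p"),
  c("a", "f", "c", "e", "l", "p", "m", "n")
)

# Count the number of baskets containing each item.
# unique() guards against an item being listed twice in one basket.
item_counts <- table(unlist(lapply(baskets, unique)))

# Support of an item = baskets containing it / total number of baskets.
item_support <- item_counts / length(baskets)

# Keep only the items that reach the 0.5 threshold, highest support first.
frequent <- sort(item_support[item_support >= 0.5], decreasing = TRUE)
print(frequent)
```

Items that occur in fewer than half of the baskets, such as o (2 of 5 baskets, support 0.4), fall below the threshold and are dropped.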

A Summary of the Frequent k-Itemsets

To display a summary of the frequent 1-itemsets, run summary with itemsets.

summary(itemsets)
## set of 7 itemsets
## 
## most frequent items:
##       a       b       c       f       l (Other) 
##       1       1       1       1       1       2 
## 
## element (itemset/transaction) length distribution:sizes
## 1 
## 7 
## 
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##       1       1       1       1       1       1 
## 
## summary of quality measures:
##     support           count      
##  Min.   :0.6000   Min.   :3.000  
##  1st Qu.:0.6000   1st Qu.:3.000  
##  Median :0.6000   Median :3.000  
##  Mean   :0.6571   Mean   :3.286  
##  3rd Qu.:0.7000   3rd Qu.:3.500  
##  Max.   :0.8000   Max.   :4.000  
## 
## includes transaction ID lists: FALSE 
## 
## mining info:
##          data ntransactions support confidence
##  transactions             5     0.5          1

The summary shows 7 frequent 1-itemsets, whose support ranges from 0.6 to 0.8 (a count of 3 to 4 out of 5 transactions).
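
The same figures can also be read programmatically rather than from the printed summary. The sketch below is self-contained, so it rebuilds the transactions with as(baskets, "transactions"), which is an assumption standing in for the file-reading step of Part 2. The names baskets, n_itemsets and support_range are illustrative, and control = list(verbose = FALSE) is added only to silence the progress output.

```r
library(arules)

# Stand-in for Part 2: build the transactions object from an in-memory list.
baskets <- list(
  c("f", "a", "c", "d", "g", "l", "m", "p"),
  c("a", "b", "c", "f", "l", "m", "o"),
  c("b", "f", "h", "j", "o"),
  c("b", "c", "k", "s", "p"),
  c("a", "f", "c", "e", "l", "p", "m", "n")
)
transactions <- as(baskets, "transactions")

# Same mining call as in this section, with progress output turned off.
itemsets <- apriori(
  transactions,
  parameter = list(minlen = 1, maxlen = 1, support = 0.5,
                   target = "frequent itemsets"),
  control = list(verbose = FALSE)
)

# Number of frequent itemsets found.
n_itemsets <- length(itemsets)

# quality() returns a data frame of measures; take min and max support.
support_range <- range(quality(itemsets)$support)

print(n_itemsets)
print(support_range)
```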

The Top-N Frequent k-Itemsets

To print all 1-itemsets in descending order of support:

#print all 1-itemsets in descending order of support
inspect(sort(itemsets, by="support"))
##     items support count
## [1] {f}   0.8     4    
## [2] {c}   0.8     4    
## [3] {b}   0.6     3    
## [4] {p}   0.6     3    
## [5] {a}   0.6     3    
## [6] {m}   0.6     3    
## [7] {l}   0.6     3

To print only the top-5 1-itemsets in descending order of support:

#print top-5 1-itemsets in descending order of support
inspect(head(sort(itemsets, by="support"), 5))
##     items support count
## [1] {f}   0.8     4    
## [2] {c}   0.8     4    
## [3] {b}   0.6     3    
## [4] {p}   0.6     3    
## [5] {a}   0.6     3

Exercise

Write a script that returns all the 2-itemsets whose support is at least a chosen minimum support threshold, finds the minimum support, the maximum support and the number of frequent 2-itemsets, and prints all the itemsets.