Pattern Mining: Market Basket Analysis

Author

Dr. Thiyanga S. Talagala
Department of Statistics, Faculty of Applied Sciences
University of Sri Jayewardenepura, Sri Lanka

Market Basket Analysis

  • Affinity analysis

  • Unsupervised learning

  • Frequent itemset mining: discovers which groups of products tend to be purchased together.


Basic concepts


Transaction dataset

TID   Items
1     i1, i2, i5
2     i2, i4
3     i2, i3
4     i1, i2, i4
5     i1, i3
6     i2, i3
7     i1, i3
8     i1, i2, i3, i5
9     i1, i2, i3

Itemset: a set of one or more items, for example {i1, i2}.


Suppose we have 100 items. Find the total number of itemsets.
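One way to approach this question: each item is either in an itemset or not, so n items give 2^n subsets, and 2^n - 1 if the empty set is not counted (an assumption here, since the slide does not say whether the empty itemset counts). The sketch below checks that formula by brute force on 5 items and then evaluates it for 100; the function name is illustrative.

```python
from itertools import combinations


def count_nonempty_itemsets(n_items: int) -> int:
    """Number of non-empty subsets of a set with n_items elements."""
    return 2 ** n_items - 1


# Brute-force check on a small case: enumerate every non-empty itemset over 5 items.
items = ["i1", "i2", "i3", "i4", "i5"]
enumerated = [
    combo
    for size in range(1, len(items) + 1)
    for combo in combinations(items, size)
]
assert len(enumerated) == count_nonempty_itemsets(5)

print(count_nonempty_itemsets(5))    # 31
print(count_nonempty_itemsets(100))  # roughly 1.27e30
```

The count grows exponentially with the number of items, which is why checking every itemset directly is not practical for realistic catalogues.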


Association rule

\[ \text{Milk} \Rightarrow \text{Bread} \quad \text{[Support = 2\%, Confidence = 60\%]} \]

  • IF (antecedent): Milk

  • THEN (consequent): Bread

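  • Support and confidence measure the strength of the association between the antecedent and consequent itemsets. In the rule above, 2% of all transactions contain both milk and bread, and 60% of the transactions that contain milk also contain bread.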
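As a minimal sketch of these two measures, the code below uses the nine-transaction dataset from the Basic concepts section. Support is taken as the fraction of transactions containing an itemset, and confidence as the support count of antecedent and consequent together divided by the support count of the antecedent. The function names are illustrative, not taken from any library.

```python
# The nine transactions from the Basic concepts section.
transactions = [
    {"i1", "i2", "i5"},
    {"i2", "i4"},
    {"i2", "i3"},
    {"i1", "i2", "i4"},
    {"i1", "i3"},
    {"i2", "i3"},
    {"i1", "i3"},
    {"i1", "i2", "i3", "i5"},
    {"i1", "i2", "i3"},
]


def support_count(itemset, transactions):
    """Number of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(1 for basket in transactions if itemset <= basket)


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    return support_count(itemset, transactions) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Of the transactions containing the antecedent, the fraction that also contain the consequent."""
    both = set(antecedent) | set(consequent)
    return support_count(both, transactions) / support_count(antecedent, transactions)


print(support({"i1", "i2"}, transactions))             # 4/9, about 0.44
print(confidence({"i1"}, {"i2"}, transactions))        # 4/6, about 0.67
print(confidence({"i5"}, {"i1", "i2"}, transactions))  # 2/2 = 1.0
```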


Apriori algorithm

Desired support count: 2 (2 of the 9 transactions, about 22%)

Desired confidence: 70%


Step 1:

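Translate the data into a binary incidence matrix: one row per transaction and one column per item, with 1 if the item appears in the transaction and 0 otherwise.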


Transaction dataset

TID   Items
1     i1, i2, i5
2     i2, i4
3     i2, i3
4     i1, i2, i4
5     i1, i3
6     i2, i3
7     i1, i3
8     i1, i2, i3, i5
9     i1, i2, i3
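Step 1 could be sketched in plain Python as follows. This is an illustrative version using only the standard library; the variable names are assumptions, and in practice a data-frame library would usually do this job.

```python
# Transactions keyed by TID, as in the table above.
transactions = {
    1: ["i1", "i2", "i5"],
    2: ["i2", "i4"],
    3: ["i2", "i3"],
    4: ["i1", "i2", "i4"],
    5: ["i1", "i3"],
    6: ["i2", "i3"],
    7: ["i1", "i3"],
    8: ["i1", "i2", "i3", "i5"],
    9: ["i1", "i2", "i3"],
}

# Column order: every distinct item, sorted by name.
items = sorted({item for basket in transactions.values() for item in basket})

# One row per transaction: 1 if the item is in the basket, 0 otherwise.
matrix = {
    tid: [1 if item in basket else 0 for item in items]
    for tid, basket in transactions.items()
}

print("TID  " + "  ".join(items))
for tid, row in matrix.items():
    print(f"{tid:<4} " + "   ".join(str(value) for value in row))

# Column sums give the support count of each single item.
item_counts = {item: sum(row[j] for row in matrix.values()) for j, item in enumerate(items)}
print(item_counts)
```

The column sums of the matrix are the support counts of the single items, which is the starting point for Step 2.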

Step 2:

Select the itemsets whose support count is at least 2 (the frequent itemsets).
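A compact sketch of the level-wise search behind this step is given below. It follows the usual Apriori outline (join frequent (k-1)-itemsets into size-k candidates, prune candidates that have an infrequent subset, then count), but it is a teaching version written for this dataset, not an optimized or library implementation. The function name `apriori` and its parameters are chosen here for illustration.

```python
from itertools import combinations

transactions = [
    frozenset(basket)
    for basket in (
        ["i1", "i2", "i5"], ["i2", "i4"], ["i2", "i3"],
        ["i1", "i2", "i4"], ["i1", "i3"], ["i2", "i3"],
        ["i1", "i3"], ["i1", "i2", "i3", "i5"], ["i1", "i2", "i3"],
    )
]


def apriori(transactions, min_count):
    """Return {itemset: support count} for every itemset with count >= min_count."""
    # Level 1: count single items.
    counts = {}
    for basket in transactions:
        for item in basket:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: c for s, c in counts.items() if c >= min_count}
    frequent = dict(current)

    k = 2
    while current:
        # Join: unions of two frequent (k-1)-itemsets that have exactly k items.
        candidates = {a | b for a, b in combinations(list(current), 2) if len(a | b) == k}
        # Prune: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(sub) in current for sub in combinations(c, k - 1))
        }
        # Count: keep candidates that reach the minimum support count.
        current = {}
        for candidate in candidates:
            n = sum(1 for basket in transactions if candidate <= basket)
            if n >= min_count:
                current[candidate] = n
        frequent.update(current)
        k += 1
    return frequent


frequent = apriori(transactions, min_count=2)
for itemset, count in sorted(frequent.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), count)
```

On this dataset the search stops after the 3-itemsets: {i1, i2, i3} and {i1, i2, i5} each appear in 2 transactions, and the only 4-item candidate is pruned because {i1, i3, i5} is not frequent.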


Step 3:

Generate association rules: compute confidence and lift
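Rule generation can be sketched as below: every frequent itemset with at least two items is split into an antecedent and a consequent in all possible ways, and rules that reach the 70% confidence threshold are kept. To stay self-contained, this sketch finds the frequent itemsets by brute-force enumeration, which is feasible only because there are 5 items; that shortcut and all names are assumptions for illustration. Lift is computed as confidence divided by the support of the consequent.

```python
from itertools import combinations

transactions = [
    {"i1", "i2", "i5"}, {"i2", "i4"}, {"i2", "i3"},
    {"i1", "i2", "i4"}, {"i1", "i3"}, {"i2", "i3"},
    {"i1", "i3"}, {"i1", "i2", "i3", "i5"}, {"i1", "i2", "i3"},
]
n_transactions = len(transactions)
min_count, min_conf = 2, 0.70


def support_count(itemset):
    """Number of transactions containing every item in itemset."""
    return sum(1 for basket in transactions if itemset <= basket)


# Brute-force enumeration of frequent itemsets (fine for 5 items only).
items = sorted(set().union(*transactions))
frequent = {}
for size in range(1, len(items) + 1):
    for combo in combinations(items, size):
        itemset = frozenset(combo)
        count = support_count(itemset)
        if count >= min_count:
            frequent[itemset] = count

# Split each frequent itemset into antecedent => consequent.
rules = []
for itemset, count in frequent.items():
    for size in range(1, len(itemset)):
        for combo in combinations(sorted(itemset), size):
            antecedent = frozenset(combo)
            consequent = itemset - antecedent
            # Subsets of a frequent itemset are frequent, so both lookups succeed.
            conf = count / frequent[antecedent]
            lift = conf / (frequent[consequent] / n_transactions)
            if conf >= min_conf:
                rules.append((sorted(antecedent), sorted(consequent), conf, lift))

for antecedent, consequent, conf, lift in sorted(rules):
    print(f"{antecedent} => {consequent}  confidence={conf:.2f}  lift={lift:.2f}")
```

With these thresholds, six rules remain on this dataset, all with confidence 1.00; for example, i5 ⇒ {i1, i2} has lift 2.25.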


Confidence and Lift

In-class demonstration
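
For reference, the two measures can be written in terms of support using the standard definitions. The symbols A (antecedent) and B (consequent) are chosen here for illustration.

\[ \text{Confidence}(A \Rightarrow B) = \frac{\text{Support}(A \cup B)}{\text{Support}(A)} \]

\[ \text{Lift}(A \Rightarrow B) = \frac{\text{Confidence}(A \Rightarrow B)}{\text{Support}(B)} = \frac{\text{Support}(A \cup B)}{\text{Support}(A)\,\text{Support}(B)} \]

A worked case on the nine-transaction dataset, for the rule i5 ⇒ i1: {i1, i5} appears in 2 transactions, i5 in 2, and i1 in 6, so

\[ \text{Confidence} = \frac{2/9}{2/9} = 1, \qquad \text{Lift} = \frac{1}{6/9} = 1.5 \]

A lift above 1 suggests that the antecedent and consequent occur together more often than would be expected if they were independent; a lift near 1 suggests little association.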