Phil Vishnevsky

Week 12 Prep: Association Rules and Market-Basket Analysis

What Are Association Rules?

At its core, association rule learning is all about finding relationships between items in large collections of data—like customer purchases. It’s used to spot patterns, such as which products are often bought together, by using a set of metrics that help determine how meaningful or useful those patterns actually are.

A typical association rule looks something like this:

$$\{A, B\} \Rightarrow \{C\}$$

This means that if someone buys items A and B, there's a good chance they'll also buy item C.

To measure how strong or interesting a rule is, we usually look at three key metrics:

  1. Support – This tells us how often the itemset appears in the dataset.
$$\text{Support}(X) = \frac{\text{Number of transactions containing } X}{\text{Total number of transactions}}$$
  2. Confidence – This measures how likely item Y is to be bought when item X is bought.
$$\text{Confidence}(X \Rightarrow Y) = P(Y \mid X) = \frac{\text{Support}(X \cup Y)}{\text{Support}(X)}$$
💡 Confidence is derived from the definition of conditional probability: $P(E_Y \mid E_X) = \frac{P(E_X \cap E_Y)}{P(E_X)}$, where $E_X$ is the event that a transaction contains $X$. In this case, $X \cup Y$ is the combined itemset, so $\text{Support}(X \cup Y)$ counts the transactions that contain both $X$ and $Y$.
  3. Lift – This shows how much more likely item Y is bought when X is bought, compared to random chance.
$$\text{Lift}(X \Rightarrow Y) = \frac{\text{Support}(X \cup Y)}{\text{Support}(X) \times \text{Support}(Y)}$$
  • Lift = 1 means there's no relationship.
  • Lift > 1 suggests a positive association (they go together).
  • Lift < 1 suggests a negative association (they rarely go together).
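
To make the three metrics concrete, here is a minimal Python sketch that computes them straight from the definitions above. The transactions are invented for illustration, and the function names (`support`, `confidence`, `lift`) are just my own choices, not a library API.

```python
# Toy transactions, made up purely for illustration.
transactions = [
    {"bread", "milk"},
    {"bread", "diapers", "beer", "eggs"},
    {"milk", "diapers", "beer", "cola"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "cola"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(x, y, transactions):
    """Support(X ∪ Y) / Support(X), i.e. an estimate of P(Y | X)."""
    return support(set(x) | set(y), transactions) / support(x, transactions)


def lift(x, y, transactions):
    """Support(X ∪ Y) / (Support(X) * Support(Y))."""
    return support(set(x) | set(y), transactions) / (
        support(x, transactions) * support(y, transactions)
    )


x, y = {"diapers"}, {"beer"}
print(f"support    = {support(x | y, transactions):.2f}")
print(f"confidence = {confidence(x, y, transactions):.2f}")
print(f"lift       = {lift(x, y, transactions):.2f}")
```

On this toy data, diapers and beer appear together in 3 of 5 transactions, so the lift comes out slightly above 1, which would count as a mild positive association.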

Source: Wikipedia - Association rule learning

Market-Basket Analysis

Market-basket analysis is one of the most well-known uses of association rules. Retailers use it to analyze customer purchase behavior—basically figuring out what products tend to end up in the same shopping cart.

By discovering which products are frequently bought together, businesses can make smarter decisions about things like product placement, promotions, and cross-selling strategies.
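
As a rough idea of what "frequently bought together" means in practice, the sketch below counts how often each pair of products shares a basket and keeps the pairs above a support threshold. It is a brute-force illustration on made-up baskets, not an implementation of Apriori or any other specific algorithm; `pair_counts`, `frequent_pairs`, and the 0.6 threshold are hypothetical choices.

```python
from collections import Counter
from itertools import combinations

# Made-up shopping baskets for illustration only.
baskets = [
    {"bread", "milk"},
    {"bread", "diapers", "beer"},
    {"milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers"},
]


def pair_counts(baskets):
    """Count in how many baskets each unordered pair of items co-occurs."""
    counts = Counter()
    for basket in baskets:
        # Sorting gives every pair a single canonical (alphabetical) order.
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    return counts


def frequent_pairs(baskets, min_support):
    """Return {pair: support} for pairs whose support is at least min_support."""
    n = len(baskets)
    return {
        pair: count / n
        for pair, count in pair_counts(baskets).items()
        if count / n >= min_support
    }


for pair, s in sorted(frequent_pairs(baskets, min_support=0.6).items()):
    print(pair, f"{s:.2f}")
```

Checking every pair like this gets expensive quickly as the catalog grows, which is why real tools prune the candidates instead of enumerating them all.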

Example

There's a famous example that comes up a lot: a grocery store chain supposedly discovered that men who bought diapers on Thursday and Saturday evenings also tended to buy beer. That would translate into a rule like:

$$\{\text{Diapers}, \text{Saturday}\} \Rightarrow \{\text{Beer}\}$$

Let's say the data for that rule looks like this:

  • Support: 2% of all transactions include both diapers and beer on Saturdays.
  • Confidence: 60% of the people who bought diapers on Saturday also bought beer.
  • Lift: 3.0, meaning they're three times more likely to buy beer than the average shopper.
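
As a quick sanity check, the three numbers above pin down the two individual supports. Taking $X = \{\text{Diapers}, \text{Saturday}\}$ and $Y = \{\text{Beer}\}$, and treating the example figures as exact (they are illustrative, not real data), the definitions give:

```latex
\begin{align*}
\text{Support}(X) &= \frac{\text{Support}(X \cup Y)}{\text{Confidence}(X \Rightarrow Y)}
                   = \frac{0.02}{0.60} \approx 0.033 \\
\text{Support}(Y) &= \frac{\text{Confidence}(X \Rightarrow Y)}{\text{Lift}(X \Rightarrow Y)}
                   = \frac{0.60}{3.0} = 0.20 \\
\text{Lift}(X \Rightarrow Y) &= \frac{\text{Support}(X \cup Y)}{\text{Support}(X) \times \text{Support}(Y)}
                   = \frac{0.02}{0.033 \times 0.20} \approx 3.0
\end{align*}
```

In other words, the example implies that about 3.3% of all transactions are Saturday diaper purchases and that 20% of all transactions include beer, which is where the "three times more likely" comes from (60% versus 20%).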

So, what would the store do with this info? They might place beer near the diaper aisle to boost sales (although, as the story goes, the store never actually did).

Source: Wikipedia - Market basket analysis