Association Rules
Association rules are if-then statements that capture co-occurrence relationships between itemsets in transactional data. Each rule has an antecedent (the "if" itemset) and a consequent (the "then" itemset), and is evaluated with metrics such as support, confidence, and lift.
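Before mining, it helps to see how the three standard rule metrics are computed. A minimal sketch on toy data (hypothetical items, not the groceries dataset) for the rule {milk} -> {bread}:

```python
# Toy transactions to illustrate support, confidence and lift
transactions = [
    {"milk", "bread"},
    {"milk", "bread", "butter"},
    {"milk"},
    {"bread", "butter"},
    {"milk", "bread"},
]
n = len(transactions)

support_milk = sum("milk" in t for t in transactions) / n             # 4/5 = 0.8
support_bread = sum("bread" in t for t in transactions) / n           # 4/5 = 0.8
support_both = sum({"milk", "bread"} <= t for t in transactions) / n  # 3/5 = 0.6

confidence = support_both / support_milk  # P(bread | milk) = 0.75
lift = confidence / support_bread         # 0.75 / 0.8 = 0.9375

print(support_both, confidence, lift)
```

A lift below 1, as here, means buying milk actually makes bread slightly *less* likely than its baseline frequency; the min_lift threshold used later filters out such weak rules.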
Apriori Algorithm
Mine frequent itemsets and association rules using the Apriori algorithm. Apriori employs a level-wise (breadth-first) search: frequent itemsets of size k are joined to generate candidate itemsets of size k+1, and any candidate with an infrequent subset is pruned.
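The level-wise search can be sketched in plain Python (a simplified illustration, not the apyori implementation): frequent k-itemsets are joined to form size-(k+1) candidates, and anything below min_support is dropped at each level.

```python
def apriori_frequent_itemsets(transactions, min_support):
    """Level-wise Apriori sketch: frequent k-itemsets are built only
    from frequent (k-1)-itemsets, shrinking the search space per level."""
    n = len(transactions)
    # Level 1: frequent single items
    items = sorted({i for t in transactions for i in t})
    current = [frozenset([i]) for i in items
               if sum(i in t for t in transactions) / n >= min_support]
    frequent = list(current)
    k = 2
    while current:
        # Candidate generation: unions of frequent (k-1)-itemsets of size k
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Support counting: keep only candidates meeting min_support
        current = [c for c in candidates
                   if sum(c <= t for t in transactions) / n >= min_support]
        frequent.extend(current)
        k += 1
    return frequent

# Toy example (hypothetical data, not the groceries file)
txns = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
print(apriori_frequent_itemsets(txns, min_support=0.6))
```

Here every single item and every pair is frequent at min_support=0.6, but the triple {a, b, c} appears in only 2 of 5 transactions and is pruned.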
Example
Association Rule Mining in Python
Read the data and convert each row into a transaction
import pandas as pd
data = pd.read_csv('groceries - groceries.csv', na_values=" ")
data = data.iloc[:, 1:]
transactions = []
for i in range(data.shape[0]):
    transactions.append([str(data.values[i, j]) for j in range(data.shape[1])
                         if not pd.isna(data.iloc[i, j])])
Apply apriori algorithm and convert the rules obtained into a list
from apyori import apriori
# apyori reads min_support, min_confidence, min_lift and max_length;
# a min_length keyword is silently ignored, so it is omitted here
rules = apriori(transactions, min_support=0.004, min_confidence=0.2, min_lift=3)
results = list(rules)
Iterate through the results and create a data frame of support, lift, confidence, items, antecedent, consequent, count
results_df = pd.DataFrame(columns=('Items', 'Antecedent', 'Consequent', 'Support', 'Confidence', 'Lift'))
Support = []
Confidence = []
Lift = []
Items = []
Antecedent = []
Consequent = []
for record in results:
    for ordered_stat in record.ordered_statistics:
        Support.append(record.support)
        Items.append(record.items)
        Antecedent.append(ordered_stat.items_base)
        Consequent.append(ordered_stat.items_add)
        Confidence.append(ordered_stat.confidence)
        Lift.append(ordered_stat.lift)
results_df['Items'] = list(map(set, Items))
results_df['Antecedent'] = list(map(set, Antecedent))
results_df['Consequent'] = list(map(set, Consequent))
results_df['Support'] = Support
results_df['Confidence'] = Confidence
results_df['Lift'] = Lift
results_df.sort_values(by='Confidence', ascending=False, inplace=True)
results_df.reset_index(inplace=True, drop=True)
results_df.head()
Output:
Items ... Lift
0 {root vegetables, other vegetables, citrus fru... ... 4.060694
1 {root vegetables, other vegetables, citrus fru... ... 3.273165
2 {pip fruit, other vegetables, root vegetables,... ... 3.171368
3 {yogurt, other vegetables, root vegetables, tr... ... 3.165495
4 {pip fruit, other vegetables, whipped/sour cream} ... 3.123610
[5 rows x 6 columns]
Apply the algorithm and get predictions on unseen data
test_df = pd.read_csv('groceries_test_data.csv', header=None)
test_df = test_df.iloc[:, 1:]
test_list = list(test_df.iloc[135, :])
test_list[0:5]  # inspect the first few items of this basket
test_X = test_list[0:2]  # take the first two items as a partial basket
# Keep rules whose first len(test_X) antecedent items all occur in test_X
predictions = results_df[pd.DataFrame(results_df.Antecedent.tolist()).iloc[:, 0:len(test_X)].isin(test_X).all(axis='columns')]
predictions.shape
predictions.reset_index(drop=True, inplace=True)
predictions[['Consequent', "Confidence"]]
Output:
Consequent Confidence
0 {other vegetables} 0.785714
1 {other vegetables} 0.586207
2 {tropical fruit} 0.321839
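The positional isin filter above matches antecedent items by column position, which is fragile when antecedents vary in length or order. A set-subset test is a more robust way to match rules against a basket; a minimal sketch with hypothetical rule data:

```python
import pandas as pd

# Hypothetical rules table (same columns as results_df above)
rules_df = pd.DataFrame({
    "Antecedent": [{"citrus fruit", "root vegetables"}, {"yogurt"}],
    "Consequent": [{"other vegetables"}, {"tropical fruit"}],
    "Confidence": [0.78, 0.32],
})

basket = {"citrus fruit", "root vegetables", "soda"}
# A rule fires when its whole antecedent set is contained in the basket
matches = rules_df[rules_df["Antecedent"].apply(lambda a: a <= basket)]
print(matches[["Consequent", "Confidence"]])
```

Here only the first rule fires, because {citrus fruit, root vegetables} is a subset of the basket while {yogurt} is not, regardless of item order.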
For the full code file, refer here: https://www.cluzters.ai/vault/274/1030/apriori-algorithm-code