The Apriori algorithm is a data mining algorithm for finding frequent itemsets and learning association rules over transactional databases. It first identifies the individual items that occur frequently in the database and then extends them to larger itemsets, for example collections of items that customers buy together. Let's take a simple example.
Suppose you have records of a large number of transactions at a shopping center, as follows:
Transactions | Items bought |
T1 | A, B, C |
T2 | A, B |
T3 | B, C, D |
T4 | A, B, F |
In the table above, you can see that A and B are frequently bought together: they appear together in three of the four transactions.
How does this algorithm work?
Step 1: Count the occurrences of each item: A: 3, B: 4, C: 2, D: 1, F: 1.
Step 2: Remove the items whose counts are very low, here D and F.
Step 3: Count the pairwise occurrences of the remaining items and create a table.
PS: Depending on what you or your clients need, this step can be repeated for larger groups such as A, B, C instead of stopping at pairs such as A, B.
Output: AB: 3, BC: 2, AC: 1
Step 4: Declare the most frequently occurring itemset, here AB.
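The steps above can be sketched in C roughly as follows. This is only a minimal illustration, not a full Apriori implementation: it stops at pairs, and the names `EXAMPLE_TX`, `support_count`, `most_frequent_pair` and the `min_count` threshold are assumptions made for this example. Each transaction from the table is encoded as a bitmask (bit 0 = A, bit 1 = B, and so on up to F).

```c
#include <assert.h>
#include <stddef.h>

#define N_ITEMS 6   /* items A..F; an assumption matching the example table */

/* The four example transactions as bitmasks: bit 0 = A, bit 1 = B, ... */
const unsigned EXAMPLE_TX[4] = {
    0x07, /* T1: A, B, C */
    0x03, /* T2: A, B    */
    0x0E, /* T3: B, C, D */
    0x23  /* T4: A, B, F */
};

/* Number of transactions that contain every item in `itemset`. */
int support_count(const unsigned *tx, int n_tx, unsigned itemset)
{
    int i, n = 0;
    for (i = 0; i < n_tx; i++)
        if ((tx[i] & itemset) == itemset)
            n++;
    return n;
}

/* Returns the most frequent pair (as a bitmask) among the items that occur
   at least `min_count` times, or 0 if there is none. The pair's count is
   stored in *best_count. */
unsigned most_frequent_pair(const unsigned *tx, int n_tx, int min_count,
                            int *best_count)
{
    unsigned frequent = 0, best = 0;
    int a, b;

    assert(best_count != NULL);
    *best_count = 0;

    /* Steps 1 and 2: count single items and keep only the frequent ones. */
    for (a = 0; a < N_ITEMS; a++)
        if (support_count(tx, n_tx, 1u << a) >= min_count)
            frequent |= 1u << a;

    /* Step 3: count pairs built from the surviving items only.
       Step 4: remember the pair with the highest count. */
    for (a = 0; a < N_ITEMS; a++) {
        for (b = a + 1; b < N_ITEMS; b++) {
            unsigned pair = (1u << a) | (1u << b);
            int c;
            if ((frequent & pair) != pair)
                continue;
            c = support_count(tx, n_tx, pair);
            if (c > *best_count) {
                *best_count = c;
                best = pair;
            }
        }
    }
    return best;
}
```

With an assumed threshold of 2, calling `most_frequent_pair(EXAMPLE_TX, 4, 2, &count)` returns `0x03` (the pair A, B) and sets `count` to 3. Skipping pairs that contain an infrequent item is what keeps the pair table small: a pair cannot occur more often than either of its items.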
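The following C program covers Step 1. It expects a text file with one transaction per line, each line being a row of space-separated 0/1 flags (1 means the item in that column was bought), and prints how often each item occurs.

    #include <stdio.h>
    #include <stdlib.h>

    int main(int argc, char *argv[])
    {
        FILE *fin;
        int i, cols, ch, *count;

        if (argc != 2) {
            fprintf(stderr, "Usage: %s <transactions-file>\n", argv[0]);
            return 1;
        }

        fin = fopen(argv[1], "r");
        if (fin == NULL) {
            perror(argv[1]);
            return 1;
        }

        /* Find the number of columns (items) from the first line. */
        cols = 0;
        while ((ch = fgetc(fin)) != EOF && ch != '\n') {
            if (ch == '0' || ch == '1')
                cols++;
        }
        printf("\nNumber of columns = %d\n", cols);
        if (cols == 0) {
            fprintf(stderr, "No items found in the first line\n");
            fclose(fin);
            return 1;
        }

        /* Count the occurrences of each item (1-item sets). */
        count = calloc(cols, sizeof(int));
        if (count == NULL) {
            fclose(fin);
            return 1;
        }
        rewind(fin);
        i = 0;
        while ((ch = fgetc(fin)) != EOF) {
            if (ch == '\n') {
                i = 0;                  /* next transaction starts at column 0 */
            } else if (ch == '0' || ch == '1') {
                if (ch == '1' && i < cols)
                    count[i]++;
                i++;
            }
        }
        fclose(fin);

        /* Counting of 1-item sets completed. */
        printf("\n1-item frequent item sets..\n");
        for (i = 0; i < cols; i++)
            printf("\n%d -> %d", i + 1, count[i]);
        printf("\n");

        free(count);
        return 0;
    }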
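The program above only counts single items. As a rough sketch of how Step 3 could work on the same kind of 0/1 matrix file, the function below counts how often each pair of columns is set in the same row. The function name `count_pairs`, the `MAX_COLS` limit and the flat `pair_count` layout are assumptions made for this illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_COLS 64   /* hypothetical upper bound on the number of items */

/* Reads rows of space-separated 0/1 flags from `fin`. For every pair of
   columns i < j that are both 1 in a row, increments
   pair_count[i * cols + j]. The caller zero-initializes pair_count, which
   must hold cols * cols ints. Returns the number of rows read. */
int count_pairs(FILE *fin, int cols, int *pair_count)
{
    int row[MAX_COLS];
    int ch, i, j, n = 0, rows = 0;

    assert(cols > 0 && cols <= MAX_COLS);
    memset(row, 0, sizeof row);

    for (;;) {
        ch = fgetc(fin);
        if (ch == '0' || ch == '1') {
            if (n < cols)
                row[n] = (ch == '1');
            n++;
        } else if (ch == '\n' || ch == EOF) {
            if (n > 0) {                /* a non-empty row just ended */
                for (i = 0; i < cols; i++)
                    for (j = i + 1; j < cols; j++)
                        if (row[i] && row[j])
                            pair_count[i * cols + j]++;
                rows++;
            }
            n = 0;
            memset(row, 0, sizeof row);
            if (ch == EOF)
                break;
        }
        /* any other character (spaces, '\r') is treated as a separator */
    }
    return rows;
}
```

For the example table written as a matrix with columns A, B, C, D, F, the entry for columns 0 and 1 (A and B) ends up as 3. This sketch counts every pair; a fuller version would first drop the columns removed in Step 2.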