International Conference on Advances in Information Technology - AIT 2012
Author(s) : AKSHAY SHARMA, ALPA RESHAMWALA, DIVYA VINEET, NISHA SHARMA, PARSHWA SHAH, SUNITA MAHAJAN
Sequential pattern mining is an important data mining problem with broad applications. In this paper, we have implemented Apriori, a candidate generation algorithm, and the SPAM (Sequential Pattern Mining) algorithm on the Yahoo! Music dataset from KDD Cup 2011. KDD Cup is the annual Data Mining and Knowledge Discovery competition organized by the ACM Special Interest Group on Knowledge Discovery and Data Mining, the leading professional organization of data miners. Yahoo! Music has amassed billions of user ratings for musical pieces. When properly analyzed, the raw ratings encode information on how the popularity of songs, albums and artists varies over time and, above all, which songs users would like to listen to. Such an analysis introduces new scientific challenges. From the discovered patterns, we can learn which music sequences are frequently heard and in what order they are recommended. Experimental results have shown that SPAM performs well on large datasets such as the Yahoo! Music dataset, owing to its bitmap representation of the data, which allows efficient counting.
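
To illustrate why a bitmap representation makes support counting cheap, the following is a minimal sketch of bitmap-based sequence extension in the spirit of SPAM. It is not the implementation used in the paper. It assumes that each listening event holds a single item, that Python integers serve as per-user bitmaps, and that the function names (build_bitmaps, s_step, pattern_support) are hypothetical. The full SPAM algorithm also handles itemsets and explores candidates depth-first with pruning, which this sketch omits.

```python
from collections import defaultdict


def build_bitmaps(sequences):
    """Map each item to one integer bitmap per sequence.

    Bit i of bitmaps[item][sid] is set when `item` occurs at position i
    of sequence `sid`.
    """
    bitmaps = defaultdict(lambda: [0] * len(sequences))
    for sid, seq in enumerate(sequences):
        for pos, item in enumerate(seq):
            bitmaps[item][sid] |= 1 << pos
    return bitmaps


def s_step(prefix_bits, item_bits, lengths):
    """Extend a pattern by one item that must occur strictly later.

    For each sequence, keep only positions after the earliest match of
    the prefix, then AND with the bitmap of the appended item.
    """
    out = []
    for p, b, n in zip(prefix_bits, item_bits, lengths):
        if p == 0:
            out.append(0)  # prefix never matched in this sequence
            continue
        first = (p & -p).bit_length() - 1  # earliest matching position
        after = ((1 << n) - 1) & ~((1 << (first + 1)) - 1)
        out.append(after & b)
    return out


def support(bits):
    """Number of sequences whose bitmap has at least one set bit."""
    return sum(1 for b in bits if b)


def pattern_support(pattern, sequences):
    """Count the sequences that contain `pattern` as an ordered subsequence."""
    bitmaps = build_bitmaps(sequences)
    lengths = [len(s) for s in sequences]
    empty = [0] * len(sequences)
    current = bitmaps.get(pattern[0], empty)
    for item in pattern[1:]:
        current = s_step(current, bitmaps.get(item, empty), lengths)
    return support(current)
```

Under these assumptions, calling pattern_support(["a", "b"], sequences) on a list of per-user item sequences returns the number of users who heard "a" and later "b". Each extension costs a few bitwise operations per user rather than a rescan of the raw sequences, which is the property the abstract credits for SPAM's performance on large datasets.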