AN OVERVIEW OF DATA STRUCTURES AND ALGORITHMS: CASE STUDY OF USE IN THE VECTOR-SPACE MODEL AND MINING OF FREQUENT ITEMSETS USING THE APRIORI ALGORITHM
DOI:
https://doi.org/10.4314/njt.364.1491
Keywords:
data structures, algorithms, vector-space model, frequent itemsets mining, apriori algorithm.
Abstract
In this paper, we review some commonly used data structures and algorithms. We then review two important problems: the creation of the vector-space model, which is widely used in the design of information retrieval systems, and the mining of frequent itemsets using the apriori algorithm. We consider two variations of the apriori algorithm. The first is the classical algorithm, which computes candidate k-itemsets by joining the frequent (k-1)-itemsets with themselves and then applying the apriori property to prune the generated candidates. The second avoids the join stage of the classical algorithm and instead generates candidate k-itemsets directly from the rows of the transaction database, then applies the apriori property to prune each itemset so generated. Finally, we illustrate appropriate data structures and algorithms that, when put together, provide efficient implementations of our solutions to these problems.
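The two candidate-generation strategies described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the function names (`join_and_prune`, `candidates_from_rows`, `count_support`, `apriori`) are hypothetical, itemsets are represented as Python `frozenset` objects, and the minimum support is assumed to be an absolute transaction count. The row-based variant in particular is only one plausible reading of "generates candidate k-itemsets directly from the rows".

```python
from itertools import combinations


def join_and_prune(frequent_prev, k):
    """Classical step: join frequent (k-1)-itemsets with themselves,
    then prune with the apriori property."""
    prev = set(frequent_prev)
    candidates = set()
    for a in prev:
        for b in prev:
            union = a | b
            if len(union) != k:
                continue
            # Apriori property: every (k-1)-subset must itself be frequent.
            if all(frozenset(s) in prev for s in combinations(union, k - 1)):
                candidates.add(union)
    return candidates


def candidates_from_rows(transactions, frequent_prev, k):
    """Assumed variant: enumerate the k-subsets of each transaction row,
    then prune with the apriori property (no join stage)."""
    prev = set(frequent_prev)
    candidates = set()
    for row in transactions:
        for combo in combinations(sorted(row), k):
            if all(frozenset(s) in prev for s in combinations(combo, k - 1)):
                candidates.add(frozenset(combo))
    return candidates


def count_support(transactions, candidates):
    """Count how many transactions contain each candidate itemset."""
    counts = {c: 0 for c in candidates}
    for row in transactions:
        for c in candidates:
            if c <= row:
                counts[c] += 1
    return counts


def apriori(transactions, min_support, generate="join"):
    """Return a dict mapping each frequent itemset to its support count.
    min_support is an absolute count (an assumption of this sketch)."""
    transactions = [frozenset(t) for t in transactions]
    singles = {frozenset([i]) for t in transactions for i in t}
    counts = count_support(transactions, singles)
    frequent = {s: n for s, n in counts.items() if n >= min_support}
    result = dict(frequent)
    k = 2
    while frequent:
        if generate == "join":
            cands = join_and_prune(frequent, k)
        else:
            cands = candidates_from_rows(transactions, frequent, k)
        counts = count_support(transactions, cands)
        frequent = {s: n for s, n in counts.items() if n >= min_support}
        result.update(frequent)
        k += 1
    return result
```

Calling `apriori(rows, 3)` and `apriori(rows, 3, generate="rows")` on the same list of transactions should yield the same frequent itemsets, since both strategies apply the same pruning rule and the same support count; they differ only in how candidates are enumerated. The support counting here is a naive scan, whereas an efficient implementation would likely rely on hashing or tree structures of the kind the paper surveys.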
License
The contents of the articles are the sole opinion of the author(s) and not of NIJOTECH.
NIJOTECH allows open-access distribution of published articles in any medium, provided that articles are distributed whole and not in part.
A copyright form and a statement of originality must be clearly filled out and signed before an accepted article is published. The Copyright form can be downloaded from http://nijotech.com/downloads/COPYRIGHT%20FORM.pdf and the Statement of Originality from http://nijotech.com/downloads/Statement%20of%20Originality.pdf
For articles that were developed from funded research, a clear acknowledgement of such support should be mentioned in the article with relevant references. Authors are expected to provide complete information on the sponsorship and intellectual property rights of the article together with all exceptions.
It is forbidden to publish the same research report in more than one journal.