Date of Award

Fall 2002

Document Type

Legacy Thesis

Degree Name

Bachelor of Science (BS)

Department

Computing Sciences

College

College of Science

First Advisor

Jean-Louis Lassez

Abstract/Description

Bioinformatics and the Internet give rise to enormous amounts of data that need to be classified automatically. Existing classification algorithms achieve high accuracy on small data sets but do not scale to larger ones. MaxSim, a newly proposed algorithm, promises scalability in theory at the expense of some accuracy. The aim here is to implement the algorithm and test this hypothesis empirically. First, data classification and two main classification algorithms, the Perceptron and the Support Vector Machine (SVM), are introduced. Next, MaxSim is compared to LIBSVM, the prize-winning implementation of the SVM, using benchmark classification data from the University of California at Irvine and the Silicon Graphics web sites. The comparison shows that MaxSim's accuracy is in fact similar to the SVM's and that MaxSim is more efficient in multi-class classification. A discussion of three implementations of the MaxSim algorithm concludes the evaluation.
