mahout-commits mailing list archives

From conflue...@apache.org
Subject [CONF] Apache Lucene Mahout: Perceptron and Winnow (page created)
Date Mon, 03 Nov 2008 16:37:00 GMT
Perceptron and Winnow (MAHOUT) created by Isabel Drost
   http://cwiki.apache.org/confluence/display/MAHOUT/Perceptron+and+Winnow

Content:
---------------------------------------------------------------------

h1. Classification with Perceptron or Winnow

Both algorithms are comparatively simple linear classifiers. Given training data in some
n-dimensional vector space that is annotated with binary labels, the algorithms are guaranteed
to find a separating hyperplane if one exists. In contrast to the Perceptron,
Winnow works only for binary feature vectors.
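
To make the Perceptron's mistake-driven update concrete, here is a minimal sketch in plain Python. It is an illustration only, not the code from the patch: the function names, the -1/+1 label encoding, the learning rate and the epoch limit are all assumptions made for this example.

```python
def train_perceptron(examples, epochs=100, learning_rate=1.0):
    """Train on a list of (features, label) pairs, label in {-1, +1}.

    Hypothetical helper for illustration; not part of the Mahout patch.
    """
    n = len(examples[0][0])
    weights = [0.0] * n
    bias = 0.0
    for _ in range(epochs):
        mistakes = 0
        for x, y in examples:
            activation = sum(w * xi for w, xi in zip(weights, x)) + bias
            if y * activation <= 0:
                # Misclassified (or on the boundary): move the hyperplane
                # towards the correct side of this example.
                weights = [w + learning_rate * y * xi
                           for w, xi in zip(weights, x)]
                bias += learning_rate * y
                mistakes += 1
        if mistakes == 0:
            # A full pass without mistakes: the data is separated.
            break
    return weights, bias


def predict(weights, bias, x):
    """Return +1 or -1 depending on the side of the hyperplane."""
    score = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if score > 0 else -1


# Logical AND is linearly separable, so training stops once a full
# pass over the data produces no mistakes.
data = [([0.0, 0.0], -1), ([0.0, 1.0], -1), ([1.0, 0.0], -1), ([1.0, 1.0], 1)]
weights, bias = train_perceptron(data)
```

The weights change only when an example is misclassified, which is why training stays cheap even on large example sets.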

For more information on the Perceptron see for instance:
http://en.wikipedia.org/wiki/Perceptron

Concise course notes on both algorithms:
http://pages.cs.wisc.edu/~shuchi/courses/787-F07/scribe-notes/lecture24.pdf

Although the algorithms are comparatively simple, they still work quite well for text classification
and are fast to train even for huge example sets. In contrast to Naive Bayes, they are not
based on the assumption that all features (in the domain of text classification: all terms
in a document) are independent.
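
For text, a binary feature vector can simply mark which terms occur in a document, which is the setting Winnow requires. The sketch below shows one common variant of the multiplicative update (promotion and demotion by a constant factor). Whether the patch uses this variant is not stated here; the threshold of n, the factor of 2, the 0/1 label encoding and all names are assumptions made for illustration.

```python
from itertools import product


def winnow_predict(weights, threshold, x):
    """Predict 1 if the summed weight of active features reaches the threshold."""
    total = sum(w for w, xi in zip(weights, x) if xi)
    return 1 if total >= threshold else 0


def train_winnow(examples, alpha=2.0, epochs=50):
    """Train on (binary_features, label) pairs, label in {0, 1}.

    Hypothetical helper for illustration; not part of the Mahout patch.
    """
    n = len(examples[0][0])
    weights = [1.0] * n
    threshold = float(n)  # assumed choice; other thresholds are possible
    for _ in range(epochs):
        mistakes = 0
        for x, y in examples:
            if winnow_predict(weights, threshold, x) == y:
                continue
            mistakes += 1
            # Promote active features on a missed positive,
            # demote them on a false positive.
            factor = alpha if y == 1 else 1.0 / alpha
            weights = [w * factor if xi else w for w, xi in zip(weights, x)]
        if mistakes == 0:
            break
    return weights, threshold


# Toy target: the label is 1 when term 0 or term 2 is present.
data = [(list(x), 1 if (x[0] or x[2]) else 0)
        for x in product([0, 1], repeat=4)]
weights, threshold = train_winnow(data)
```

Unlike the Perceptron's additive step, the update here is multiplicative and touches only the features that are present in the misclassified example.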

h2. Strategy for parallelisation

Currently the strategy for parallelisation is simple: provided there is enough training data,
split it and train a classifier on each split. The resulting hyperplanes are
then averaged.
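
The split, train and average steps can be sketched as follows. This is a sequential stand-in written under assumptions: the helper names, the round-robin split and the toy data are hypothetical, and each per-split training call is independent, so it could run as a separate parallel task. Note that an averaged hyperplane is not guaranteed to separate the full training set, even if every per-split hyperplane separates its own split.

```python
def train_perceptron(examples, epochs=100):
    """Plain Perceptron on (features, label) pairs, label in {-1, +1}."""
    weights = [0.0] * len(examples[0][0])
    bias = 0.0
    for _ in range(epochs):
        mistakes = 0
        for x, y in examples:
            score = sum(w * xi for w, xi in zip(weights, x)) + bias
            if y * score <= 0:
                weights = [w + y * xi for w, xi in zip(weights, x)]
                bias += y
                mistakes += 1
        if mistakes == 0:
            break
    return weights, bias


def average_hyperplanes(models):
    """Average weight vectors and biases component-wise."""
    k = len(models)
    dims = len(models[0][0])
    avg_weights = [sum(m[0][i] for m in models) / k for i in range(dims)]
    avg_bias = sum(m[1] for m in models) / k
    return avg_weights, avg_bias


def train_on_splits(examples, num_splits):
    # Round-robin split; each split is trained independently, so this
    # loop is the part that could be distributed.
    splits = [examples[i::num_splits] for i in range(num_splits)]
    return average_hyperplanes([train_perceptron(s) for s in splits])


# Toy data: two well-separated clusters, interleaved so that every
# split contains examples of both classes.
positives = [(2, 2), (3, 2), (2, 3), (3, 3), (4, 3), (3, 4)]
data = []
for a, b in positives:
    data.append(([float(a), float(b)], 1))
    data.append(([float(-a), float(-b)], -1))

weights, bias = train_on_splits(data, num_splits=3)
```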

h2. Roadmap

Currently the patch only contains the code for the classifier itself. It is planned to provide
unit tests and at least one example based on the WebKB dataset by the end of November for
the serial version. After that the parallelisation will be added.

---------------------------------------------------------------------