mahout-user mailing list archives

From Ted Dunning
Subject Re: Twitter Classification
Date Thu, 21 Jan 2010 23:44:11 GMT

If you want 1 of n categories, then train a single n-way classifier.

If you want k of n, train n binary classifiers or hack the 1 of n classifier
slightly to remove the soft-max function.
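
To make the 1 of n versus k of n distinction concrete, here is a minimal sketch in plain Python. It is not Mahout code: the function names, the example scores, and the 0.5 threshold are all assumptions for illustration, and scoring each category with an independent logistic function is only one reading of "remove the soft-max".

```python
import math

def softmax(scores):
    """Normalize raw scores into probabilities that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def sigmoid(score):
    """Squash one raw score into (0, 1), independent of the other scores."""
    return 1.0 / (1.0 + math.exp(-score))

def one_of_n(scores):
    """1 of n: categories compete through soft-max; return the single winner."""
    probs = softmax(scores)
    return max(range(len(probs)), key=probs.__getitem__)

def k_of_n(scores, threshold=0.5):
    """k of n: no soft-max; each category is accepted or rejected on its own.

    The threshold is a hypothetical choice, not something from the thread.
    """
    return [i for i, s in enumerate(scores) if sigmoid(s) >= threshold]

# Hypothetical raw scores for three categories.
scores = [2.0, 1.5, -3.0]
single_label = one_of_n(scores)
multi_label = k_of_n(scores)
```

With soft-max, the two strong categories compete and only one can win. Without it, both can be assigned at once, which is the same effect as running n independent binary classifiers.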

If there is only a relatively small number of category combinations, treat
each combination as its own class and train on it as a 1 of n target.
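
A rough sketch of the combination-as-target idea, again in plain Python with made-up label names and a hypothetical helper: each distinct set of categories seen in training gets its own class id, and any ordinary 1 of n classifier can then be trained on those ids.

```python
def build_combination_index(label_sets):
    """Assign one class id to each distinct combination of categories.

    Ids are handed out in order of first appearance.
    """
    index = {}
    for labels in label_sets:
        key = frozenset(labels)  # order-insensitive, hashable
        if key not in index:
            index[key] = len(index)
    return index

# Hypothetical training labels: each item may carry several categories.
training_labels = [
    {"sports"},
    {"sports", "politics"},
    {"tech"},
    {"sports"},
]

combo_to_class = build_combination_index(training_labels)

# 1 of n targets to feed to a single n-way classifier.
targets = [combo_to_class[frozenset(labels)] for labels in training_labels]

# Reverse map: turn a predicted class id back into its set of categories.
class_to_combo = {cls: set(combo) for combo, cls in combo_to_class.items()}
```

This only stays manageable while the number of distinct combinations is small, since n categories allow up to 2^n combinations, and a combination never seen in training can never be predicted.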

There are some fancier algorithms that make use of the categorical
combinations, but we don't need to worry about those to start.

On Thu, Jan 21, 2010 at 2:55 PM, Jason Rutherglen wrote:

> > The SGD and Pegasos classifiers would be ideal for this.
> For multiple categories? From a user perspective, classifying
> into multiple categories would be real sweet because it'd save
> time and be better than how CL behaves today (eg, uni-category).
> Also Solr/Lucene easily supp

Ted Dunning, CTO
