mahout-user mailing list archives

From Ted Dunning <>
Subject Re: Exploring the potential of a Mahout classification system
Date Wed, 01 Jun 2011 23:27:37 GMT

I mean that in two senses.

Mahout can probably be used to build an acceptable classifier.

And that classifier will give you scored outputs that represent the
classifier's best guess at the answer.

I would be willing to bet (up to 25 cents) that the correct category will
be among the top few.  I would guess, but would not bet, that the top answer
will be correct more than half the time.
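
To make "scored outputs" and "top few categories" concrete, here is a
minimal sketch in plain Python.  It is not Mahout's API.  It assumes the
classifier hands back one score per category, and the category names and
numbers are made up for illustration.

```python
import heapq

def top_categories(scores, k=3):
    """Return the k highest-scoring (category, score) pairs, best first."""
    return heapq.nlargest(k, scores.items(), key=lambda item: item[1])

# Hypothetical scores for a single problem statement (illustrative only).
scores = {"billing": 0.08, "login": 0.61, "network": 0.22, "hardware": 0.09}

# Show the user the top few guesses rather than only the single best one.
top = top_categories(scores, k=3)
for category, score in top:
    print(f"{category}: {score:.2f}")
```

Showing the top few guesses instead of only the first lets the interface
recover when the best guess is wrong.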

With good interface design this should be very usable.  The classifier
should easily run fast enough to allow you to do a live categorization of
topics as the person types.  That might be bad, but it would be pretty
trippy to watch.
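
A rough sketch of what categorizing as the person types could look like,
again in plain Python rather than Mahout.  The word-weight table is a
hypothetical stand-in for a model trained offline, and the choice to
re-score at each word boundary is an assumption, not something the book or
Mahout prescribes.

```python
import re

# Hypothetical stand-in for a model trained offline: per-category word weights.
WEIGHTS = {
    "login": {"password": 2.0, "login": 2.0, "reset": 1.0},
    "billing": {"invoice": 2.0, "charge": 2.0, "refund": 1.5},
}

def classify(text):
    """Score every category for the text and return them ranked, best first."""
    tokens = re.findall(r"[a-z]+", text.lower())
    scores = {
        category: sum(weights.get(token, 0.0) for token in tokens)
        for category, weights in WEIGHTS.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

def live_classify(keystrokes):
    """Yield (text_so_far, ranking) at each word boundary and once at the end."""
    typed = ""
    for ch in keystrokes:
        typed += ch
        if ch.isspace():
            yield typed, classify(typed)
    yield typed, classify(typed)

# Simulate a customer typing a short problem statement.
for text_so_far, ranking in live_classify("reset my password"):
    print(repr(text_so_far), ranking[0])
```

Re-scoring only at word boundaries keeps the display from flickering on
every keystroke, which may take care of the "that might be bad" part.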

Chapter 16 of Mahout in Action describes how to build a
training/classification/web service pipeline for exactly this sort of thing.
The source code is freely available under the Apache license to purchasers
of the book.  (Conflict of interest here: I may someday get royalties from
that book.)

On Wed, Jun 1, 2011 at 3:11 PM, Baker, Tristan <> wrote:

> Can Mahout's classification system be used to classify a customer-authored
> problem statement in real time?  I am imagining a system that would
> periodically train a classifier in an offline fashion and then leverage that
> classification index to provide real-time classifications of customer
> problem statements as they are received.  Is this possible?
