mahout-user mailing list archives

From "Baker, Tristan" <>
Subject Re: Exploring the potential of a Mahout classification system
Date Wed, 01 Jun 2011 23:43:46 GMT

Thanks for the insight.

In my musings on how the user experience would be designed around an
intelligent classification system, I am thinking that some mechanism that
would allow a user to override the system's decision would help compensate
for any mis-classification.  For this reason, something that classified
correctly more than 50% of the time would likely be fine.

Also, I just pre-ordered "Mahout in Action" through Amazon yesterday but
it doesn't look like it will ship until late July.  How can I read Chapter
16 in the meantime? Is it already available somewhere else?


On 6/1/11 4:27 PM, "Ted Dunning" <> wrote:

>I mean that in two senses.
>Mahout can probably be used to build an acceptable classifier.
>And that classifier will give you scored outputs that represent the
>classifier's best guess at the answer.
>I would be willing to bet (up to 25 cents) that the top few categories
>will be correct.  I would guess, but would not bet, that the top answer will be
>correct more than half the time.
>With good interface design this should be very usable.  The classifier
>should easily run fast enough to allow you to do a live categorization of
>topics as the person types.  That might be bad, but it would be pretty
>trippy to watch.
>Chapter 16 of Mahout in Action describes how to build a
>training/classification/web service pipeline for exactly this sort of
>problem.  The source code is available freely under Apache license to purchasers of
>the book.  (conflict of interest here ... I may someday get royalties from
>that book)
>On Wed, Jun 1, 2011 at 3:11 PM, Baker, Tristan <> wrote:
>> Can Mahout's classification system be used to classify a customer
>> problem statement in real time?  I am imagining a system that would
>> periodically train a classifier in an offline fashion and then leverage
>> the classification index to provide real-time classifications of customer
>> problem statements as they are received.  Is this possible?
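
The two ideas in this thread (show the top few scored categories, and let
the user override the system's pick) can be sketched roughly as below. This
is only an illustration, not Mahout code: the function names
top_categories and final_category, the category labels, and the scores are
all made up, and a real system would take its scores from the trained
classifier.

```python
from typing import Dict, List, Optional, Tuple


def top_categories(scores: Dict[str, float], k: int = 3) -> List[Tuple[str, float]]:
    """Return the k highest-scoring (category, score) pairs, best first."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return ranked[:k]


def final_category(scores: Dict[str, float], user_choice: Optional[str] = None) -> str:
    """Prefer the user's override when given, else the top-scored category."""
    if user_choice is not None:
        return user_choice
    if not scores:
        raise ValueError("no scores to choose from")
    return top_categories(scores, k=1)[0][0]


if __name__ == "__main__":
    # Hypothetical classifier output for one customer problem statement.
    scores = {"billing": 0.55, "login": 0.30, "shipping": 0.10, "other": 0.05}
    print(top_categories(scores))                       # suggestions to display
    print(final_category(scores))                       # system's best guess
    print(final_category(scores, user_choice="login"))  # user overrides it
```

Showing the short ranked list rather than a single answer means a top
answer that is right only a little more than half the time can still be
usable, since the user can pick the correct category from the list.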
