mahout-user mailing list archives

From "Baker, Tristan" <Tristan_Ba...@intuit.com>
Subject Re: Exploring the potential of a Mahout classification system
Date Wed, 01 Jun 2011 23:43:46 GMT
Ted,

Thanks for the insight.

In thinking about how the user experience would be designed around an
intelligent classification system, I expect that a mechanism allowing the
user to override the system's decision would help compensate for any
misclassification.  For this reason, a classifier whose top answer is
correct more than 50% of the time would likely be fine.
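
To make the override idea concrete, here is a rough sketch in plain Python.
It does not use any Mahout API: the function names, category labels, and
scores are all hypothetical, and it simply assumes the classifier returns a
score per category.

```python
from typing import Dict, List, Optional, Tuple


def top_categories(scores: Dict[str, float], k: int = 3) -> List[Tuple[str, float]]:
    """Return the k highest-scoring (category, score) pairs, best first."""
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)[:k]


def final_category(scores: Dict[str, float], user_choice: Optional[str] = None) -> str:
    """Use the user's pick when one is given, else the classifier's top guess."""
    if user_choice is not None:
        return user_choice
    return top_categories(scores, k=1)[0][0]


# Hypothetical scored output for a single problem statement.
scores = {"billing": 0.55, "login": 0.30, "installation": 0.10, "other": 0.05}

suggestions = top_categories(scores)            # shown to the user as a short list
accepted = final_category(scores)               # user accepts the top suggestion
overridden = final_category(scores, "login")    # user overrides the top suggestion
```

The point is only that the interface shows a short ranked list and the
user's selection always wins over the system's guess.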

Also, I just pre-ordered "Mahout in Action" through Amazon yesterday but
it doesn't look like it will ship until late July.  How can I read Chapter
16 in the meantime? Is it already available somewhere else?

Thanks,
Tristan

On 6/1/11 4:27 PM, "Ted Dunning" <ted.dunning@gmail.com> wrote:

>Probably.
>
>I mean that in two senses.
>
>Mahout can probably be used to build an acceptable classifier.
>
>And that classifier will give you scored outputs that represent the
>classifier's best guess at the answer.
>
>I would be willing to bet (up to 25 cents) that the top few categories
>will include the correct one.  I would guess, but would not bet, that the
>top answer will be correct more than half the time.
>
>With good interface design this should be very usable.  The classifier
>should easily run fast enough to allow you to do a live categorization of
>topics as the person types.  That might be bad, but it would be pretty
>trippy to watch.
>
>Chapter 16 of Mahout in Action describes how to build a
>training/classification/web service pipeline for exactly this sort of
>thing.  The source code is available freely under the Apache license to
>purchasers of the book.  (Conflict of interest here ... I may someday get
>royalties from that book.)
>
>
>On Wed, Jun 1, 2011 at 3:11 PM, Baker, Tristan
><Tristan_Baker@intuit.com> wrote:
>
>> Can Mahout's classification system be used to classify a
>> customer-authored problem statement in real time?  I am imagining a
>> system that would periodically train a classifier in an offline fashion
>> and then leverage that classification index to provide real-time
>> classifications of customer problem statements as they are received.
>> Is this possible?

