mahout-user mailing list archives

From Ted Dunning <ted.dunn...@gmail.com>
Subject Re: Querry regarding use of classifier in Mahout
Date Mon, 18 Oct 2010 18:41:19 GMT
Remember, that result is on the training data!

Naive Bayes classifiers have the property that they overfit massively but
still give good results on held-out data.  Thus, when tested on the same
data that they were trained on, they show results that are unrealistically
good.

This is still an important thing to look at.  It just isn't really an error
rate 200 times lower than any other result ever reported on this dataset.
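
To make the point concrete, here is a minimal sketch of scoring the same
model on its training data and on held-out data.  It is not Mahout's
implementation: the tiny multinomial Naive Bayes, the function names
(train_nb, predict, accuracy) and the toy documents are all made up for
illustration.  Each training document carries a unique token (doc1, doc2,
...) that the model can memorize, which is roughly what happens with rare
words in real text.

```python
import math
from collections import Counter, defaultdict


def train_nb(docs):
    """Count documents per class and tokens per class (hypothetical helper)."""
    class_docs = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in docs:
        class_docs[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_docs, word_counts, vocab


def predict(model, tokens):
    """Return the class with the highest log posterior, add-one smoothed."""
    class_docs, word_counts, vocab = model
    total_docs = sum(class_docs.values())
    best_label, best_score = None, -math.inf
    for label in sorted(class_docs):
        total_words = sum(word_counts[label].values())
        score = math.log(class_docs[label] / total_docs)
        for tok in tokens:
            if tok not in vocab:
                continue  # tokens never seen in training carry no evidence
            score += math.log(
                (word_counts[label][tok] + 1) / (total_words + len(vocab))
            )
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def accuracy(model, docs):
    """Fraction of documents whose predicted label matches the true label."""
    correct = sum(predict(model, tokens) == label for tokens, label in docs)
    return correct / len(docs)


# Toy data (an assumption, not the dataset from this thread).
train_docs = [
    ("good great doc1".split(), "pos"),
    ("good fine doc2".split(), "pos"),
    ("bad fine doc3".split(), "pos"),   # noisy: a "pos" review saying "bad"
    ("bad awful doc4".split(), "neg"),
    ("bad poor doc5".split(), "neg"),
    ("good poor doc6".split(), "neg"),  # noisy: a "neg" review saying "good"
]
heldout_docs = [
    ("good great doc7".split(), "pos"),
    ("bad awful doc8".split(), "neg"),
    ("bad doc9".split(), "pos"),        # misleading word, no token to memorize
    ("poor awful doc10".split(), "neg"),
]

model = train_nb(train_docs)
train_acc = accuracy(model, train_docs)      # scored on the data it saw
heldout_acc = accuracy(model, heldout_docs)  # scored on unseen documents

print(f"training accuracy: {train_acc:.2%}")
print(f"held-out accuracy: {heldout_acc:.2%}")
```

On this toy data the model gets every training document right, including
the two noisy ones, because the unique tokens tip the score.  On the
held-out documents it still does reasonably well but misses the misleading
one.  The held-out number is the one to compare against published results.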

On Mon, Oct 18, 2010 at 11:26 AM, JAGANADH G <jaganadhg@gmail.com> wrote:

> >> > Correctly Classified Instances          :       1995     99.75%
> >> > Incorrectly Classified Instances        :          5      0.25%
> >> > Total Classified Instances              :       2000
> >> >
> >> > =======================================================
> >> > Confusion Matrix
> >> > -------------------------------------------------------
> >> > a     b     <--Classified as
> >> > 995   5     |  1000   a     = pos
> >> > 0     1000  |  1000   b     = neg
> >> > Default Category: unknown: 2
> >> >
> >> >
> >> > With some pruning, you will have a decent enough classifier for sentiments
>
>
> Wow this is an amazing result :-)
