mahout-user mailing list archives

From Ted Dunning <ted.dunn...@gmail.com>
Subject Re: Naive Bayes score comparison across multiple classifiers
Date Wed, 25 May 2011 17:25:52 GMT
Sorry... didn't see that you had said that.

A thousand training examples per category is very small for NB. Try the SGD
(stochastic gradient descent) framework instead.

Also, with such a small data set you may prefer non-scalable learning
techniques, such as those available in R.
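
If you do stay with N one-vs-all classifiers, one rough way to put their
outputs on a common scale is to turn each classifier's pair of scores into a
posterior for its own category and compare those. The sketch below is only an
illustration under assumptions: it treats the two scores as log-likelihoods
where higher means more likely (flip the signs if your classifier returns
costs, where lower is better), and the class and method names are
hypothetical, not part of Mahout. Naive Bayes posteriors tend to be poorly
calibrated, so treat the result as a heuristic.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class OneVsAllNormalizer {

    /**
     * P(X | doc) from the two scores of one binary classifier.
     * ASSUMPTION: scores are log-likelihoods, higher = more likely.
     */
    public static double posterior(double logScoreX, double logScoreNonX) {
        // Logistic function of the log-odds. For very large differences
        // Math.exp saturates, so the result simply becomes 0.0 or 1.0.
        return 1.0 / (1.0 + Math.exp(logScoreNonX - logScoreX));
    }

    /**
     * Picks the category whose one-vs-all classifier is most confident.
     * Each map value holds {scoreX, scoreNonX} for that category.
     */
    public static String bestCategory(Map<String, double[]> scoresByCategory) {
        String best = null;
        double bestPosterior = Double.NEGATIVE_INFINITY;
        for (Map.Entry<String, double[]> e : scoresByCategory.entrySet()) {
            double p = posterior(e.getValue()[0], e.getValue()[1]);
            if (p > bestPosterior) {
                bestPosterior = p;
                best = e.getKey();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Made-up scores for illustration only.
        Map<String, double[]> scores = new LinkedHashMap<>();
        scores.put("sports", new double[] {-120.0, -118.5});
        scores.put("politics", new double[] {-300.0, -304.0});
        scores.put("tech", new double[] {-95.0, -95.2});
        System.out.println(bestCategory(scores));
    }
}
```

With these made-up numbers "tech" has the largest raw score, but "politics"
has the largest margin over its own Non-X score and is the one selected. Raw
scores depend on document length and on each classifier's training data,
which is why comparing them directly across classifiers is unreliable.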

On Wed, May 25, 2011 at 8:38 AM, Ted Dunning <ted.dunning@gmail.com> wrote:

> How much data do you have?
>
> On Wed, May 25, 2011 at 1:14 AM, Jyoti Gupta <jyotigupta.iitd@gmail.com> wrote:
>
>> I tried that previously, but it did not give good accuracy: I got around
>> 50% accuracy for 14 categories.
>>
>> On Wed, May 25, 2011 at 1:17 PM, Ted Dunning <ted.dunning@gmail.com>
>> wrote:
>>
>> > Why not just use the multi-class capability of the Naive Bayes
>> > categorizers?
>> >
>> > On Wed, May 25, 2011 at 12:13 AM, Jyoti Gupta <jyotigupta.iitd@gmail.com> wrote:
>> >
>> > > Hi,
>> > >
>> > > I am using the Naive Bayes classifier to classify my input into one of N
>> > > categories. I am creating N binary classifiers using a one-vs-all
>> > > approach. The size of the training document set is different for each
>> > > classifier, and the probability of each category is the same (1/N).
>> > >
>> > > Can I compare the scores across these classifiers to get a final
>> > > category as output? Or can you suggest a way to normalize them?
>> > >
>> > > Also, while testing I found that the label returned by the
>> > > ClassifierContext.classify method has a lower score than the other
>> > > label. For example, with two categories, X and Non-X:
>> > > classifier.classify(input) returns (X,score1)
>> > > and classifier.classify(input,2) returns a list [{X,score1},
>> > > {Non-X,score2}]
>> > > Here I found that score1 < score2. I did not look into the
>> > > implementation, but I thought that a greater score meant a greater
>> > > probability.
>> > >
>> > > Thanks,
>> > > Jyoti
>> > >
>> > >
>> > >
>> > > --
>> > > "Be the change you want to see"
>> > >
>> >
>>
>>
>>
>> --
>> "Be the change you want to see"
>>
>
>
