Sorry... I didn't see that you had already said that.

A thousand training examples per category is very small for Naive Bayes. Try
the SGD framework instead.

Also, with so little data you may prefer non-scalable learning techniques,
such as those available in R.
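
To make the SGD suggestion concrete, here is a minimal sketch of the idea: a
single multiclass logistic regression model trained one example at a time.
This is plain Python, not Mahout's API. The class name SoftmaxSGD, the demo()
helper, the toy documents and the learning-rate and regularization values are
all made up for illustration. Because one model scores all categories
together and passes the scores through a softmax, the outputs are
probabilities that sum to 1, so they can be compared directly, unlike scores
from N separately trained binary classifiers.

```python
import math
import random
from collections import defaultdict


def softmax(scores):
    """Turn raw per-class scores into probabilities that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


class SoftmaxSGD:
    """Hypothetical multiclass logistic regression trained by SGD."""

    def __init__(self, num_classes, learning_rate=0.5, l2=1e-4):
        self.num_classes = num_classes
        self.learning_rate = learning_rate  # assumed value, tune on real data
        self.l2 = l2                        # assumed value, tune on real data
        # One sparse weight vector per class, keyed by token.
        self.weights = [defaultdict(float) for _ in range(num_classes)]

    def predict_proba(self, tokens):
        # Bag-of-words score: sum of the weights of the tokens present.
        scores = [sum(w.get(t, 0.0) for t in tokens) for w in self.weights]
        return softmax(scores)

    def train(self, tokens, label):
        """One SGD step on a single (tokens, label) example."""
        probs = self.predict_proba(tokens)
        for k, w in enumerate(self.weights):
            # Gradient of the log loss with respect to the class-k score.
            error = probs[k] - (1.0 if k == label else 0.0)
            for t in tokens:
                w[t] -= self.learning_rate * (error + self.l2 * w[t])

    def classify(self, tokens):
        """Return (best class index, its probability)."""
        probs = self.predict_proba(tokens)
        best = max(range(self.num_classes), key=probs.__getitem__)
        return best, probs[best]


def demo():
    """Train on a tiny made-up corpus with three categories."""
    random.seed(0)
    data = [
        ("goal match striker league".split(), 0),
        ("match referee penalty goal".split(), 0),
        ("election senate vote policy".split(), 1),
        ("vote minister policy debate".split(), 1),
        ("cpu compiler kernel memory".split(), 2),
        ("kernel memory cache compiler".split(), 2),
    ]
    model = SoftmaxSGD(num_classes=3)
    for _ in range(50):  # several passes, since the data set is tiny
        random.shuffle(data)
        for tokens, label in data:
            model.train(tokens, label)
    return model
```

Calling demo().classify("goal penalty".split()) returns the winning category
index together with its probability. With real data you would hold out a
test set and tune the learning rate and regularization rather than trust the
values above.
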
On Wed, May 25, 2011 at 8:38 AM, Ted Dunning <ted.dunning@gmail.com> wrote:
> How much data do you have?
>
> On Wed, May 25, 2011 at 1:14 AM, Jyoti Gupta <jyotigupta.iitd@gmail.com> wrote:
>
>> I have tried that previously, but it did not give good accuracy. I got
>> around 50% accuracy for 14 categories.
>>
>> On Wed, May 25, 2011 at 1:17 PM, Ted Dunning <ted.dunning@gmail.com> wrote:
>>
>> > Why not just use the multiclass capability of the Naive Bayes
>> > categorizers?
>> >
>> > On Wed, May 25, 2011 at 12:13 AM, Jyoti Gupta <jyotigupta.iitd@gmail.com> wrote:
>> >
>> > > Hi,
>> > >
>> > > I am using the NaiveBayes classifier to classify my input into one of N
>> > > categories. I am creating N binary classifiers using the one-vs-all
>> > > approach. The training document size is different for each classifier,
>> > > and the probability of each category is the same (1/N).
>> > >
>> > > Can I compare these scores across these classifiers to get a final
>> > > category as output? Or can you suggest any way to normalize them?
>> > >
>> > > Also, while testing I found that the label returned by the
>> > > ClassifierContext.classify method has a lower score than the other
>> > > label. For example, with two categories, X and NonX:
>> > >   classifier.classify(input) returns (X, score1)
>> > >   classifier.classify(input, 2) returns the list [{X, score1}, {NonX, score2}]
>> > > Here I found that score1 < score2. I did not look into the
>> > > implementation, but I assumed that a greater score means a greater
>> > > probability.
>> > >
>> > > Thanks,
>> > > Jyoti
>> > >
>> > >
>> > >
>> > > 
>> > > "Be the change you want to see"
>> > >
>> >
>>
