mahout-dev mailing list archives

From Chandler Burgess <cburg...@icontrolesi.com>
Subject RE: MAHOUT-1369 - Why does theta normalization for naive bayes classification commented out?
Date Fri, 28 Mar 2014 21:09:23 GMT
Ok, then I should remove it? There's about 2 dozen lines of code in TestNaiveBayesDriver for
running sequentially.

-----Original Message-----
From: Suneel Marthi [mailto:suneel_marthi@yahoo.com] 
Sent: Friday, March 28, 2014 3:51 PM
To: dev@mahout.apache.org
Subject: Re: MAHOUT-1369 - Why does theta normalization for naive bayes classification commented out?

Bayes doesn't have a non-MapReduce impl, so the -seq flag wouldn't work.

Sent from my iPhone

> On Mar 28, 2014, at 4:16 PM, Chandler Burgess <cburgess@icontrolesi.com> wrote:
> 
> Well, maybe someone can correct me but this seems disappointing. I uncommented the code
> in NaiveBayesModel, BayesUtil and TrainNaiveBayesJob, added some trace statements in
> ComplementaryThetaMapper and ComplementaryNaiveBayesClassifier to verify they were being
> called, and then ran some tests using trainnb/testnb. There was not a single difference
> in the classifications when train/testcomplementary was specified vs standard naïve bayes.
> 
> Also, running testnb with the -seq flag doesn't appear to work.
> 
> -----Original Message-----
> From: Chandler Burgess [mailto:cburgess@icontrolesi.com]
> Sent: Thursday, March 27, 2014 5:17 PM
> To: dev@mahout.apache.org
> Subject: RE: MAHOUT-1369 - Why does theta normalization for naive bayes classification commented out?
> 
> The program I wrote didn't use a model that was trained with Cbayes. After looking at
> the scorers in SNB and CNB, I figured they would give different results even on a model
> not trained with CNB. That could very well be ignorance on my part as to the math.
> 
> However, I did some command line tests using -c on both training and testing and didn't
> see any difference in the testnb output.
> ________________________________________
> From: Suneel Marthi <suneel_marthi@yahoo.com>
> Sent: Thursday, March 27, 2014 5:12 PM
> To: dev@mahout.apache.org
> Cc: ssc@apache.org
> Subject: Re: MAHOUT-1369 - Why does theta normalization for naive bayes classification commented out?
> 
> Just checking, you are testing Cbayes on a model that's already been trained using
> Cbayes, correct?
> 
> Also, the JIRA I mentioned earlier was fixed for 0.9, so you should be
> good. No code changes have been made to naive bayes since 0.9.
> 
> 
> Sent from my iPhone
> 
>> On Mar 27, 2014, at 6:01 PM, Chandler Burgess <cburgess@icontrolesi.com> wrote:
>> 
>> Ok, I'll uncomment those lines and see. I also have plenty of test data available
>> (I'm doing document classification with unbalanced classes), so I'll see if it
>> improves there as well.
>> 
>> Also, I'll try to make some time in the next week and go over the algorithm in detail
>> compared with the paper as an extra check.
>> 
>> Thanks,
>> Chandler
>> ________________________________________
>> From: Sebastian Schelter <ssc@apache.org>
>> Sent: Thursday, March 27, 2014 4:01 PM
>> To: dev@mahout.apache.org
>> Subject: Re: MAHOUT-1369 - Why does theta normalization for naive bayes classification commented out?
>> 
>> Hi Chandler,
>> 
>> I think a good way to go would be to reenable theta normalization and 
>> run the classification examples that we already have to see how it 
>> affects the result (and make sure it improves the result).
>> 
>> Would be great to have this fixed. I'm also planning to port NB to 
>> our Spark DSL very soon (should be just a few lines of code).
>> 
>> --sebastian
>> 
>> 
>>> On 03/27/2014 09:07 PM, Suneel Marthi wrote:
>>> Which Mahout version are you running? While it's true that ThetaNormalizer is still
>>> disabled today, MAHOUT-1389 fixes a bug wherein Complementary NB wasn't being called
>>> when invoked.
>>> 
>>> Please test with Mahout 0.9 or trunk.
>>> 
>>> 
>>> 
>>> 
>>> On Thursday, March 27, 2014 3:53 PM, Chandler Burgess <cburgess@icontrolesi.com> wrote:
>>> 
>>> Hello all,
>>> 
>>> It seems Robin Anil hasn't responded, and no one is sure of the status on this.
>>> What needs to be done on this, and/or what can I do to help? I'm no ML expert, but I
>>> do have the paper and should be able to verify/fix the implementation. I'm REALLY
>>> interested in using the CNB classifier, since it seems well suited to the problem I'm
>>> trying to tackle, before I give up and use something else.
>>> 
>>> I've done tests and see no difference when -c is passed on the command line for
>>> training or testing. I also wrote a program to print the scores using
>>> StandardNaiveBayesClassifier and ComplementaryNaiveBayesClassifier in a binary
>>> classification problem and see no difference between the scores, so it seems
>>> complementary naïve bayes is completely disabled.
>>> 
>>> Thanks,
>>> Chandler Burgess
>> 
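
Editorial aside: the thread discusses standard vs. complementary naive Bayes and
"theta normalization" without writing the math down. The following is only a rough
sketch, assuming the paper referred to above is the complement naive Bayes formulation
usually credited to Rennie et al. (2003). It is not Mahout's code: every function name
here is hypothetical, the term counts are made up, and class priors are left out.

```python
import math

def standard_weights(counts, alpha=1.0):
    """counts[c][i] = total count of term i over training docs of class c.
    Returns log of the smoothed per-class term probabilities."""
    vocab = len(counts[0])
    weights = []
    for row in counts:
        total = sum(row)
        weights.append([math.log((n + alpha) / (total + alpha * vocab)) for n in row])
    return weights

def complement_weights(counts, alpha=1.0):
    """Same estimate, but built from the counts of every class EXCEPT c."""
    vocab = len(counts[0])
    col_totals = [sum(row[i] for row in counts) for i in range(vocab)]
    grand_total = sum(col_totals)
    weights = []
    for row in counts:
        comp_total = grand_total - sum(row)
        weights.append([
            math.log((col_totals[i] - row[i] + alpha) / (comp_total + alpha * vocab))
            for i in range(vocab)
        ])
    return weights

def normalize(weights):
    """Weight normalization (assumed meaning of 'theta normalization'):
    divide each class's weight vector by its L1 norm."""
    out = []
    for row in weights:
        norm = sum(abs(w) for w in row)
        out.append([w / norm for w in row])
    return out

def score(weights, doc):
    """Dot product of the document's term frequencies with each class's weights."""
    return [sum(f * w for f, w in zip(doc, row)) for row in weights]

def classify_standard(weights, doc):
    s = score(weights, doc)
    return max(range(len(s)), key=s.__getitem__)   # highest own-class score wins

def classify_complement(weights, doc):
    s = score(weights, doc)
    return min(range(len(s)), key=s.__getitem__)   # lowest complement score wins

# Made-up, unbalanced term counts: 3 classes x 3 terms.
counts = [[90, 5, 5], [2, 6, 2], [1, 2, 7]]
snb = standard_weights(counts)
cnb = complement_weights(counts)
cnb_normalized = normalize(cnb)
doc = [5, 0, 0]
print(classify_standard(snb, doc), classify_complement(cnb, doc))
```

Note that in this formulation, with exactly two classes and no priors, the
unnormalized complement rule picks the same class as the standard rule, because each
class's complement is simply the other class. The raw scores still differ, and
normalization can change the ranking, so identical scores from both classifiers would
not be expected under these assumptions.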