mahout-dev mailing list archives

From "Sean Owen (Commented) (JIRA)" <>
Subject [jira] [Commented] (MAHOUT-826) Bayes/CBayes classification on a non-existing feature
Date Tue, 10 Jan 2012 09:20:38 GMT


Sean Owen commented on MAHOUT-826:

I have no idea, was just trying to pitch in. This is really Robin's to answer. Is Bayes even
supported anymore? I haven't heard much of anything on it.
> Bayes/CBayes classification on a non-existing feature
> -----------------------------------------------------
>                 Key: MAHOUT-826
>                 URL:
>             Project: Mahout
>          Issue Type: Bug
>          Components: Classification
>    Affects Versions: 0.5
>            Reporter: Andre-Philippe Paquet
>            Assignee: Robin Anil
>            Priority: Minor
>             Fix For: 0.6
>         Attachments: mahout-826.patch, mahout-826.patch
> (see
> Using CBayes or Bayes, when trying to classify a feature/word that doesn't exist in the
> model, the algorithm returns all labels with a constant score (e.g. 12.386649147018964)
> instead of returning the default/unknown label. After a quick look in CBayesAlgorithm, I
> found the problem in the featureWeight function, which returns the theta-normalized weight
> even if the feature didn't have any match (result=0).
> As a fix, I overrode the function in a subclass so that it returns 0 if the weight of the
> current feature in the current label is 0.
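
For readers following along, here is a minimal, self-contained sketch of the kind of guard
the report describes. It does not use Mahout's classes or its actual weight formula: the
class FeatureWeightSketch, its methods, and the add-one smoothing are all hypothetical
stand-ins chosen only to show why an unseen feature can end up with the same score
everywhere, and how returning 0 for it avoids that.

```java
import java.util.Map;

/**
 * Toy model of one label's feature weights. Illustration only; not Mahout code.
 * All names and the smoothing formula are assumptions of this sketch.
 */
final class FeatureWeightSketch {
    private final Map<String, Double> rawWeights; // feature -> weight seen in training
    private final double labelSum;                // sum of all raw weights for this label
    private final double thetaNormalizer;         // per-label normalizer (assumed non-zero)

    FeatureWeightSketch(Map<String, Double> rawWeights, double thetaNormalizer) {
        this.rawWeights = rawWeights;
        this.thetaNormalizer = thetaNormalizer;
        double sum = 0.0;
        for (double w : rawWeights.values()) {
            sum += w;
        }
        this.labelSum = sum;
    }

    /** Raw weight of the feature in this label, or 0 if the model never saw it. */
    double rawWeight(String feature) {
        return rawWeights.getOrDefault(feature, 0.0);
    }

    /** Smoothed, normalized weight with no check for unseen features. */
    double featureWeightUnguarded(String feature) {
        double result = rawWeight(feature);
        // Add-one smoothing (an assumption of this sketch) keeps the log finite,
        // so every unseen feature gets the same value, which is generally non-zero.
        double smoothed = Math.log((result + 1.0) / (labelSum + rawWeights.size()));
        return smoothed / thetaNormalizer;
    }

    /** Same computation, but an unseen feature contributes nothing. */
    double featureWeightGuarded(String feature) {
        if (rawWeight(feature) == 0.0) {
            return 0.0;
        }
        return featureWeightUnguarded(feature);
    }
}
```

With the guarded variant, a document made up only of unknown words scores 0 for every
label, which a caller can then map to the default/unknown label rather than reporting all
labels with one constant score.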
