mahout-dev mailing list archives

From Robin Anil <robin.a...@gmail.com>
Subject Re: Bayes, NaiveBayes, SplitBayesInput and other questions
Date Sun, 11 Sep 2011 15:53:11 GMT
On Sun, Sep 11, 2011 at 9:16 PM, Sebastian Schelter <ssc@apache.org> wrote:

> On 11.09.2011 17:41, Grant Ingersoll wrote:
> >
> > On Sep 11, 2011, at 10:35 AM, Sebastian Schelter wrote:
> >
> >> On 11.09.2011 16:19, Grant Ingersoll wrote:
> >>
> >>> For instance, how do the labels get associated with the training
> >>> examples?  I see the --labels option, but it isn't clear how it relates to
> >>> the training data.
> >>
> >> The training data must already be labeled; it consists of
> >> <Text,VectorWritable> tuples that represent labeled vectors. The
> >> --labels option specifies which labels (and thereby which parts of the
> >> training data) to use.
> >>
> >
> > So, it's just used as a filter?
>
> Seems so to me.
>
Yes, it's a filter on top of the data. I usually run into cases where I want
to build a model with just two classes from the whole set.
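
To make the filtering idea concrete, here is a minimal sketch in plain Python. It is not Mahout code: the (label, vector) tuples only stand in for the <Text,VectorWritable> pairs mentioned above, and the function name filter_by_labels and the sample labels are hypothetical, chosen for illustration.

```python
from typing import Iterable, Iterator, Sequence, Set, Tuple

# Stand-in for a <Text,VectorWritable> pair: (label, feature vector).
LabeledVector = Tuple[str, Sequence[float]]


def filter_by_labels(
    examples: Iterable[LabeledVector], labels: Set[str]
) -> Iterator[LabeledVector]:
    """Yield only the examples whose label is in the requested set."""
    for label, vector in examples:
        if label in labels:
            yield label, vector


# Hypothetical training set with three classes.
training_data = [
    ("sports", [1.0, 0.0, 2.0]),
    ("politics", [0.0, 3.0, 1.0]),
    ("tech", [2.0, 1.0, 0.0]),
    ("sports", [0.0, 0.0, 1.0]),
]

# Keep two classes from the whole set, as in the use case described above.
two_class = list(filter_by_labels(training_data, {"sports", "tech"}))
```

The data is already labeled before the filter runs; the label set only decides which examples reach the trainer.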

>  >
> >>
> >> Both naive bayes implementations are based on the same paper, with the
> >> old one still including the text-specific preprocessing.
> >>
> >> --sebastian