mahout-user mailing list archives

From Frank Scholten <fr...@frankscholten.nl>
Subject Re: Question on OnlineLogisticRegression.iris() test case
Date Mon, 06 Jan 2014 21:51:36 GMT
Ah, of course. Thanks, Ted!

Btw for others who are interested, the online statistical learning class at
Stanford starts in a few weeks:
https://class.stanford.edu/courses/HumanitiesScience/StatLearning/Winter2014/about


On Mon, Jan 6, 2014 at 5:37 PM, Ted Dunning <ted.dunning@gmail.com> wrote:

> This is an offset element which allows the model to have an intercept term
> in addition to terms for the predictor variables.
>
> On Mon, Jan 6, 2014 at 8:31 AM, Frank Scholten <frank@frankscholten.nl> wrote:
>
> > Hi,
> >
> > I am studying the LR / SGD code and I was wondering why, in the iris test
> > case, the first element of each vector is set to 1 via v.set(0, 1) in the
> > loop that parses the CSV file:
> >
> >     for (String line : raw.subList(1, raw.size())) {
> >       // order gets a list of indexes
> >       order.add(order.size());
> >
> >       // parse the predictor variables
> >       Vector v = new DenseVector(5);
> >       v.set(0, 1);
> >       int i = 1;
> >       Iterable<String> values = onComma.split(line);
> >       for (String value : Iterables.limit(values, 4)) {
> >         v.set(i++, Double.parseDouble(value));
> >       }
> >       data.add(v);
> >
> >       // and the target
> >       target.add(dict.intern(Iterables.get(values, 4)));
> >     }
> >
> > If I remove the line, the accuracy drops to 92%, but I don't know why this
> > is happening. Where is this first element used throughout the algorithm?
> >
> > Cheers,
> >
> > Frank
> >
>
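
For readers following the thread, Ted's point about the offset element can be illustrated with a small sketch. It is not Mahout code: the class InterceptDemo, its helper methods, the learning rate, and the toy one-dimensional data set are all assumptions made for illustration, and the training loop is plain SGD on the logistic loss rather than OnlineLogisticRegression. The idea is that a feature fixed at 1 gets its own weight, and that weight acts as the intercept. Leaving the element at 0 (the effect of removing v.set(0, 1)) forces the decision boundary through the origin.

```java
import java.util.Arrays;

// Hypothetical demo class, not part of Mahout.
final class InterceptDemo {

    // Logistic function: maps a linear score to a probability in (0, 1).
    static double sigmoid(double z) {
        return 1.0 / (1.0 + Math.exp(-z));
    }

    // Builds rows of the form {offset, x}.
    // offset = 1 mirrors v.set(0, 1); offset = 0 mirrors removing that line.
    static double[][] features(double[] xs, double offset) {
        double[][] rows = new double[xs.length][];
        for (int i = 0; i < xs.length; i++) {
            rows[i] = new double[] {offset, xs[i]};
        }
        return rows;
    }

    static double dot(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            sum += a[i] * b[i];
        }
        return sum;
    }

    // Plain stochastic gradient ascent on the log-likelihood, one example at a time.
    static double[] train(double[][] rows, int[] labels, double rate, int epochs) {
        double[] w = new double[rows[0].length];
        for (int e = 0; e < epochs; e++) {
            for (int i = 0; i < rows.length; i++) {
                double err = labels[i] - sigmoid(dot(w, rows[i]));
                for (int j = 0; j < w.length; j++) {
                    // A feature that is always 0 never moves its weight.
                    w[j] += rate * err * rows[i][j];
                }
            }
        }
        return w;
    }

    // Fraction of rows whose predicted class (score > 0 means class 1) matches the label.
    static double accuracy(double[] w, double[][] rows, int[] labels) {
        int correct = 0;
        for (int i = 0; i < rows.length; i++) {
            int predicted = dot(w, rows[i]) > 0 ? 1 : 0;
            if (predicted == labels[i]) {
                correct++;
            }
        }
        return (double) correct / rows.length;
    }

    public static void main(String[] args) {
        // Toy data (an assumption): the classes split around x = 5, not around 0.
        double[] xs = {1, 2, 3, 4, 6, 7, 8, 9};
        int[] labels = {0, 0, 0, 0, 1, 1, 1, 1};
        for (double offset : new double[] {1.0, 0.0}) {
            double[][] rows = features(xs, offset);
            double[] w = train(rows, labels, 0.01, 10000);
            System.out.println("offset=" + offset
                + " weights=" + Arrays.toString(w)
                + " accuracy=" + accuracy(w, rows, labels));
        }
    }
}
```

With the offset left at 0, the score reduces to w[1] * x, so every positive x lands on the same side of the boundary and the model cannot get more than half of this toy set right. With the offset set to 1, w[0] is free to shift the boundary away from the origin, which is the role the first vector element plays in the iris test.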
