mahout-user mailing list archives

From Ted Dunning <ted.dunn...@gmail.com>
Subject Re: Question on OnlineLogisticRegression.iris() test case
Date Mon, 06 Jan 2014 16:37:16 GMT
That constant 1 is an offset (bias) element. Its weight gives the model an
intercept term in addition to the terms for the predictor variables.
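
To make the role of the offset concrete, here is a minimal sketch in plain
Java. It is not Mahout's API: the class InterceptSketch and its methods are
hypothetical names, and it uses a simple binary logistic model rather than
the multi-class model in the iris test. It shows that the constant slot is
treated like any other feature, so its weight ends up acting as the intercept.

```java
import java.util.Arrays;

// Hypothetical illustration only; none of these names come from Mahout.
final class InterceptSketch {

    // Copy the predictors into slots 1..n and put a constant 1.0 in slot 0,
    // mirroring v.set(0, 1) in the quoted parsing loop.
    static double[] withOffset(double[] predictors) {
        double[] v = new double[predictors.length + 1];
        v[0] = 1.0;
        System.arraycopy(predictors, 0, v, 1, predictors.length);
        return v;
    }

    // Linear score w . x; since x[0] == 1, w[0] enters the sum as the intercept.
    static double score(double[] w, double[] x) {
        double s = 0.0;
        for (int i = 0; i < w.length; i++) {
            s += w[i] * x[i];
        }
        return s;
    }

    static double sigmoid(double z) {
        return 1.0 / (1.0 + Math.exp(-z));
    }

    // One plain SGD step for binary logistic regression (label y is 0 or 1).
    // The offset slot gets no special handling: its gradient is (y - p) * 1.
    static void sgdStep(double[] w, double[] x, int y, double rate) {
        double p = sigmoid(score(w, x));
        for (int i = 0; i < w.length; i++) {
            w[i] += rate * (y - p) * x[i];
        }
    }

    public static void main(String[] args) {
        double[] w = new double[5]; // all weights start at zero
        double[] x = withOffset(new double[] {5.1, 3.5, 1.4, 0.2});
        sgdStep(w, x, 1, 0.1);
        System.out.println("weights after one step: " + Arrays.toString(w));
        System.out.println("intercept weight w[0]:  " + w[0]);
    }
}
```

Without the constant slot, the score for an all-zero predictor vector is 0
whatever the weights are, so in this sketch the decision boundary would be
forced through the origin. The extra weight lets the boundary shift, which
is a plausible reason for the accuracy drop when the line is removed.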




On Mon, Jan 6, 2014 at 8:31 AM, Frank Scholten <frank@frankscholten.nl> wrote:

> Hi,
>
> I am studying the LR / SGD code and I was wondering why in the iris test
> case the first element of each vector is set to 1 in the loop parsing the
> CSV file via v.set(0, 1):
>
>     for (String line : raw.subList(1, raw.size())) {
>       // order gets a list of indexes
>       order.add(order.size());
>
>       // parse the predictor variables
>       Vector v = new DenseVector(5);
>       v.set(0, 1);
>       int i = 1;
>       Iterable<String> values = onComma.split(line);
>       for (String value : Iterables.limit(values, 4)) {
>         v.set(i++, Double.parseDouble(value));
>       }
>       data.add(v);
>
>       // and the target
>       target.add(dict.intern(Iterables.get(values, 4)));
>     }
>
> If I remove the line the accuracy drops to 92% but I don't know why this is
> happening. Where is this first element used throughout the algorithm?
>
> Cheers,
>
> Frank
>
