flink-dev mailing list archives

From Alexander Alexandrov <alexander.s.alexand...@gmail.com>
Subject Re: MultipleLinearRegression - Strange results
Date Mon, 01 Jun 2015 15:39:33 GMT
I've seen some work on adaptive learning rates over the past few days.

Maybe we can think about extending the base algorithm and comparing the use
case setting for the IMPRO-3 project.

@Felix, you can discuss this with the others on Wednesday. Manu will also be
there and can give some feedback. I'll try to send a link tomorrow
morning...


2015-06-01 20:33 GMT+10:00 Till Rohrmann <trohrmann@apache.org>:

> Since MLR uses stochastic gradient descent, you probably have to configure
> the step size correctly. SGD is very sensitive to the choice of step size.
> If the step size is too high, the SGD algorithm does not converge. You
> can find the parameter description here [1].
>
> Cheers,
> Till
>
> [1] http://ci.apache.org/projects/flink/flink-docs-master/libs/ml/multiple_linear_regression.html
>
> On Mon, Jun 1, 2015 at 11:48 AM, Felix Neutatz <neutatz@googlemail.com>
> wrote:
>
> > Hi,
> >
> > I want to use MultipleLinearRegression, but I got really strange results.
> > So I tested it with the housing price dataset:
> >
> > http://archive.ics.uci.edu/ml/machine-learning-databases/housing/housing.data
> >
> > And here I get negative house prices - even when I use the training set as
> > dataset:
> > LabeledVector(-1.1901998613214253E78, DenseVector(1500.0, 2197.0, 2978.0,
> > 1369.0, 1451.0))
> > LabeledVector(-2.7411218018254747E78, DenseVector(4445.0, 4522.0, 4038.0,
> > 4223.0, 4868.0))
> > LabeledVector(-2.688526857613956E78, DenseVector(4522.0, 4038.0, 4351.0,
> > 4129.0, 4617.0))
> > LabeledVector(-1.3075960386971714E78, DenseVector(2001.0, 2059.0, 1992.0,
> > 2008.0, 2504.0))
> > LabeledVector(-1.476238770814297E78, DenseVector(1992.0, 1965.0, 1983.0,
> > 2300.0, 3811.0))
> > LabeledVector(-1.4298128754759792E78, DenseVector(2059.0, 1992.0, 1965.0,
> > 2425.0, 3178.0))
> > ...
> >
> > and a huge squared error:
> > Squared error: 4.799184832395361E159
> >
> > You can find my code here:
> >
> > https://github.com/FelixNeutatz/wikiTrends/blob/master/extraction/src/test/io/sanfran/wikiTrends/extraction/flink/Regression.scala
> >
> > Can you help me? What did I do wrong?
> >
> > Thank you for your help,
> > Felix
> >
>

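To illustrate the step size point from the thread, here is a minimal sketch in
plain Python. It does not use Flink's API, and it uses batch gradient descent
rather than SGD so that the run is deterministic. The feature values are the
first components of the vectors printed above; the labels, the step size of
0.1, the iteration counts, and the helper names fit_linear and standardize are
assumptions made up for this illustration. It is not a diagnosis of the linked
Regression.scala.

```python
def fit_linear(xs, ys, step_size, iterations):
    """Batch gradient descent for y ~ w * x + b with a single feature."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(iterations):
        # Gradient of the mean squared error (up to a constant factor of 2).
        grad_w = sum((w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum((w * x + b - y) for x, y in zip(xs, ys)) / n
        w -= step_size * grad_w
        b -= step_size * grad_b
    return w, b


def standardize(values):
    """Rescale values to zero mean and unit variance."""
    mean = sum(values) / len(values)
    std = (sum((v - mean) ** 2 for v in values) / len(values)) ** 0.5
    return [(v - mean) / std for v in values]


# Feature values in the thousands, as in the output quoted in the thread.
xs = [1500.0, 2001.0, 1992.0, 2059.0, 4445.0, 4522.0]
# Synthetic labels (an assumption): an exact linear function of the feature.
ys = [2.0 * x + 100.0 for x in xs]

# Raw features: the step size is far too large for inputs of this magnitude,
# so the weight overshoots further on every iteration and blows up.
raw_w, raw_b = fit_linear(xs, ys, step_size=0.1, iterations=20)

# Standardized features: the same step size now converges.
zs = standardize(xs)
scaled_w, scaled_b = fit_linear(zs, ys, step_size=0.1, iterations=500)
scaled_prediction = scaled_w * zs[0] + scaled_b

print("weight on raw features:", raw_w)
print("prediction on scaled features:", scaled_prediction, "label:", ys[0])
```

With the raw features the learned weight grows to an astronomically large
magnitude, much like the 1E78 predictions quoted above, while the standardized
run recovers the labels. Two remedies would be worth trying under these
assumptions: a much smaller step size, or rescaling the features before
training.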