flink-dev mailing list archives

From Till Rohrmann <till.rohrm...@gmail.com>
Subject Re: MultipleLinearRegression - Strange results
Date Tue, 02 Jun 2015 09:31:29 GMT
Great to hear. This should no longer be a pain point once we support proper
cross validation.
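
The grid search mentioned in this thread can be done by hand until cross validation is available. Below is a minimal sketch in plain Python, not Flink's implementation: the helper names (train, squared_error, grid_search), the made-up data, and the candidate step sizes are all assumptions for illustration. It trains a one-weight linear model for each initial step size, assuming a 1/sqrt(iteration) decay, and keeps the value with the lowest squared error on held-out samples. Diverged runs are scored as infinity.

```python
import math

def train(samples, eta0, epochs=40):
    """Fit y ~ w * x by per-sample gradient steps of size eta0 / sqrt(t)."""
    w, t = 0.0, 0
    for _ in range(epochs):
        for x, y in samples:
            t += 1
            w -= eta0 / math.sqrt(t) * (w * x - y) * x
    return w

def squared_error(w, samples):
    """Sum of squared residuals; uses r * r so overflow gives inf, not an exception."""
    total = 0.0
    for x, y in samples:
        r = w * x - y
        total += r * r
    return total

def grid_search(train_set, validation_set, grid):
    """Return (best_eta0, scores), where scores maps eta0 to validation error."""
    scores = {}
    for eta0 in grid:
        err = squared_error(train(train_set, eta0), validation_set)
        # A diverged run yields inf or nan; rank both as the worst possible score.
        scores[eta0] = err if math.isfinite(err) else math.inf
    return min(scores, key=scores.get), scores

# Made-up data with unscaled features in the thousands (roughly y = 2 * x).
train_set = [(1500.0, 3010.0), (2000.0, 3990.0), (2500.0, 5020.0),
             (3000.0, 5980.0), (4000.0, 8000.0)]
validation_set = [(1800.0, 3600.0), (3500.0, 7000.0)]
grid = [1e-1, 1e-3, 1e-5, 1e-7, 1e-9]

best_eta0, scores = grid_search(train_set, validation_set, grid)
print("best initial step size:", best_eta0)
for eta0 in grid:
    print(eta0, scores[eta0])
```

Spacing the candidates by powers of ten keeps the grid small while still covering the range in which the step size goes from diverging to barely moving.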

On Tue, Jun 2, 2015 at 11:11 AM, Felix Neutatz <neutatz@googlemail.com>
wrote:

> Yes, grid search solved the problem :)
>
> 2015-06-02 11:07 GMT+02:00 Till Rohrmann <till.rohrmann@gmail.com>:
>
> > The SGD algorithm adapts the learning rate accordingly. However, this does
> > not help if you choose an initial learning rate that is too large, because
> > then the first iterations produce a weight vector from which it takes a
> > really long time to recover.
> >
> > Cheers,
> > Till
> >
> > On Mon, Jun 1, 2015 at 7:15 PM, Sachin Goel <sachingoel0101@gmail.com>
> > wrote:
> >
> > > You can set the learning rate to be 1/sqrt(iteration number). This usually
> > > works.
> > >
> > > Regards
> > > Sachin Goel
> > >
> > > On Mon, Jun 1, 2015 at 9:09 PM, Alexander Alexandrov <
> > > alexander.s.alexandrov@gmail.com> wrote:
> > >
> > > > I've seen some work on adaptive learning rates in the past few days.
> > > >
> > > > Maybe we can think about extending the base algorithm and comparing the
> > > > use case setting for the IMPRO-3 project.
> > > >
> > > > @Felix you can discuss this with the others on Wednesday. Manu will also
> > > > be there and can give some feedback. I'll try to send a link tomorrow
> > > > morning...
> > > >
> > > >
> > > > 2015-06-01 20:33 GMT+10:00 Till Rohrmann <trohrmann@apache.org>:
> > > >
> > > > > Since MLR uses stochastic gradient descent, you probably have to configure
> > > > > the step size correctly. SGD is very sensitive to the choice of step size.
> > > > > If the step size is too high, then the SGD algorithm does not converge. You
> > > > > can find the parameter description here [1].
> > > > >
> > > > > Cheers,
> > > > > Till
> > > > >
> > > > > [1] http://ci.apache.org/projects/flink/flink-docs-master/libs/ml/multiple_linear_regression.html
> > > > >
> > > > > On Mon, Jun 1, 2015 at 11:48 AM, Felix Neutatz <neutatz@googlemail.com>
> > > > > wrote:
> > > > >
> > > > > > Hi,
> > > > > >
> > > > > > I want to use MultipleLinearRegression, but I got really strange results.
> > > > > > So I tested it with the housing price dataset:
> > > > > >
> > > > > > http://archive.ics.uci.edu/ml/machine-learning-databases/housing/housing.data
> > > > > >
> > > > > > And here I get negative house prices - even when I use the training set as
> > > > > > dataset:
> > > > > > LabeledVector(-1.1901998613214253E78, DenseVector(1500.0, 2197.0, 2978.0, 1369.0, 1451.0))
> > > > > > LabeledVector(-2.7411218018254747E78, DenseVector(4445.0, 4522.0, 4038.0, 4223.0, 4868.0))
> > > > > > LabeledVector(-2.688526857613956E78, DenseVector(4522.0, 4038.0, 4351.0, 4129.0, 4617.0))
> > > > > > LabeledVector(-1.3075960386971714E78, DenseVector(2001.0, 2059.0, 1992.0, 2008.0, 2504.0))
> > > > > > LabeledVector(-1.476238770814297E78, DenseVector(1992.0, 1965.0, 1983.0, 2300.0, 3811.0))
> > > > > > LabeledVector(-1.4298128754759792E78, DenseVector(2059.0, 1992.0, 1965.0, 2425.0, 3178.0))
> > > > > > ...
> > > > > >
> > > > > > and a huge squared error:
> > > > > > Squared error: 4.799184832395361E159
> > > > > >
> > > > > > You can find my code here:
> > > > > >
> > > > > > https://github.com/FelixNeutatz/wikiTrends/blob/master/extraction/src/test/io/sanfran/wikiTrends/extraction/flink/Regression.scala
> > > > > >
> > > > > > Can you help me? What did I do wrong?
> > > > > >
> > > > > > Thank you for your help,
> > > > > > Felix
> > > > > >
> > > > >
> > > >
> > >
> >
>
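
The step-size discussion above can be illustrated with a small sketch. It is plain Python under stated assumptions, not Flink's MLR code: the names error_trace and first_shrinking_step, the single made-up sample, and the one-weight model y ~ w * x are hypothetical, and the step size is assumed to decay as eta0 / sqrt(t) as suggested in the thread. For this model each update multiplies the error by (1 - eta0 * x^2 / sqrt(t)), so the error only starts shrinking once t > (eta0 * x^2 / 2)^2. With unscaled features in the thousands, a large eta0 pushes that point very far out, which is one way to read "it takes really long to recover".

```python
import math

def error_trace(x, y, eta0, steps, w0=0.0):
    """Repeat gradient steps on one sample (x, y) for the model y ~ w * x.

    Step t uses the step size eta0 / sqrt(t).
    Returns the distance |w - y/x| to the exact solution after every step.
    """
    w, target, trace = w0, y / x, []
    for t in range(1, steps + 1):
        w -= eta0 / math.sqrt(t) * (w * x - y) * x
        trace.append(abs(w - target))
    return trace

def first_shrinking_step(x, eta0):
    """Smallest t with |1 - eta0 * x * x / sqrt(t)| < 1, i.e. t > (eta0 * x * x / 2)**2."""
    return math.floor((eta0 * x * x / 2.0) ** 2) + 1

# Made-up sample with an unscaled feature value, similar in size to the thread's data.
x, y = 2000.0, 4000.0

too_large = error_trace(x, y, eta0=0.1, steps=10)
small = error_trace(x, y, eta0=1e-7, steps=10)

print("error after 10 steps, eta0=0.1: ", too_large[-1])
print("error after 10 steps, eta0=1e-7:", small[-1])
print("first shrinking step, eta0=0.1: ", first_shrinking_step(x, 0.1))
print("first shrinking step, eta0=1e-7:", first_shrinking_step(x, 1e-7))
```

The decay does eventually make the large initial step size contract, but only after the error has grown by many orders of magnitude, so in practice a smaller initial value is the fix.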
