Have you experimented with it? For logistic regression at least, given
enough iterations and a tight enough tolerance, BFGS either way should
converge to the same solution...
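As a toy check of that claim, here is a minimal sketch in plain Scala (not Spark code; plain gradient descent stands in for BFGS, and the data and step size are made up): with a convex logistic loss and enough iterations, different starting points land on the same weights.

```scala
// Deliberately non-separable toy data so the optimum is finite.
val xs = Array(-2.0, -1.0, -0.5, 0.5, 1.0, 2.0, 1.0, -1.0)
val ys = Array(0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 0.0, 1.0)

def sigmoid(z: Double): Double = 1.0 / (1.0 + math.exp(-z))

// Gradient of the average logistic loss for a single weight, no intercept.
def gradient(w: Double): Double =
  xs.zip(ys).map { case (x, y) => (sigmoid(w * x) - y) * x }.sum / xs.length

// Plain gradient descent; the loss is convex, so the start point should
// not matter once we run long enough.
def minimize(w0: Double, iters: Int = 5000, step: Double = 1.0): Double = {
  var w = w0
  for (_ <- 0 until iters) w -= step * gradient(w)
  w
}

val wFromZero = minimize(0.0)   // start at the origin
val wFromFar  = minimize(10.0)  // start far away
```

Both runs end at (numerically) the same minimizer, which is the point being made about BFGS above.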
On Tue, Apr 8, 2014 at 4:19 PM, DB Tsai <dbtsai@stanford.edu> wrote:
> I think mini batch is still useful for LBFGS.
>
> One of the use cases is initializing the weights by training on a
> smaller subsample of the data using mini batch with LBFGS.
>
> Then we could use the weights trained with mini batch to start another
> training process with full data.
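A minimal sketch of that warm-start idea in plain Scala (illustrative only, not Spark's API; gradient descent on a toy least-squares problem stands in for mini batch + LBFGS): solve cheaply on a subsample, then start the full-data run from those weights.

```scala
val fullData  = Array(1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0)
val subsample = fullData.take(2)   // the cheap "mini batch"

// Fit a single constant w to the data under squared loss; the optimum is
// the mean of the data, so convergence is easy to verify.
def grad(data: Array[Double], w: Double): Double =
  2.0 * data.map(w - _).sum / data.length

// Run gradient descent until |gradient| < tol; return (w, iterations used).
def solve(data: Array[Double], w0: Double, step: Double = 0.25,
          tol: Double = 1e-9): (Double, Int) = {
  var w = w0; var n = 0
  while (math.abs(grad(data, w)) >= tol) { w -= step * grad(data, w); n += 1 }
  (w, n)
}

val (wWarm, _)      = solve(subsample, 0.0)   // cheap pass on the subsample
val (wCold, nCold)  = solve(fullData, 0.0)    // cold start on the full data
val (wFinal, nWarm) = solve(fullData, wWarm)  // warm start on the full data
```

The warm start begins closer to the full-data optimum, so the second run needs no more iterations than the cold start, which is the point of seeding LBFGS with mini-batch weights.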
>
> Sincerely,
>
> DB Tsai
> 
> My Blog: https://www.dbtsai.com
> LinkedIn: https://www.linkedin.com/in/dbtsai
>
>
> On Tue, Apr 8, 2014 at 4:05 PM, Debasish Das <debasish.das83@gmail.com>
> wrote:
> > Yup, that's what I expected... the LBFGS solver is in the master and the
> > gradient computation per RDD is done on each of the workers...
> >
> > This miniBatchFraction is also a heuristic which I don't think makes
> > sense for LogisticRegressionWithBFGS... does it?
> >
> >
> > On Tue, Apr 8, 2014 at 3:44 PM, DB Tsai <dbtsai@stanford.edu> wrote:
> >>
> >> Hi Debasish,
> >>
> >> The LBFGS solver will be in the master like GD solver, and the part
> >> that is parallelized is computing the gradient of each input row, and
> >> summing them up.
> >>
> >> I prefer to make the optimizer pluggable instead of adding a new
> >> LogisticRegressionWithLBFGS, since 98% of the code will be the same.
> >>
> >> It would be nice to have something like this:
> >>
> >> class LogisticRegression private (var optimizer: Optimizer)
> >>   extends GeneralizedLinearAlgorithm[LogisticRegressionModel]
> >>
> >> The following parameters will be set up in the optimizers, and they
> >> should be, because they are optimization parameters.
> >>
> >> var stepSize: Double,
> >> var numIterations: Int,
> >> var regParam: Double,
> >> var miniBatchFraction: Double
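A self-contained sketch of what this pluggable shape could look like (plain Scala on a toy problem; `Optimizer`, `GradientDescent`, and `ToyRegression` here are illustrative stand-ins, not the actual MLlib interfaces):

```scala
trait Optimizer {
  // Minimize a function given its gradient, from an initial weight.
  // Step size, iteration count, etc. live inside the concrete optimizer.
  def optimize(gradient: Double => Double, w0: Double): Double
}

class GradientDescent(stepSize: Double, numIterations: Int) extends Optimizer {
  def optimize(gradient: Double => Double, w0: Double): Double = {
    var w = w0
    for (_ <- 0 until numIterations) w -= stepSize * gradient(w)
    w
  }
}

// The learner is parameterized by the optimizer, so swapping GD for
// LBFGS needs no new LogisticRegressionWith* class.
class ToyRegression(val optimizer: Optimizer) {
  // Fit a constant to the data under squared loss (optimum = the mean).
  def fit(data: Array[Double]): Double =
    optimizer.optimize(w => 2.0 * data.map(w - _).sum / data.length, 0.0)
}

val model = new ToyRegression(new GradientDescent(0.25, 100))
val w = model.fit(Array(1.0, 2.0, 3.0))
```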
> >>
> >> Xiangrui, what do you think?
> >>
> >> For now, you can use my LBFGS solver by copying and pasting the
> >> LogisticRegressionWithSGD code, and changing the optimizer to LBFGS.
> >>
> >> Sincerely,
> >>
> >> DB Tsai
> >> 
> >> My Blog: https://www.dbtsai.com
> >> LinkedIn: https://www.linkedin.com/in/dbtsai
> >>
> >>
> >> On Tue, Apr 8, 2014 at 9:42 AM, Debasish Das <debasish.das83@gmail.com>
> >> wrote:
> >> > Hi DB,
> >> >
> >> > Are we going to clean up the function:
> >> >
> >> > class LogisticRegressionWithSGD private (
> >> >     var stepSize: Double,
> >> >     var numIterations: Int,
> >> >     var regParam: Double,
> >> >     var miniBatchFraction: Double)
> >> >   extends GeneralizedLinearAlgorithm[LogisticRegressionModel]
> >> >   with Serializable {
> >> >
> >> >   val gradient = new LogisticGradient()
> >> >   val updater = new SimpleUpdater()
> >> >   override val optimizer = new GradientDescent(gradient, updater)
> >> >
> >> > Or add a new one ?
> >> >
> >> > class LogisticRegressionWithBFGS ?
> >> >
> >> > The WithABC suffix is optional, since the optimizer could be picked
> >> > based on a flag... there are only 3 options for the optimizer:
> >> >
> >> > 1. GradientDescent
> >> > 2. Quasi-Newton
> >> > 3. Newton
> >> >
> >> > Maybe we add an enum for the optimization type... and then under the
> >> > GradientDescent family people can add their variants of SGD... Not
> >> > sure if ConjugateGradient comes under 1 or 2... maybe we need 4
> >> > options...
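One way to sketch that enum idea in plain Scala (illustrative names only, not a proposal for the actual MLlib API): a sealed trait gives an exhaustively checked set of optimizer families, and ConjugateGradient can simply be its own case if it fits neither 1 nor 2.

```scala
// A sealed trait acts as the enum: the compiler checks that every
// family is handled when we match on it.
sealed trait OptimizerType
case object GradientDescentType   extends OptimizerType
case object QuasiNewtonType       extends OptimizerType
case object NewtonType            extends OptimizerType
case object ConjugateGradientType extends OptimizerType

def describe(t: OptimizerType): String = t match {
  case GradientDescentType   => "first-order: SGD and its variants"
  case QuasiNewtonType       => "approximate curvature: BFGS / LBFGS"
  case NewtonType            => "second-order: exact Hessian"
  case ConjugateGradientType => "first-order with conjugate directions"
}

val chosen: OptimizerType = QuasiNewtonType
```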
> >> >
> >> > Thanks.
> >> > Deb
> >> >
> >> >
> >> > On Mon, Apr 7, 2014 at 11:23 PM, Debasish Das
> >> > <debasish.das83@gmail.com> wrote:
> >> >>
> >> >> I got your checkin... I need to run logistic regression SGD vs BFGS
> >> >> for my current usecases, but your next checkin will update the
> >> >> logistic regression with LBFGS, right? Are you adding it to the
> >> >> regression package as well?
> >> >>
> >> >> Thanks.
> >> >> Deb
> >> >>
> >> >>
> >> >> On Mon, Apr 7, 2014 at 7:00 PM, DB Tsai <dbtsai@stanford.edu> wrote:
> >> >>>
> >> >>> Hi guys,
> >> >>>
> >> >>> The latest PR uses Breeze's LBFGS implementation, which is
> >> >>> introduced by Xiangrui's sparse input format work in SPARK-1212.
> >> >>>
> >> >>> https://github.com/apache/spark/pull/353
> >> >>>
> >> >>> Now, it works with the new sparse framework!
> >> >>>
> >> >>> Any feedback would be greatly appreciated.
> >> >>>
> >> >>> Thanks.
> >> >>>
> >> >>> Sincerely,
> >> >>>
> >> >>> DB Tsai
> >> >>> 
> >> >>> My Blog: https://www.dbtsai.com
> >> >>> LinkedIn: https://www.linkedin.com/in/dbtsai
> >> >>>
> >> >>>
> >> >>> On Thu, Apr 3, 2014 at 5:02 PM, DB Tsai <dbtsai@alpinenow.com> wrote:
> >> >>> > ---------- Forwarded message ----------
> >> >>> > From: David Hall <dlwh@cs.berkeley.edu>
> >> >>> > Date: Sat, Mar 15, 2014 at 10:02 AM
> >> >>> > Subject: Re: MLlib - Thoughts about refactoring Updater for LBFGS?
> >> >>> > To: DB Tsai <dbtsai@alpinenow.com>
> >> >>> >
> >> >>> >
> >> >>> > On Fri, Mar 7, 2014 at 10:56 PM, DB Tsai <dbtsai@alpinenow.com>
> >> >>> > wrote:
> >> >>> >>
> >> >>> >> Hi David,
> >> >>> >>
> >> >>> >> Please let me know the version of Breeze in which LBFGS can be
> >> >>> >> serialized and CachedDiffFunction is built into LBFGS, once you
> >> >>> >> finish. I'll update the PR to Spark from using the RISO
> >> >>> >> implementation to the Breeze implementation.
> >> >>> >
> >> >>> >
> >> >>> > The current master (0.7-SNAPSHOT) has these problems fixed.
> >> >>> >
> >> >>> >>
> >> >>> >>
> >> >>> >> Thanks.
> >> >>> >>
> >> >>> >> Sincerely,
> >> >>> >>
> >> >>> >> DB Tsai
> >> >>> >> Machine Learning Engineer
> >> >>> >> Alpine Data Labs
> >> >>> >> 
> >> >>> >> Web: http://alpinenow.com/
> >> >>> >>
> >> >>> >>
> >> >>> >> On Thu, Mar 6, 2014 at 4:26 PM, David Hall <dlwh@cs.berkeley.edu
> >
> >> >>> >> wrote:
> >> >>> >> > On Thu, Mar 6, 2014 at 4:21 PM, DB Tsai <dbtsai@alpinenow.com>
> >> >>> >> > wrote:
> >> >>> >> >
> >> >>> >> >> Hi David,
> >> >>> >> >>
> >> >>> >> >> I can converge to the same result with your breeze LBFGS and
> >> >>> >> >> Fortran implementations now. Probably I made some mistakes
> >> >>> >> >> when I tried breeze before. I apologize for claiming it's not
> >> >>> >> >> stable.
> >> >>> >> >>
> >> >>> >> >> See the test case in BreezeLBFGSSuite.scala
> >> >>> >> >> https://github.com/AlpineNow/spark/tree/dbtsaibreezeLBFGS
> >> >>> >> >>
> >> >>> >> >> This is training multinomial logistic regression against the
> >> >>> >> >> iris dataset, and both optimizers can train the models with
> >> >>> >> >> 98% training accuracy.
> >> >>> >> >>
> >> >>> >> >
> >> >>> >> > great to hear! There were some bugs in LBFGS about 6 months
> >> >>> >> > ago, so depending on the last time you tried it, it might
> >> >>> >> > indeed have been bugged.
> >> >>> >> >
> >> >>> >> >
> >> >>> >> >>
> >> >>> >> >> There are two issues with using Breeze in Spark:
> >> >>> >> >>
> >> >>> >> >> 1) When the gradientSum and lossSum are computed distributively
> >> >>> >> >> in a custom-defined DiffFunction which will be passed into your
> >> >>> >> >> optimizer, Spark will complain that the LBFGS class is not
> >> >>> >> >> serializable. In BreezeLBFGS.scala, I have to convert the RDD
> >> >>> >> >> to an array to make it work locally. It should be easy to fix
> >> >>> >> >> by just having LBFGS implement Serializable.
> >> >>> >> >>
> >> >>> >> >
> >> >>> >> > I'm not sure why Spark should be serializing LBFGS? Shouldn't
> >> >>> >> > it live on the controller node? Or is this a per-node thing?
> >> >>> >> >
> >> >>> >> > But no problem to make it serializable.
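For reference, a minimal sketch of the constraint being discussed (plain Scala; `LbfgsLike` is a made-up stand-in, not Breeze's class): anything captured by a closure that Spark ships to workers goes through Java serialization, so it must be Serializable, which a round trip through the standard streams demonstrates.

```scala
import java.io._

// Hypothetical optimizer state that ends up captured by a distributed
// DiffFunction; marking it Serializable is all that's needed.
class LbfgsLike extends Serializable {
  val maxIter = 100
}

// Serialize to bytes and back, as Spark's closure shipping would.
def roundTrip[T <: Serializable](obj: T): T = {
  val bytes = new ByteArrayOutputStream()
  val out = new ObjectOutputStream(bytes)
  out.writeObject(obj); out.close()
  val in = new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray))
  in.readObject().asInstanceOf[T]
}

val copy = roundTrip(new LbfgsLike)
```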
> >> >>> >> >
> >> >>> >> >
> >> >>> >> >>
> >> >>> >> >> 2) Breeze computes redundant gradient and loss. See the
> >> >>> >> >> following log from both Fortran and Breeze implementations.
> >> >>> >> >>
> >> >>> >> >
> >> >>> >> > Err, yeah. I should probably have LBFGS do this automatically,
> >> >>> >> > but there's a CachedDiffFunction that gets rid of the redundant
> >> >>> >> > calculations.
> >> >>> >> >
> >> >>> >> > - David
> >> >>> >> >
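The caching idea can be sketched in a few lines of plain Scala (illustrative only, not Breeze's actual CachedDiffFunction): remember the last evaluation point so that repeated requests for the same point, as in the duplicated iterations in the log below, don't recompute the expensive loss and gradient.

```scala
// Counter for how often the "expensive" function really runs.
var evaluations = 0

// The expensive (possibly distributed) function: (loss, gradient) at w.
def costly(w: Double): (Double, Double) = {
  evaluations += 1
  (w * w, 2 * w)
}

// Wrapper that caches the last evaluation point and its result.
class CachedFn(f: Double => (Double, Double)) {
  private var last: Option[(Double, (Double, Double))] = None
  def apply(w: Double): (Double, Double) = last match {
    case Some((x, v)) if x == w => v    // cache hit: skip recomputation
    case _ =>
      val v = f(w)
      last = Some((w, v)); v
  }
}

val cached = new CachedFn(costly)
// Three requests at 3.0 cost one real evaluation; 4.0 costs one more.
val results = Vector(cached(3.0), cached(3.0), cached(3.0), cached(4.0))
```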
> >> >>> >> >
> >> >>> >> >>
> >> >>> >> >> Thanks.
> >> >>> >> >>
> >> >>> >> >> Fortran:
> >> >>> >> >> Iteration -1: loss 1.3862943611198926, diff 1.0
> >> >>> >> >> Iteration 0: loss 1.5846343143210866, diff 0.14307193024217352
> >> >>> >> >> Iteration 1: loss 1.1242501524477688, diff 0.29053004039012126
> >> >>> >> >> Iteration 2: loss 1.0930151243303563, diff 0.027782962952189336
> >> >>> >> >> Iteration 3: loss 1.054036932835569, diff 0.03566113127440601
> >> >>> >> >> Iteration 4: loss 0.9907956302751622, diff 0.05999907649459571
> >> >>> >> >> Iteration 5: loss 0.9184205380342829, diff 0.07304737423337761
> >> >>> >> >> Iteration 6: loss 0.8259870936519937, diff 0.10064381175132982
> >> >>> >> >> Iteration 7: loss 0.6327447552109574, diff 0.23395293458364716
> >> >>> >> >> Iteration 8: loss 0.5534101162436359, diff 0.1253815427665277
> >> >>> >> >> Iteration 9: loss 0.4045020086612566, diff 0.26907321376758075
> >> >>> >> >> Iteration 10: loss 0.3078824990823728, diff 0.23885980452569627
> >> >>> >> >>
> >> >>> >> >> Breeze:
> >> >>> >> >> Iteration -1: loss 1.3862943611198926, diff 1.0
> >> >>> >> >> Mar 6, 2014 3:59:11 PM com.github.fommil.netlib.BLAS <clinit>
> >> >>> >> >> WARNING: Failed to load implementation from:
> >> >>> >> >> com.github.fommil.netlib.NativeSystemBLAS
> >> >>> >> >> Mar 6, 2014 3:59:11 PM com.github.fommil.netlib.BLAS <clinit>
> >> >>> >> >> WARNING: Failed to load implementation from:
> >> >>> >> >> com.github.fommil.netlib.NativeRefBLAS
> >> >>> >> >> Iteration 0: loss 1.3862943611198926, diff 0.0
> >> >>> >> >> Iteration 1: loss 1.5846343143210866, diff 0.14307193024217352
> >> >>> >> >> Iteration 2: loss 1.1242501524477688, diff 0.29053004039012126
> >> >>> >> >> Iteration 3: loss 1.1242501524477688, diff 0.0
> >> >>> >> >> Iteration 4: loss 1.1242501524477688, diff 0.0
> >> >>> >> >> Iteration 5: loss 1.0930151243303563, diff 0.027782962952189336
> >> >>> >> >> Iteration 6: loss 1.0930151243303563, diff 0.0
> >> >>> >> >> Iteration 7: loss 1.0930151243303563, diff 0.0
> >> >>> >> >> Iteration 8: loss 1.054036932835569, diff 0.03566113127440601
> >> >>> >> >> Iteration 9: loss 1.054036932835569, diff 0.0
> >> >>> >> >> Iteration 10: loss 1.054036932835569, diff 0.0
> >> >>> >> >> Iteration 11: loss 0.9907956302751622, diff 0.05999907649459571
> >> >>> >> >> Iteration 12: loss 0.9907956302751622, diff 0.0
> >> >>> >> >> Iteration 13: loss 0.9907956302751622, diff 0.0
> >> >>> >> >> Iteration 14: loss 0.9184205380342829, diff 0.07304737423337761
> >> >>> >> >> Iteration 15: loss 0.9184205380342829, diff 0.0
> >> >>> >> >> Iteration 16: loss 0.9184205380342829, diff 0.0
> >> >>> >> >> Iteration 17: loss 0.8259870936519939, diff 0.1006438117513297
> >> >>> >> >> Iteration 18: loss 0.8259870936519939, diff 0.0
> >> >>> >> >> Iteration 19: loss 0.8259870936519939, diff 0.0
> >> >>> >> >> Iteration 20: loss 0.6327447552109576, diff 0.233952934583647
> >> >>> >> >> Iteration 21: loss 0.6327447552109576, diff 0.0
> >> >>> >> >> Iteration 22: loss 0.6327447552109576, diff 0.0
> >> >>> >> >> Iteration 23: loss 0.5534101162436362, diff 0.12538154276652747
> >> >>> >> >> Iteration 24: loss 0.5534101162436362, diff 0.0
> >> >>> >> >> Iteration 25: loss 0.5534101162436362, diff 0.0
> >> >>> >> >> Iteration 26: loss 0.40450200866125635, diff 0.2690732137675816
> >> >>> >> >> Iteration 27: loss 0.40450200866125635, diff 0.0
> >> >>> >> >> Iteration 28: loss 0.40450200866125635, diff 0.0
> >> >>> >> >> Iteration 29: loss 0.30788249908237314, diff 0.23885980452569502
> >> >>> >> >>
> >> >>> >> >> Sincerely,
> >> >>> >> >>
> >> >>> >> >> DB Tsai
> >> >>> >> >> Machine Learning Engineer
> >> >>> >> >> Alpine Data Labs
> >> >>> >> >> 
> >> >>> >> >> Web: http://alpinenow.com/
> >> >>> >> >>
> >> >>> >> >>
> >> >>> >> >> On Wed, Mar 5, 2014 at 2:00 PM, David Hall
> >> >>> >> >> <dlwh@cs.berkeley.edu>
> >> >>> >> >> wrote:
> >> >>> >> >> > On Wed, Mar 5, 2014 at 1:57 PM, DB Tsai
> >> >>> >> >> > <dbtsai@alpinenow.com> wrote:
> >> >>> >> >> >
> >> >>> >> >> >> Hi David,
> >> >>> >> >> >>
> >> >>> >> >> >> On Tue, Mar 4, 2014 at 8:13 PM, dlwh
> >> >>> >> >> >> <david.lw.hall@gmail.com>
> >> >>> >> >> >> wrote:
> >> >>> >> >> >> > I'm happy to help fix any problems. I've verified at
> >> >>> >> >> >> > points that the implementation gives the exact same
> >> >>> >> >> >> > sequence of iterates for a few different functions (with
> >> >>> >> >> >> > a particular line search) as the c port of lbfgs. So I'm
> >> >>> >> >> >> > a little surprised it fails where Fortran succeeds... but
> >> >>> >> >> >> > only a little. This was fixed late last year.
> >> >>> >> >> >> I'm working on a reproducible test case using the breeze
> >> >>> >> >> >> vs. the fortran implementation to show the problem I've run
> >> >>> >> >> >> into. The test will be in one of the test cases in my Spark
> >> >>> >> >> >> fork; is it okay for you to investigate the issue there? Or
> >> >>> >> >> >> do I need to make it a standalone test?
> >> >>> >> >> >>
> >> >>> >> >> >
> >> >>> >> >> >
> >> >>> >> >> > Um, as long as it wouldn't be too hard to pull out.
> >> >>> >> >> >
> >> >>> >> >> >
> >> >>> >> >> >>
> >> >>> >> >> >> Will send you the test later today.
> >> >>> >> >> >>
> >> >>> >> >> >> Thanks.
> >> >>> >> >> >>
> >> >>> >> >> >> Sincerely,
> >> >>> >> >> >>
> >> >>> >> >> >> DB Tsai
> >> >>> >> >> >> Machine Learning Engineer
> >> >>> >> >> >> Alpine Data Labs
> >> >>> >> >> >> 
> >> >>> >> >> >> Web: http://alpinenow.com/
> >> >>> >> >> >>
> >> >>> >> >>
> >> >>> >
> >> >>> >
> >> >>> >
> >> >>
> >> >>
> >> >
> >
> >
>
