hivemall-user mailing list archives

From Makoto Yui <m...@apache.org>
Subject Re: Early Stopping / FFMs performance
Date Wed, 11 Oct 2017 06:17:54 GMT
Shadi,

- First release (v0.5.0) Nov, 2017
We plan to make the first Apache release at the beginning of Nov.
We are currently in a feature freeze, except for minor patches.

FFM is included but still in beta.

- 2nd release (v0.5.1) Dec, 2017
word2vec and FFM are skipped in the first release and will be included
in the 2nd release in late Dec.

- 3rd release (v0.6) Q1, 2018
xgboost and multinomial logistic regression will be introduced in
the 3rd release in Q1, 2018.
https://github.com/apache/incubator-hivemall/pull/93

Thanks,
Makoto

2017-10-11 2:02 GMT+09:00 Shadi Mari <shadimari@gmail.com>:
> Do you have an anticipated timeframe for moving from Beta to GA? My
> observation is that Hivemall releases are not very frequent, so I would like
> to get an idea of the timeframe for your next release cycle.
>
> Many thanks
>
> ________________________________
> From: Makoto Yui <myui@apache.org>
> Sent: Tuesday, October 10, 2017 7:33:37 PM
> To: user@hivemall.incubator.apache.org
> Cc: shadimari@gmail.com
> Subject: Re: Early Stopping / FFMs performance
>
> Hi,
>
> 2017-10-11 1:10 GMT+09:00 Shadi Mari <shadimari@gmail.com>:
>> I am using the Criteo 2014 dataset for CTR prediction, which is 45M examples
>> in total. Do you think 8 hours is still a reasonable training duration, given
>> that I am using your EMR configurations? I never assumed this could be so
>> time-consuming.
>
> I don't remember the exact number, but it took 5 hours or so with my previous
> setting:
> "-iters 10 -factors 4 -feature_hashing 20"
>
> FFM is a very computation-heavy algorithm, and training an FFM takes time.
>
>> I already built a version from the master branch. As per your feedback, I
>> assume the FFM implementation cannot yet be used in production. Correct?
>
> Yes, it's still in beta. Use it at your own risk.
>
> The FM implementation is stable and ready for production use.
>
> Hivemall's FFM supports a linear term and a global bias, as in plain FM;
> these are not supported in libffm.
> I have not yet obtained satisfactory prediction accuracy with the current FFM
> implementation on the Criteo 2014 task.
>
> It may be due to the default hyperparameter settings, such as the learning
> rate and l1/l2 params.
> https://www.kaggle.com/c/criteo-display-ad-challenge/discussion/10555
>
> Thanks,
> Makoto
>
> --
> Makoto YUI <myui AT apache.org>
> Research Engineer, Treasure Data, Inc.
> http://myui.github.io/
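
The quoted message says Hivemall's FFM adds a linear term and a global bias
on top of the field-aware pairwise interactions. The sketch below illustrates
what that distinction could look like when scoring one example. It is not
Hivemall's or libffm's code: the function name ffm_predict, the parameter
names (w0, w, V), and the (field, index, value) input layout are all
assumptions made for illustration only.

```python
import numpy as np

def ffm_predict(features, w0, w, V):
    """Raw FFM score for one example (illustrative sketch, not Hivemall code).

    features: list of (field, index, value) triples (hypothetical layout).
    w0: global bias; w: linear weights, shape (n_features,).
    V: latent factors, shape (n_features, n_fields, k).
    """
    # Global bias plus linear term, as in plain FM.
    score = w0 + sum(w[i] * x for _, i, x in features)
    # Field-aware pairwise interactions: feature a uses the latent vector
    # it keeps for b's field, and feature b the one it keeps for a's field.
    for a in range(len(features)):
        fa, ia, xa = features[a]
        for b in range(a + 1, len(features)):
            fb, ib, xb = features[b]
            score += np.dot(V[ia, fb], V[ib, fa]) * xa * xb
    return score
```

Setting w0 to 0 and w to all zeros leaves only the pairwise interaction term,
which is the part the message describes as common to both implementations.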



-- 
Makoto YUI <myui AT apache.org>
Research Engineer, Treasure Data, Inc.
http://myui.github.io/
