hivemall-user mailing list archives

From Makoto Yui <m...@apache.org>
Subject Re: Early Stopping / FFMs performance
Date Tue, 10 Oct 2017 16:33:37 GMT
Hi,

2017-10-11 1:10 GMT+09:00 Shadi Mari <shadimari@gmail.com>:
> I am using the Criteo 2014 dataset for CTR prediction, which is 45M examples in
> total. Do you think 8 hours is still a reasonable training duration given that
> I am using your EMR configurations? I never assumed this could be so time
> consuming.

I don't remember the exact number, but it took 5 hours or so with my previous setting:
"-iters 10 -factors 4 -feature_hashing 20"

FFM is a very computation-heavy algorithm, so training an FFM model takes time.
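
To illustrate why FFM costs more than plain FM, here is a rough
back-of-the-envelope sketch in Python. It is not Hivemall code. It assumes
that "-feature_hashing 20" means 2^20 hashed feature slots and that the
Criteo data has 39 fields (13 numeric + 26 categorical); the function names
are made up for this illustration.

```python
# Rough size/cost comparison between FM and FFM (illustrative only).

def fm_latent_params(num_features, factors):
    # FM keeps one latent vector per feature.
    return num_features * factors

def ffm_latent_params(num_features, num_fields, factors):
    # FFM keeps one latent vector per (feature, field) pair.
    return num_features * num_fields * factors

def pairwise_terms(nnz):
    # FFM evaluates every pair of non-zero features explicitly,
    # while FM can use its O(nnz * factors) reformulation.
    return nnz * (nnz - 1) // 2

num_features = 2 ** 20   # assumption: 20 hash bits
num_fields = 39          # assumption: Criteo has 13 numeric + 26 categorical fields
factors = 4              # from "-factors 4"

print("FM latent params :", fm_latent_params(num_features, factors))
print("FFM latent params:", ffm_latent_params(num_features, num_fields, factors))
print("FFM pairs per row:", pairwise_terms(num_fields))
```

Under these assumptions FFM holds 39 times as many latent parameters as FM,
and each training example touches several hundred feature pairs per
iteration.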

> I already built a version from the master branch. As per your feedback, I
> assume the FFM implementation cannot yet be used in production, correct?

Yes, it's still in beta. Use it at your own risk.

The FM implementation is stable and ready for production use.

Hivemall's FFM supports a linear term and a global bias, as in plain FM;
libffm does not support these.
I have not yet obtained satisfactory prediction accuracy with the current FFM
implementation on the Criteo 2014 task.
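
For reference, the difference between the two model forms can be sketched
as follows. This is a minimal Python illustration of the standard FFM score
with an optional global bias and linear term, not Hivemall's actual
implementation; the function name, argument layout, and the two switches
are hypothetical.

```python
# Minimal FFM scoring sketch (illustrative, not Hivemall's code).
# x: list of (field, feature, value) triples for one example
# w0: global bias, w: dict feature -> linear weight
# V: dict (feature, field) -> latent vector (list of floats)

def ffm_score(x, w0, w, V, use_bias=True, use_linear=True):
    score = w0 if use_bias else 0.0
    if use_linear:
        score += sum(w.get(j, 0.0) * val for _, j, val in x)
    # Field-aware pairwise interactions: feature a uses the latent vector
    # it keeps for b's field, and feature b the one it keeps for a's field.
    for a in range(len(x)):
        fa, ja, va = x[a]
        for b in range(a + 1, len(x)):
            fb, jb, vb = x[b]
            dot = sum(p * q for p, q in zip(V[(ja, fb)], V[(jb, fa)]))
            score += dot * va * vb
    return score

x = [(0, 1, 1.0), (1, 2, 1.0)]
w0 = 0.5
w = {1: 0.1, 2: 0.2}
V = {(1, 1): [1.0, 2.0], (2, 0): [3.0, 4.0]}

with_terms = ffm_score(x, w0, w, V)                                     # bias + linear + pairwise
pairwise_only = ffm_score(x, w0, w, V, use_bias=False, use_linear=False)  # pairwise only
print(with_terms, pairwise_only)
```

Passing use_bias=False and use_linear=False gives the pairwise-only form
that the message attributes to libffm.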

It may be due to the default hyperparameter settings, such as the learning rate
and the L1/L2 regularization parameters.
https://www.kaggle.com/c/criteo-display-ad-challenge/discussion/10555

Thanks,
Makoto

-- 
Makoto YUI <myui AT apache.org>
Research Engineer, Treasure Data, Inc.
http://myui.github.io/
