hivemall-user mailing list archives

From Makoto Yui <m...@apache.org>
Subject Re: Hivemall FFM & Criteo Dataset - LogLoss counter
Date Wed, 18 Oct 2017 14:30:46 GMT
Note that Hivemall prints the cumulative log loss, while libffm returns the
average log loss.

[1] https://weichangshuai.files.wordpress.com/2016/05/stat_241.pdf
[2] https://github.com/guestwalk/libffm/blob/d336e88da05356bdf1bcd4482a0846925b2630ce/ffm.cpp#L605
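
For reference, here is a minimal sketch of the two conventions (a hypothetical
helper in Java, not Hivemall's actual API):

// Hivemall's curLosses counter accumulates per-example log losses;
// libffm divides the same kind of sum by the number of examples.
public final class LogLossSketch {
    /** Log loss of one example; label in {0,1}, p = predicted P(y=1). */
    static double logLoss(final int label, double p) {
        final double eps = 1e-15; // clamp to avoid log(0)
        p = Math.min(Math.max(p, eps), 1.d - eps);
        return (label == 1) ? -Math.log(p) : -Math.log(1.d - p);
    }

    public static void main(String[] args) {
        final int[] labels = {1, 0, 1};
        final double[] preds = {0.9, 0.2, 0.6};
        double cumulative = 0.d;
        for (int i = 0; i < labels.length; i++) {
            cumulative += logLoss(labels[i], preds[i]);
        }
        final double average = cumulative / labels.length; // libffm-style
        System.out.println("cumulative=" + cumulative + ", average=" + average);
    }
}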

2017-10-18 21:52 GMT+09:00 Shadi Mari <shadimari@gmail.com>:
> As you said earlier in another thread, LibFFM has a built-in feature to do
> L2 instance-wise normalization, and you are probably right: most of the
> implementations I have encountered have normalization as a built-in feature.
>
> E.g.
> https://github.com/RTBHOUSE/cuda-ffm/blob/fcda42dfd6914ff881fc503e6bbc4c97d983de5f/src/ffm_trainer.cu
>
> BTW, I was able to get a logloss of 0.37xxx when testing using LibFFM.
>
> Shadi
>
> On Wed, Oct 18, 2017 at 3:37 PM, Makoto Yui <myui@apache.org> wrote:
>>
>> I guess instance-wise l2 normalization is mandatory for FFM.
>> https://github.com/guestwalk/libffm/blob/master/ffm.cpp#L688
>>
>> https://github.com/CNevd/libffm-ftrl/blob/4247440cc190346daa0b675135e0542e4933cb0f/ffm.cpp#L310
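>>
>> For reference, a minimal sketch in Java of what instance-wise L2
>> normalization means here (an assumed standalone helper, not the libffm or
>> Hivemall code):
>>
>> // Scale one instance's feature values so the vector has unit L2 norm.
>> // Assumption: this mirrors what the linked libffm code does per instance.
>> static void l2NormalizeInPlace(final double[] values) {
>>     double sumSq = 0.d;
>>     for (double v : values) {
>>         sumSq += v * v;
>>     }
>>     if (sumSq == 0.d) {
>>         return; // all-zero instance; nothing to scale
>>     }
>>     final double invNorm = 1.d / Math.sqrt(sumSq);
>>     for (int i = 0; i < values.length; i++) {
>>         values[i] *= invNorm;
>>     }
>> }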
>>
>> Makoto
>>
>> 2017-10-18 21:27 GMT+09:00 Makoto Yui <myui@apache.org>:
>> > At the first update, the loss is large, but the average loss for each
>> > update is very small in your test.
>> >
>> > https://github.com/apache/incubator-hivemall/blob/master/core/src/test/java/hivemall/fm/FieldAwareFactorizationMachineUDTFTest.java#L85
>> >
>> > It might be better to implement instance-wise L2 normalization to reduce
>> > the initial losses.
>> >
>> > Further investigation is required, but I need to focus on the first
>> > Apache release this month.
>> >
>> > GA of FFM will be in the v0.5.1 release, scheduled for December.
>> >
>> > Makoto
>> >
>> > 2017-10-18 1:36 GMT+09:00 Makoto Yui <myui@apache.org>:
>> >> Thanks. I'll test FFM with it tomorrow.
>> >>
>> >> Makoto
>> >>
>> >> 2017-10-18 1:19 GMT+09:00 Shadi Mari <shadimari@gmail.com>:
>> >>> Attached is a sample of 500 examples from my training set, represented
>> >>> as vectors of features.
>> >>>
>> >>> Regards,
>> >>>
>> >>>
>> >>> On Tue, Oct 17, 2017 at 7:08 PM, Makoto Yui <myui@apache.org> wrote:
>> >>>>
>> >>>> I need to reproduce your test.
>> >>>>
>> >>>> Could you give me a sample (100~500 examples are enough) of your
>> >>>> training input as a gzipped tsv/csv?
>> >>>>
>> >>>> The FFM input format is <field>:<index>:<value>.
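>> >>>>
>> >>>> For example (field ids, indices, and values assumed for illustration),
>> >>>> an example with a feature of index 1234 in field 2 and a feature of
>> >>>> index 6789 in field 5 would be encoded as:
>> >>>>
>> >>>>   2:1234:1.0 5:6789:0.5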
>> >>>>
>> >>>> Thanks,
>> >>>> Makoto
>> >>>>
>> >>>> 2017-10-18 0:59 GMT+09:00 Shadi Mari <shadimari@gmail.com>:
>> >>>> > Makoto,
>> >>>> >
>> >>>> > I am using the default hyper-parameters in addition to the
>> >>>> > following
>> >>>> > settings:
>> >>>> >
>> >>>> > feature_hashing = 20
>> >>>> > classification = enabled
>> >>>> > iterations = 10
>> >>>> > K = 2 (another test used K = 4)
>> >>>> > optimizer = FTRL (default)
>> >>>> >
>> >>>> > I tried setting the initial learning rate to 0.2 and the optimizer to
>> >>>> > AdaGrad, with no significant change in the empirical loss.
>> >>>> >
>> >>>> > Thanks
>> >>>> >
>> >>>> > On Tue, Oct 17, 2017 at 6:51 PM, Makoto Yui <myui@apache.org>
>> >>>> > wrote:
>> >>>> >>
>> >>>> >> The empirical loss (cumulative logloss) is too large.
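>> >>>> >>
>> >>>> >> As a rough check (arithmetic added for illustration): a curLosses of
>> >>>> >> 1.5967339372694769E10 over 4479491 examples is an average log loss of
>> >>>> >> roughly 3565 per example, orders of magnitude above even random
>> >>>> >> guessing (~0.69 on a balanced binary problem). The reported changeRate
>> >>>> >> is consistent with (prevLosses - curLosses) / prevLosses ~= 0.6182.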
>> >>>> >>
>> >>>> >> The simple test in FieldAwareFactorizationMachineUDTFTest shows that
>> >>>> >> empirical loss is decreasing properly, but it seems optimization is
>> >>>> >> not working correctly in your case.
>> >>>> >>
>> >>>> >> Could you show me the training hyperparameters?
>> >>>> >>
>> >>>> >> Makoto
>> >>>> >>
>> >>>> >> 2017-10-17 19:01 GMT+09:00 Shadi Mari <shadimari@gmail.com>:
>> >>>> >> > Hello,
>> >>>> >> >
>> >>>> >> > I am trying to understand the results produced by FFM on each
>> >>>> >> > iteration during the training of the Criteo 2014 dataset.
>> >>>> >> >
>> >>>> >> > Basically, I have 10 mappers running concurrently (each with
>> >>>> >> > ~4.5M records), and the following is the output from one of the
>> >>>> >> > mappers:
>> >>>> >> >
>> >>>> >> > -----------------------------
>> >>>> >> >
>> >>>> >> > fm.FactorizationMachineUDTF|: Wrote 4479491 records to a temporary
>> >>>> >> > file for iterative training: hivemall_fm392724107368114556.sgmt
>> >>>> >> > (2.02 GiB)
>> >>>> >> > Iteration #2 [curLosses=1.5967339372694769E10,
>> >>>> >> > prevLosses=4.182558816480771E10, changeRate=0.6182399322209704,
>> >>>> >> > #trainingExamples=4479491]
>> >>>> >> >
>> >>>> >> > -----------------------------
>> >>>> >> >
>> >>>> >> > Looking at the source code, the FFM implementation uses the LogLoss
>> >>>> >> > performance metric when classification is specified; however, the
>> >>>> >> > curLosses counter is very high: 1.5967339372694769E10.
>> >>>> >> >
>> >>>> >> >
>> >>>> >> > What does this mean?
>> >>>> >> >
>> >>>> >> > Regards
>> >>>> >> >
>> >>>> >> >
>> >>>> >>
>> >>>> >>
>> >>>> >>
>> >>>> >> --
>> >>>> >> Makoto YUI <myui AT apache.org>
>> >>>> >> Research Engineer, Treasure Data, Inc.
>> >>>> >> http://myui.github.io/
>> >>>> >
>> >>>> >
>> >>>>
>> >>>>
>> >>>>
>> >>>> --
>> >>>> Makoto YUI <myui AT apache.org>
>> >>>> Research Engineer, Treasure Data, Inc.
>> >>>> http://myui.github.io/
>> >>>
>> >>>
>> >>
>> >>
>> >>
>> >> --
>> >> Makoto YUI <myui AT apache.org>
>> >> Research Engineer, Treasure Data, Inc.
>> >> http://myui.github.io/
>> >
>> >
>> >
>> > --
>> > Makoto YUI <myui AT apache.org>
>> > Research Engineer, Treasure Data, Inc.
>> > http://myui.github.io/
>>
>>
>>
>> --
>> Makoto YUI <myui AT apache.org>
>> Research Engineer, Treasure Data, Inc.
>> http://myui.github.io/
>
>



-- 
Makoto YUI <myui AT apache.org>
Research Engineer, Treasure Data, Inc.
http://myui.github.io/
