hivemall-issues mailing list archives

From takuti <...@git.apache.org>
Subject [GitHub] incubator-hivemall issue #149: [WIP][HIVEMALL-201] Evaluate, fix and documen...
Date Tue, 22 May 2018 08:45:04 GMT
GitHub user takuti commented on the issue:

    https://github.com/apache/incubator-hivemall/pull/149
  
    ### With linear terms
    
    #### Hivemall
    
    ```sql
    INSERT OVERWRITE TABLE criteo.ffm_model
    SELECT
      train_ffm(features, label, '-init_v random -max_init_value 0.5 -classification -iterations 15 -factors 4 -eta 0.2 -l2norm -optimizer adagrad -lambda 0.00002 -cv_rate 0.0')
    FROM (
      SELECT
        features, label
      FROM
        criteo.train_vectorized
      CLUSTER BY rand(1)
    ) t
    ;
    ```
    
    ```
    Iteration #2 | average loss=0.474651712453725, current cumulative loss=753.2722676640616, previous cumulative loss=990.2550021169766, change rate=0.23931485722999737, #trainingExamples=1587
    Iteration #3 | average loss=0.4499051385165006, current cumulative loss=713.9994548256865, previous cumulative loss=753.2722676640616, change rate=0.05213627863954456, #trainingExamples=1587
    Iteration #4 | average loss=0.4342257595710771, current cumulative loss=689.1162804392994, previous cumulative loss=713.9994548256865, change rate=0.03485041090467212, #trainingExamples=1587
    Iteration #5 | average loss=0.4225120903723549, current cumulative loss=670.5266874209271, previous cumulative loss=689.1162804392994, change rate=0.026975988735198287, #trainingExamples=1587
    Iteration #6 | average loss=0.41300825971798527, current cumulative loss=655.4441081724426, previous cumulative loss=670.5266874209271, change rate=0.022493630054453533, #trainingExamples=1587
    Iteration #7 | average loss=0.40491514701335013, current cumulative loss=642.6003383101867, previous cumulative loss=655.4441081724426, change rate=0.019595522641995967, #trainingExamples=1587
    Iteration #8 | average loss=0.3978014571916465, current cumulative loss=631.310912563143, previous cumulative loss=642.6003383101867, change rate=0.017568347033135524, #trainingExamples=1587
    Iteration #9 | average loss=0.3914067263636397, current cumulative loss=621.1624747390962, previous cumulative loss=631.310912563143, change rate=0.016075182009517044, #trainingExamples=1587
    Iteration #10 | average loss=0.3855609819906249, current cumulative loss=611.8852784191217, previous cumulative loss=621.1624747390962, change rate=0.014935216947661086, #trainingExamples=1587
    Iteration #11 | average loss=0.3801467153362753, current cumulative loss=603.2928372386689, previous cumulative loss=611.8852784191217, change rate=0.01404256889894858, #trainingExamples=1587
    Iteration #12 | average loss=0.3750791243746283, current cumulative loss=595.2505703825351, previous cumulative loss=603.2928372386689, change rate=0.01333061883005943, #trainingExamples=1587
    Iteration #13 | average loss=0.37029474458756273, current cumulative loss=587.657759660462, previous cumulative loss=595.2505703825351, change rate=0.012755654676976761, #trainingExamples=1587
    Iteration #14 | average loss=0.36574472099268607, current cumulative loss=580.4368722153928, previous cumulative loss=587.657759660462, change rate=0.012287572700888608, #trainingExamples=1587
    Iteration #15 | average loss=0.3613904840032808, current cumulative loss=573.5266981132066, previous cumulative loss=580.4368722153928, change rate=0.011905126005885216, #trainingExamples=1587
    Performed 15 iterations of 1,587 training examples on memory (thus 23,805 training updates in total)
    ```
    > LogLoss: 0.4771035166468042
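
The fields in each Hivemall log line appear to be related by simple arithmetic: the average loss matches the cumulative loss divided by `#trainingExamples`, and the change rate matches the relative drop from the previous cumulative loss. This is inferred from the logged numbers only, not from the Hivemall source, and the variable names below are illustrative. A minimal Python check against the `Iteration #2` line:

```python
# Values copied from the "Iteration #2" line of the Hivemall log
# (with linear terms). Variable names are illustrative only.
current_cumulative_loss = 753.2722676640616
previous_cumulative_loss = 990.2550021169766
n_training_examples = 1587

# Assumed relationship: average loss = cumulative loss / number of examples.
average_loss = current_cumulative_loss / n_training_examples

# Assumed relationship: change rate = relative decrease of the cumulative loss.
change_rate = (
    previous_cumulative_loss - current_cumulative_loss
) / previous_cumulative_loss

print(f"average loss = {average_loss:.6f}")
print(f"change rate  = {change_rate:.6f}")
```

Both computed values agree with the logged `average loss` and `change rate` to at least six decimal places.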
    
    #### LIBFFM
    
    ```
    $ ./ffm-train -k 4 -t 15 -l 0.00002 -r 0.2 -s 1 ../tr.sp model
    First check if the text file has already been converted to binary format (0.0 seconds)
    Binary file NOT found. Convert text file to binary file (0.0 seconds)
    iter   tr_logloss      tr_time
       1      0.62043          0.0
       2      0.47533          0.1
       3      0.44968          0.1
       4      0.43548          0.2
       5      0.42261          0.2
       6      0.41322          0.3
       7      0.40489          0.3
       8      0.39687          0.4
       9      0.39085          0.4
      10      0.38530          0.4
      11      0.37965          0.5
      12      0.37450          0.5
      13      0.36937          0.6
      14      0.36444          0.6
      15      0.36031          0.7
    $ ./ffm-predict ../va.sp model submission.csv
    logloss = 0.47818
    ```
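
Both tools report a LogLoss on the evaluation data. For reference, a minimal sketch of the standard binary log loss, assuming labels in {0, 1} and predicted probabilities of the positive class. The function name and the `eps` clipping are assumptions for illustration, not the Hivemall or LIBFFM implementation:

```python
import math


def log_loss(labels, probs, eps=1e-15):
    """Mean binary log loss for labels in {0, 1} and predicted probabilities."""
    total = 0.0
    for y, p in zip(labels, probs):
        # Clip to avoid log(0) for predictions of exactly 0 or 1.
        p = min(max(p, eps), 1.0 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(labels)


# A classifier that always predicts 0.5 scores ln(2) regardless of the labels.
print(log_loss([1, 0, 1, 0], [0.5, 0.5, 0.5, 0.5]))
```

Lower is better, so the figures reported here (roughly 0.47 to 0.48) are well below the ln(2) of an uninformed 0.5 prediction.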
    
    ### Without linear terms (i.e., adding `-disable_wi` option)
    
    #### Hivemall
    
    ```
    Iteration #2 | average loss=0.539961924393562, current cumulative loss=856.919574012583, previous cumulative loss=1651.6985545424677, change rate=0.48118888179934516, #trainingExamples=1587
    Iteration #3 | average loss=0.5106114115327627, current cumulative loss=810.3403101024943, previous cumulative loss=856.919574012583, change rate=0.05435663430113771, #trainingExamples=1587
    Iteration #4 | average loss=0.4906722901321148, current cumulative loss=778.6969244396662, previous cumulative loss=810.3403101024943, change rate=0.03904950212686045, #trainingExamples=1587
    Iteration #5 | average loss=0.4754916462118607, current cumulative loss=754.6052425382229, previous cumulative loss=778.6969244396662, change rate=0.030938457755922362, #trainingExamples=1587
    Iteration #6 | average loss=0.46330291728471334, current cumulative loss=735.2617297308401, previous cumulative loss=754.6052425382229, change rate=0.025633949669257704, #trainingExamples=1587
    Iteration #7 | average loss=0.453140805287918, current cumulative loss=719.1344579919258, previous cumulative loss=735.2617297308401, change rate=0.021934055706691043, #trainingExamples=1587
    Iteration #8 | average loss=0.44439540937886607, current cumulative loss=705.2555146842604, previous cumulative loss=719.1344579919258, change rate=0.019299510895946, #trainingExamples=1587
    Iteration #9 | average loss=0.4366611986545602, current cumulative loss=692.9813222647871, previous cumulative loss=705.2555146842604, change rate=0.017403894282157387, #trainingExamples=1587
    Iteration #11 | average loss=0.42321511843877446, current cumulative loss=671.6423929623351, previous cumulative loss=681.8770641514493, change rate=0.015009554840872389, #trainingExamples=1587
    Iteration #12 | average loss=0.4171781468097722, current cumulative loss=662.0617189871085, previous cumulative loss=671.6423929623351, change rate=0.01426454624606136, #trainingExamples=1587
    Iteration #13 | average loss=0.411451696404218, current cumulative loss=652.973842193494, previous cumulative loss=662.0617189871085, change rate=0.013726630815504848, #trainingExamples=1587
    Iteration #14 | average loss=0.40595767772793845, current cumulative loss=644.2548345542383, previous cumulative loss=652.973842193494, change rate=0.013352767103145282, #trainingExamples=1587
    Iteration #15 | average loss=0.4006353270154049, current cumulative loss=635.8082639734475, previous cumulative loss=644.2548345542383, change rate=0.013110604884532947, #trainingExamples=1587
    Performed 15 iterations of 1,587 training examples on memory (thus 23,805 training updates in total)
    ```
    > LogLoss: 0.4757278678816663
    
    #### LIBFFM
    
    ```
    $ ./ffm-train -k 4 -t 15 -l 0.00002 -r 0.2 -s 1 --disable-wi ../tr.sp model
    First check if the text file has already been converted to binary format (0.0 seconds)
    Binary file found. Skip converting text to binary
    iter   tr_logloss      tr_time
       1      1.03199          0.1
       2      0.53894          0.1
       3      0.51018          0.1
       4      0.49096          0.2
       5      0.47549          0.2
       6      0.46334          0.3
       7      0.45313          0.3
       8      0.44405          0.3
       9      0.43662          0.4
      10      0.42985          0.4
      11      0.42337          0.5
      12      0.41732          0.5
      13      0.41140          0.6
      14      0.40583          0.6
      15      0.40049          0.6
    $ ./ffm-predict ../va.sp model submission.csv
    logloss = 0.47284
    ```
    
    FFM without linear terms achieves a slightly lower evaluation LogLoss than FFM with linear terms in both Hivemall and LIBFFM.
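
To make the size of the gap concrete, the sketch below tabulates the four reported LogLoss values and the final-iteration training losses, all copied from the logs in this comment. The dictionary layout and names are illustrative only:

```python
# Evaluation LogLoss reported in this comment.
eval_logloss = {
    "Hivemall": {"with": 0.4771035166468042, "without": 0.4757278678816663},
    "LIBFFM": {"with": 0.47818, "without": 0.47284},
}

# Training loss at iteration 15, from the same logs.
train_loss = {
    "Hivemall": {"with": 0.3613904840032808, "without": 0.4006353270154049},
    "LIBFFM": {"with": 0.36031, "without": 0.40049},
}

# Positive values mean the run without linear terms scored lower (better).
improvement = {
    name: scores["with"] - scores["without"]
    for name, scores in eval_logloss.items()
}

for name, delta in improvement.items():
    train_gap = train_loss[name]["without"] - train_loss[name]["with"]
    print(
        f"{name}: eval LogLoss lower by {delta:.5f} without linear terms, "
        f"training loss higher by {train_gap:.5f}"
    )
```

In both implementations the run without linear terms ends with a higher training loss but a lower evaluation LogLoss, and the evaluation gap is small (under 0.006 in either tool).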

