mahout-user mailing list archives

From Otis Gospodnetic <otis_gospodne...@yahoo.com>
Subject Re: Inconsistent recommendations
Date Tue, 30 Jun 2009 17:27:51 GMT

Hello,


This jittering and your sample output look good and intuitively make sense, though it looks
like I'm not interpreting your recipe correctly.

Here is the code snippet that pretends to loop through the top 20 recommendations:

    // exp(-n/5) + rexp() * 0.1
    for (int i = 1; i < 20; i++) {
      // not sure why you used -i /n
      float exp = (float) i / 5;
      // tried with 1 instead of i like you said, too
      float rexp = (float) Math.log(i - Math.random());
      float rank = exp + rexp * 0.1f;
      float round = Math.round(rank);
      System.out.println("EXP: " + exp + "\tREXP: " + rexp + "\tRANK: " + rank + "\tROUND: " + round);
    }

But the output doesn't quite look like yours, so I must be misinterpreting something.

EXP: 0.2   REXP: -0.59645164   RANK: 0.14035484   ROUND: 0.0
EXP: 0.4   REXP: 0.48764116    RANK: 0.44876412   ROUND: 0.0
EXP: 0.6   REXP: 0.89331275    RANK: 0.6893313    ROUND: 1.0
EXP: 0.8   REXP: 1.2796263     RANK: 0.92796266   ROUND: 1.0
EXP: 1.0   REXP: 1.5976489     RANK: 1.1597649    ROUND: 1.0
EXP: 1.2   REXP: 1.6399297     RANK: 1.363993     ROUND: 1.0
EXP: 1.4   REXP: 1.8479612     RANK: 1.5847961    ROUND: 2.0
EXP: 1.6   REXP: 1.9524398     RANK: 1.795244     ROUND: 2.0
EXP: 1.8   REXP: 2.0999322     RANK: 2.009993     ROUND: 2.0
EXP: 2.0   REXP: 2.218352      RANK: 2.2218351    ROUND: 2.0
EXP: 2.2   REXP: 2.3666646     RANK: 2.4366665    ROUND: 2.0
EXP: 2.4   REXP: 2.4445183     RANK: 2.6444519    ROUND: 3.0
EXP: 2.6   REXP: 2.5367393     RANK: 2.853674     ROUND: 3.0
EXP: 2.8   REXP: 2.633957      RANK: 3.0633957    ROUND: 3.0
EXP: 3.0   REXP: 2.7041435     RANK: 3.2704144    ROUND: 3.0
EXP: 3.2   REXP: 2.751766      RANK: 3.4751766    ROUND: 3.0
EXP: 3.4   REXP: 2.8236845     RANK: 3.6823685    ROUND: 4.0
EXP: 3.6   REXP: 2.854854      RANK: 3.8854854    ROUND: 4.0
EXP: 3.8   REXP: 2.9142253     RANK: 4.0914226    ROUND: 4.0
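
For comparison, here is one possible reading of the quoted recipe. It is a
minimal sketch, not Ted's actual implementation. It differs from the snippet
above in three ways: the rank term is exp(-n/5) rather than n/5, the random
term is Math.log(1 - Math.random()) for every item rather than
Math.log(i - Math.random()), and the synthetic scores are only used to sort
the items (highest first) instead of being rounded. The class name
JitterSketch, the method jitter and the parameter names decay and noise are
made up for illustration.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Random;

class JitterSketch {

    // Hypothetical helper: returns the 1-based ranks 1..total, reordered by
    // descending synthetic score exp(-n/decay) + rexp * noise.
    static List<Integer> jitter(int total, double decay, double noise, Random rng) {
        List<Integer> ranks = new ArrayList<>();
        double[] score = new double[total + 1];
        for (int n = 1; n <= total; n++) {
            // rexp exactly as written in the quoted recipe: log(1 - U), U in [0, 1)
            double rexp = Math.log(1.0 - rng.nextDouble());
            score[n] = Math.exp(-n / decay) + rexp * noise;
            ranks.add(n);
        }
        // Highest synthetic score first
        ranks.sort(Comparator.comparingDouble((Integer n) -> -score[n]));
        return ranks;
    }

    public static void main(String[] args) {
        // 200 items, constants 5 and 0.1 as in the quoted message
        List<Integer> permuted = jitter(200, 5.0, 0.1, new Random());
        System.out.println(permuted.subList(0, 20));
    }
}
```

Each run prints one row comparable to the rows in the quoted message. Note
that Math.log(1 - Math.random()) is never positive, so taken literally the
jitter only pushes scores down; since the scores are used only for ordering,
that does not matter for the sort itself.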

Thanks,
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch



----- Original Message ----
> From: Ted Dunning <ted.dunning@gmail.com>
> To: mahout-user@lucene.apache.org
> Sent: Wednesday, June 3, 2009 5:23:15 PM
> Subject: Re: Inconsistent recommendations
> 
> My experience is that users like to see recommendations that change.
> 
> In fact, this preference is strong enough that I now typically add jitter to
> the recommendations that I return.  Typically I do this by computing a
> synthetic score used only to permute results lists:
> 
>      exp(-n/5) + rexp() * 0.1
> 
> Here rexp is an exponentially distributed random deviate that can be
> generated using Math.log(1-Math.random()).  The value n is the rank of the
> item (offset 0 or 1, doesn't matter).  The magic constants (/ 5 and * 0.1)
> must be tuned to fit the number of results you show and how you want to
> trade off stability versus novelty.  Usually I implement this as a
> meta-recommendation engine that uses the output of another engine as input
> and which returns permuted results.
> 
> The idea here is that the first few items will generally appear in the
> "correct" order.  Items beyond the top 10 will be dramatically shuffled.
> Here is a sequence of 20 draws for the top 20 items out of 200 (after
> permutation by synthetic score):
> 
> 2   1   4   3   5   8   7   6   14  13  10  15  37  30  146 40  94  76  172 125
> 1   2   4   6   3   8   9   5   11  13  15  7   169 190 174 44  90  171 95  74
> 1   2   3   4   5   6   9   7   11  8   15  14  16  17  53  81  34  33  37  30
> 1   2   3   4   5   6   8   10  7   9   15  12  19  88  121 30  55  43  200 168
> 1   2   4   3   5   7   8   10  6   13  11  19  17  133 139 124 194 123 79  186
> 1   2   3   4   5   6   9   11  8   12  10  13  14  19  151 53  102 48  117 169
> 1   3   2   5   7   4   6   10  12  15  24  59  83  61  156 148 109 28  188 126
> 1   2   3   4   6   7   5   8   10  12  14  11  192 57  54  182 38  158 128 123
> 1   3   4   6   5   7   8   11  9   10  13  15  2   12  24  28  43  179 180 100
> 1   2   3   4   6   5   7   8   9   10  11  25  15  12  83  124 59  45  169 199
> 1   2   3   4   5   7   8   11  6   14  26  85  69  163 40  58  12  182 144 109
> 1   2   4   5   3   6   7   8   10  12  9   20  22  109 43  108 27  62  157 84
> 1   2   4   3   5   7   8   10  6   9   12  11  16  13  17  15  140 39  122 190
> 1   3   2   5   4   7   8   10  16  14  11  15  41  38  42  100 171 68  113 178
> 1   2   3   4   5   6   7   8   10  12  11  194 89  43  80  129 126 181 94  140
> 1   3   2   5   4   7   6   8   11  10  13  12  9   19  20  53  99  30  183 115
> 1   2   3   5   6   4   7   8   10  11  13  16  21  23  153 82  52  163 31  186
> 1   2   3   4   5   6   9   10  13  18  16  11  19  7   27  23  29  41  72  64
> 1   2   3   4   5   6   8   7   11  9   10  18  13  17  33  194 196 35  128 75
> 1   2   3   5   4   7   6   8   9   10  12  11  13  15  16  14  188 82  147 163
> 
> Thus, for the first row, we would present the second item from the
> recommendations first, followed by items 1, 4, 3 and 5.
> 
> In blind testing, I found users typically prefer jittered results
> significantly over unjittered results.  One interpretation for this is that
> this is simply a way to get them to look beyond the first page of results.
> Another is that they are more willing to look at lists that change.
> 
> That said, I have also found it helpful to make the results be static for a
> small period of time.  To ensure that, I typically seed a random number
> generator on each request with the user id and the current time in seconds
> rounded down to the time period of stability.  Recommendations can often be
> made fast enough that caching is of little interest, but if caching is used,
> the expiration times should ideally be synchronized with the reseeding to
> give the desired mix of stability and novelty.
> 
> On Wed, Jun 3, 2009 at 12:45 PM, Sean Owen wrote:
> 
> > 3) I suppose I think of computing recommendation as a
> > relatively-speaking infrequent event. You might compute them once a
> > day or hour. Or you compute on the fly and cache it, either externally
> > or in the framework. So, it shouldn't be the case that the same
> > recommendations are computed over and over in a row, where the
> > differences might become noticeable, in an application, to a user
> >
> >
> > Is it possible to guarantee the same recommendation, even when using
> > sampling, if the data doesn't change? It wouldn't be too hard to always
> > use a local RNG and always seed it the same way, no. It would be a
> > performance hit.
> >
> > My first reaction though is #3 -- cache. Is that a feasible response?
> >
> 
> 
> 
> -- 
> Ted Dunning, CTO
> DeepDyve
> 
> 111 West Evelyn Ave. Ste. 202
> Sunnyvale, CA 94086
> http://www.deepdyve.com
> 858-414-0013 (m)
> 408-773-0220 (fax)
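
The time-bucketed seeding described in the quoted message could be sketched
roughly as follows. This is an illustration under assumptions, not Ted's code:
the class StableSeedSketch, the helper seedFor, the way the user id and the
time bucket are mixed, and the example values (user 42, a 300 second window)
are all invented here. Any deterministic combination of the two inputs would
serve the same purpose.

```java
import java.util.Random;

class StableSeedSketch {

    // Hypothetical helper: yields the same seed for a given user for as long
    // as the clock stays inside one stability window.
    static long seedFor(long userId, long epochSeconds, long windowSeconds) {
        // Current time rounded down to the stability window
        long bucket = Math.floorDiv(epochSeconds, windowSeconds);
        // Arbitrary deterministic mix of user id and time bucket
        return (userId * 0x9E3779B97F4A7C15L) ^ bucket;
    }

    public static void main(String[] args) {
        long now = System.currentTimeMillis() / 1000L;
        // Same user within the same 300 s window: same random sequence
        Random rng = new Random(seedFor(42L, now, 300L));
        System.out.println(rng.nextDouble());
    }
}
```

A Random seeded this way can drive the jitter, so a user sees a fixed ordering
within a window and a fresh one afterwards. If results are cached, the cache
expiry would ideally line up with the same window, as the quoted message
suggests.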

