mahout-user mailing list archives

From Ted Dunning <ted.dunn...@gmail.com>
Subject Re: Inconsistent recommendations
Date Wed, 03 Jun 2009 21:23:15 GMT
My experience is that users like to see recommendations that change.

In fact, this preference is strong enough that I now typically add jitter to
the recommendations that I return.  I usually do this by computing a
synthetic score that is used only to permute the result list:

     exp(-n/5) + rexp() * 0.1

Here rexp() is an exponentially distributed random deviate, which can be
generated as -Math.log(1 - Math.random()).  The value n is the rank of the
item (offset 0 or 1; it doesn't matter).  The magic constants (/ 5 and * 0.1)
must be tuned to fit the number of results you show and how you want to
trade off stability versus novelty.  Usually I implement this as a
meta-recommendation engine that uses the output of another engine as input
and which returns permuted results.
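
A minimal sketch of this scheme in Java (the class and method names here
are my own illustration, not part of Mahout's API):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical sketch of the jittered re-ranking described above.
public class JitterReranker {

  private final Random random;

  public JitterReranker(Random random) {
    this.random = random;
  }

  // Exponentially distributed deviate via inverse-CDF sampling.
  public double rexp() {
    return -Math.log(1.0 - random.nextDouble());
  }

  // Synthetic score for an item at 0-based rank n: exp(-n/5) + rexp() * 0.1.
  public double syntheticScore(int n) {
    return Math.exp(-n / 5.0) + rexp() * 0.1;
  }

  // Returns the ranks 0..size-1 permuted by descending synthetic score;
  // present the underlying recommendations in this new order.  Scores are
  // precomputed so each item is scored exactly once before sorting.
  public List<Integer> permutedRanks(int size) {
    double[] score = new double[size];
    for (int i = 0; i < size; i++) {
      score[i] = syntheticScore(i);
    }
    List<Integer> ranks = new ArrayList<>();
    for (int i = 0; i < size; i++) {
      ranks.add(i);
    }
    ranks.sort((a, b) -> Double.compare(score[b], score[a]));
    return ranks;
  }
}
```

Wrapping this around an existing recommender is then just a matter of
calling permutedRanks(results.size()) and reordering the result list.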

The idea here is that the first few items will generally appear in the
"correct" order.  Items beyond the top 10 will be dramatically shuffled.
Here is a sequence of 20 draws for the top 20 items out of 200 (after
permutation by synthetic score):

2   1   4   3   5   8   7   6   14  13  10  15  37  30  146 40  94  76  172 125
1   2   4   6   3   8   9   5   11  13  15  7   169 190 174 44  90  171 95  74
1   2   3   4   5   6   9   7   11  8   15  14  16  17  53  81  34  33  37  30
1   2   3   4   5   6   8   10  7   9   15  12  19  88  121 30  55  43  200 168
1   2   4   3   5   7   8   10  6   13  11  19  17  133 139 124 194 123 79  186
1   2   3   4   5   6   9   11  8   12  10  13  14  19  151 53  102 48  117 169
1   3   2   5   7   4   6   10  12  15  24  59  83  61  156 148 109 28  188 126
1   2   3   4   6   7   5   8   10  12  14  11  192 57  54  182 38  158 128 123
1   3   4   6   5   7   8   11  9   10  13  15  2   12  24  28  43  179 180 100
1   2   3   4   6   5   7   8   9   10  11  25  15  12  83  124 59  45  169 199
1   2   3   4   5   7   8   11  6   14  26  85  69  163 40  58  12  182 144 109
1   2   4   5   3   6   7   8   10  12  9   20  22  109 43  108 27  62  157 84
1   2   4   3   5   7   8   10  6   9   12  11  16  13  17  15  140 39  122 190
1   3   2   5   4   7   8   10  16  14  11  15  41  38  42  100 171 68  113 178
1   2   3   4   5   6   7   8   10  12  11  194 89  43  80  129 126 181 94  140
1   3   2   5   4   7   6   8   11  10  13  12  9   19  20  53  99  30  183 115
1   2   3   5   6   4   7   8   10  11  13  16  21  23  153 82  52  163 31  186
1   2   3   4   5   6   9   10  13  18  16  11  19  7   27  23  29  41  72  64
1   2   3   4   5   6   8   7   11  9   10  18  13  17  33  194 196 35  128 75
1   2   3   5   4   7   6   8   9   10  12  11  13  15  16  14  188 82  147 163

Thus, for the first row, we would present the second item from the
recommendations first, followed by items 1, 4, 3 and 5.

In blind testing, I found that users significantly prefer jittered results
over unjittered ones.  One interpretation is that jitter is simply a way to
get them to look beyond the first page of results.  Another is that they are
more willing to look at lists that change.

That said, I have also found it helpful to keep the results static for a
short period of time.  To ensure that, I typically seed a random number
generator on each request with the user id and the current time in seconds,
rounded down to the desired period of stability.  Recommendations can often
be made fast enough that caching is of little interest, but if caching is
used, the expiration times should ideally be synchronized with the reseeding
to give the desired mix of stability and novelty.
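
A sketch of that reseeding trick (the five-minute period and the
seed-mixing formula are illustrative choices, not prescribed values):

```java
import java.util.Random;

// Derive the jitter RNG's seed from the user id plus the current time
// rounded down to the stability period, so the permuted order stays
// fixed within each period and changes at the period boundary.
public class StableJitterSeed {

  // Assumed stability period of five minutes.
  public static final long PERIOD_SECONDS = 300;

  public static Random rngFor(long userId, long nowMillis) {
    long bucket = (nowMillis / 1000L) / PERIOD_SECONDS;
    // Mix user id and time bucket into a single seed; any stable hash works.
    long seed = userId * 31L + bucket;
    return new Random(seed);
  }
}
```

Two requests from the same user inside one period get an identically
seeded Random, and thus the same jittered ordering.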

On Wed, Jun 3, 2009 at 12:45 PM, Sean Owen <srowen@gmail.com> wrote:

> 3) I suppose I think of computing recommendation as a
> relatively-speaking infrequent event. You might compute them once a
> day or hour. Or you compute on the fly and cache it, either externally
> or in the framework. So, it shouldn't be the case that the same
> recommendations are computed over and over in a row, where the
> differences might become noticeable, in an application, to a user
>
>
> Is it possible to guarantee the same recommendation, even when using
> sampling, if the data doesn't change? wouldn't be too hard to always
> use a local RNG and always seed it the same way, no. It would be a
> performance hit.
>
> My first reaction though is #3 -- cache. Is that a feasible response?
>



-- 
Ted Dunning, CTO
DeepDyve

111 West Evelyn Ave. Ste. 202
Sunnyvale, CA 94086
http://www.deepdyve.com
858-414-0013 (m)
408-773-0220 (fax)
