mahout-user mailing list archives

From Konstantin Shmakov <>
Subject Re: Introducing randomness into my results
Date Sun, 03 Jul 2011 20:02:51 GMT
Insightful and interesting. But it seems that a quantitative measure of
the gain/loss from the different methods would help.

The question is: how do you measure the gain?

One example: suppose recommendations are ignored by 99% of the users and
there is some measurable action (now or later) from 1% of the users. Suppose
these actions from 1% of the users have some utility value for the
recommender - purchases, clicks, likes etc. Randomization could result in
a missed opportunity to engage some of that 1% of users, but it can
potentially engage some percentage of the 99% of users who previously were
not interested. Since we cannot lose more than 1% but can gain a few % from
the 99%, it is intuitive to think the strategy will have a net gain.
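
To make that arithmetic concrete, here is a minimal Python sketch. The
function names and the example fractions are hypothetical, and it assumes
every action has the same utility, which a real site would have to check.

```python
def net_engagement_change(engaged, lost_share, gained_share):
    """Change in the engaged fraction of all users after randomization.

    engaged      -- fraction of users who act on recommendations today (e.g. 0.01)
    lost_share   -- share of those engaged users lost to randomization (0..1)
    gained_share -- share of the previously uninterested users who now engage (0..1)
    Assumption: all actions carry equal utility.
    """
    return (1.0 - engaged) * gained_share - engaged * lost_share


def break_even_gained_share(engaged, lost_share):
    """Smallest gained_share for which randomization does not lose engagement."""
    return engaged * lost_share / (1.0 - engaged)


# Worst case: every one of the 1% is lost, 2% of the other 99% is gained.
change = net_engagement_change(engaged=0.01, lost_share=1.0, gained_share=0.02)
threshold = break_even_gained_share(engaged=0.01, lost_share=1.0)
print(change > 0, threshold)
```

Under these assumptions, even losing all of the 1% is offset once slightly
more than 1% of the remaining 99% starts to engage.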

It seems that as long as recommenders are dealing with the "economy of spam"
(most users are not interested), any additional engagement, e.g. through
randomization or richer recommendations, would help. Is that right?

On Sun, Jul 3, 2011 at 11:07 AM, Ted Dunning <> wrote:

> Whether recommendations are the highest-volume source of new information or
> not depends on the site.
> Clearly you need other mechanisms like search and most popular and most
> popular by genre and recently posted, but it is not unusual for these to be
> completely insufficient.  The alternatives can seed the similarity search,
> but if recommendations are a dominant navigational means on the site then
> these seeds will never come up high enough to be confirmed.
> Another mechanism for this which is useful even in search is to give recent
> items a boost.
> Anti-flood doesn't actually help exploration that much except at a genre
> level (or whatever means you pick).  You still have the problem of good
> items being shadowed.
> It is not unusual for dithering to cause a dramatic and almost immediate
> boost in recommendation click-throughs.  This probably has several
> components including:
> a) better and wider recommendations (but this happens on a longer time
> scale)
> b) the first page is no longer static so the user views recommendations as
> a source of new information so they come back
> c) given that the users are coming back to recommendations because of (b),
> we are turning these return visits into (effectively) views of the second
> and third pages of results.  Users don't tend to go to the next page even
> when they see that the item at the bottom of the first page is still pretty
> good.  Dithering brings that second page to them on the first page.
> I commonly do the dithering based on a seed that is rounded down to the
> nearest hour or so.  This gives a stable view on most refreshes, but then
> shows new results pretty soon.  Users build fanciful models of what is
> really happening, but it seems to engage them to have a predictable time
> that "new" results will appear.
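
The hour-rounded seed described above can be sketched in Python as follows.
This is only an illustration: the thread does not say which noise model is
used, so the Gaussian noise added to the log of the rank, the `sigma` value,
and all names here are assumptions.

```python
import math
import random


def dither(ranked_items, user_id, epoch_seconds, sigma=0.5, bucket_seconds=3600):
    """Reorder a ranked list with noise that is stable within a time bucket.

    ranked_items   -- items in their original order, best first
    user_id        -- any value with a stable string form
    epoch_seconds  -- current time in seconds since the epoch
    sigma          -- noise scale (assumed); 0 leaves the order unchanged
    bucket_seconds -- seed lifetime; 3600 rounds the seed down to the hour
    """
    # Round the time down to the start of its bucket so refreshes within
    # the same hour see the same ordering.
    bucket = int(epoch_seconds) // bucket_seconds
    rng = random.Random(f"{user_id}:{bucket}")

    # Assumed noise model: perturb log(rank), so items near the top move
    # less than items far down the list.
    scored = [
        (math.log(rank) + rng.gauss(0.0, sigma), item)
        for rank, item in enumerate(ranked_items, start=1)
    ]
    scored.sort(key=lambda pair: pair[0])
    return [item for _, item in scored]


items = [f"item{i}" for i in range(1, 21)]
first_page = dither(items, user_id="u42", epoch_seconds=1309723371)[:10]
print(first_page)
```

Calling `dither` twice within the same hour for the same user returns the
same order; once the hour rolls over, the seed changes and lower-ranked items
get a chance to appear on the first page.
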
> On Sun, Jul 3, 2011 at 2:43 AM, Sean Owen <> wrote:
> > On Sun, Jul 3, 2011 at 8:05 AM, Ted Dunning <> wrote:
> > > For instance, if the recommendation engine recommends B if you have seen A
> > > and there is little other way to discover C which is ranked rather low (and
> > > thus never seen), then there is no way for the engine to even get training
> > > data about C.  The fact is, however, that exploring the space around good
> > > recommendations is a good thing to do.  This is referred to as the explore /
> > > exploit trade-off in the multi-armed bandit literature.
> >
> > Agree, that's a good reason to mix it up. Recommendations are a
> > secondary source of possible new user-item interactions (i.e. that is
> > not the only way to discover C), but are far more productive at
> > driving serendipity than just waiting for it to happen. See below... I
> > guess I think of randomization as the crudest way to get this effect.
> > Surely your "anti-flooding" is more directed and effective?

