mahout-user mailing list archives

From Sean Owen <sro...@gmail.com>
Subject Re: Mahout performance issues
Date Thu, 01 Dec 2011 22:46:22 GMT
These users shouldn't cause problems though. They don't add anything to a set
of candidates. Taking them away means you can't recommend anything to them.
I doubt this is quite the issue.

(That item with 400K interactions might be just fine to remove!)

You are certainly bottlenecked on item-item similarity, judging from your graph --
intersectionSize() is the heart of the loglikelihood computation.
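
To illustrate why the intersection count dominates the cost, here is a minimal sketch of a log-likelihood similarity between two items, each represented by the set of user IDs that interacted with it. It is not Mahout's source: the class name LlrSketch, the method names, and the final mapping of the ratio to a score in [0, 1) are assumptions made for illustration. Everything apart from the intersection count is constant-time arithmetic on four counts.

```java
import java.util.Set;

// Hypothetical sketch, not Mahout code: log-likelihood similarity of two items
// computed from the sets of users who interacted with each item.
final class LlrSketch {

    // x * ln(x), with 0 * ln(0) taken as 0.
    static double xLogX(long x) {
        return x == 0 ? 0.0 : x * Math.log(x);
    }

    // Unnormalized Shannon entropy of a list of counts.
    static double entropy(long... counts) {
        long sum = 0;
        double parts = 0.0;
        for (long c : counts) {
            parts += xLogX(c);
            sum += c;
        }
        return xLogX(sum) - parts;
    }

    // Log-likelihood ratio for a 2x2 contingency table.
    static double logLikelihoodRatio(long k11, long k12, long k21, long k22) {
        double rowEntropy = entropy(k11 + k12, k21 + k22);
        double colEntropy = entropy(k11 + k21, k12 + k22);
        double matEntropy = entropy(k11, k12, k21, k22);
        // Clamp tiny negative values caused by floating-point rounding.
        return Math.max(0.0, 2.0 * (rowEntropy + colEntropy - matEntropy));
    }

    // The expensive step: iterate the smaller set and probe the larger one.
    static int intersectionSize(Set<Long> a, Set<Long> b) {
        Set<Long> small = a.size() <= b.size() ? a : b;
        Set<Long> large = small == a ? b : a;
        int n = 0;
        for (Long id : small) {
            if (large.contains(id)) {
                n++;
            }
        }
        return n;
    }

    // Assumes numUsers is at least the size of the union of both sets.
    static double similarity(Set<Long> usersA, Set<Long> usersB, long numUsers) {
        long both = intersectionSize(usersA, usersB);
        long onlyA = usersA.size() - both;
        long onlyB = usersB.size() - both;
        long neither = numUsers - usersA.size() - usersB.size() + both;
        double llr = logLikelihoodRatio(both, onlyA, onlyB, neither);
        // Assumed mapping of the ratio into [0, 1).
        return 1.0 - 1.0 / (1.0 + llr);
    }
}
```

The intersection costs time proportional to the smaller set, so a single item with hundreds of thousands of interactions makes every comparison against other popular items expensive.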

I still do not understand why your proposed change does not solve the
problem! You can turn down the candidate set size as low as you want. At a
"reasonable" size quality will still be OK. I'm missing something here.
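
A minimal sketch of what capping the candidate set can look like, assuming candidates arrive as a set of item IDs and that uniform random sampling is acceptable. The class CandidateCapSketch and its method are hypothetical names for illustration, not Mahout's own candidate strategy API.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;
import java.util.Set;

// Hypothetical sketch, not Mahout code: bound the number of candidate items
// that are scored, so the number of similarity computations stays bounded.
final class CandidateCapSketch {

    static List<Long> capCandidates(Set<Long> candidates, int maxCandidates, Random rng) {
        if (maxCandidates < 0) {
            throw new IllegalArgumentException("maxCandidates must be non-negative");
        }
        List<Long> pool = new ArrayList<>(candidates);
        if (pool.size() <= maxCandidates) {
            return pool; // already small enough, keep everything
        }
        // Uniform sample without replacement: shuffle, then keep a prefix.
        Collections.shuffle(pool, rng);
        return new ArrayList<>(pool.subList(0, maxCandidates));
    }
}
```

With a cap of N candidates, each recommendation request scores at most N items, whatever the size of the full catalogue.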

On Thu, Dec 1, 2011 at 10:35 PM, Daniel Zohar <dissoman@gmail.com> wrote:

> Sebastian, as I wrote before, it's the other way around. ~8.5M users had
> only chosen a single item. The item with the most interactions is about
> 400k.
> This is why I'm looking now into improving GenericBooleanPrefDataModel to
> not take into account users who made only one interaction under the
> 'preferenceForItems' Map. What do you think about this approach?
>
>
