mahout-user mailing list archives

From "Sean Owen" <>
Subject Re: Trimming Taste input (memory consumption)
Date Thu, 23 Oct 2008 09:37:38 GMT
On Wed, Oct 22, 2008 at 5:52 PM, Otis Gospodnetic wrote:
> So here are my questions:
> - Is there a point in keeping and loading very unpopular items (e.g.
> the ones read only once)?  I think keeping those might help very few
> people discover very obscure items, so removing them will hurt this
> small subset of people a bit, but this will not affect the majority of
> people.  Is this thinking correct?

I agree, it makes sense to trim data in this way. I tried to build in
"levers" of this sort in several places in the code. If you mention
which implementation you are using, I can recommend some parameters to
look at.
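
Independent of any particular implementation, this kind of trimming can
also be done on the input before it is loaded. The following is only a
minimal sketch under stated assumptions: the class PreferenceTrimmer, the
Pref type and the minCount threshold are hypothetical names used for
illustration and are not part of Taste.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not a Taste class: drops preferences for items
// that were consumed fewer than minCount times.
final class PreferenceTrimmer {

    // One (user, item) interaction; a stand-in for a row of input data.
    static final class Pref {
        final long userId;
        final long itemId;

        Pref(long userId, long itemId) {
            this.userId = userId;
            this.itemId = itemId;
        }
    }

    static List<Pref> dropUnpopularItems(List<Pref> prefs, int minCount) {
        // First pass: count how many preferences each item has.
        Map<Long, Integer> counts = new HashMap<>();
        for (Pref p : prefs) {
            counts.merge(p.itemId, 1, Integer::sum);
        }
        // Second pass: keep only preferences for sufficiently popular items.
        List<Pref> kept = new ArrayList<>();
        for (Pref p : prefs) {
            if (counts.get(p.itemId) >= minCount) {
                kept.add(p);
            }
        }
        return kept;
    }
}
```

With minCount set to 2, items read only once are removed and everything
else is left untouched.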

> - I'm dealing with items where their freshness counts.  I don't want
> to recommend items older than N days - think news stories.  Assume I
> have the age of each item.  I could certainly then remove old items as
> I don't ever want to recommend them, but if I remove them, won't that
> hurt the quality of recommendations, simply because I'll lose users'
> "item consumption

Yes, they are still valuable data points even if they are not
recommendable items. You can use a Rescorer to exclude items from
recommendations according to any criteria you like. This is easier and
more efficient than filtering after the fact.
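
A minimal sketch of the idea for the freshness case, assuming item ages
are available as a map from item ID to age in days. The ItemRescorer
interface below is a local stand-in so that the example compiles on its
own; it only approximates the shape of Taste's Rescorer, and the exact
type and method signatures should be checked against the version in use.
FreshnessRescorer, ageInDays and maxAgeDays are hypothetical names.

```java
import java.util.HashMap;
import java.util.Map;

// Local stand-in for a rescorer (an assumption, not the Taste interface).
interface ItemRescorer {
    double rescore(long itemId, double originalScore);
    boolean isFiltered(long itemId);
}

// Excludes items older than maxAgeDays from recommendations while leaving
// the underlying preference data untouched.
final class FreshnessRescorer implements ItemRescorer {

    private final Map<Long, Integer> ageInDays;
    private final int maxAgeDays;

    FreshnessRescorer(Map<Long, Integer> ageInDays, int maxAgeDays) {
        this.ageInDays = new HashMap<>(ageInDays);
        this.maxAgeDays = maxAgeDays;
    }

    @Override
    public boolean isFiltered(long itemId) {
        Integer age = ageInDays.get(itemId);
        // Assumption in this sketch: items of unknown age are not filtered.
        return age != null && age > maxAgeDays;
    }

    @Override
    public double rescore(long itemId, double originalScore) {
        // Scores are left unchanged; exclusion is handled by isFiltered.
        return originalScore;
    }
}
```

Old items stay in the data model and still contribute to similarity
computations; they are only prevented from being returned as
recommendations.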

