mahout-user mailing list archives

From: Ted Dunning
Subject: Re: Memory and Speed Questions for Item-Based-Recommender
Date: Mon, 13 Jul 2009 15:11:36 GMT

Also, Lucene automagically does weighting that is very, very close to
what you want.

To Sean's question: the trick is that Lucene can store a list of item-item
links that were filtered by cooccurrence statistics to form a binary matrix
of interesting links.  If you then use a user's recent history of items as
the query, you get back a list of items in which each history item's
contribution is weighted according to its rarity.
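
A minimal sketch of that idea in plain Python, not actual Lucene code.  The
toy `links` data, the function names, and the particular rarity formula
(log(1 + N/df), an IDF-style weight) are assumptions for illustration only;
Lucene's real scoring includes further factors such as length normalization.

```python
import math
from collections import defaultdict

# Hypothetical toy data: for each item, the set of "indicator" items that
# survived the cooccurrence filter (one row of the binary item-item matrix).
links = {
    "A": {"B", "C"},
    "B": {"A", "C"},
    "C": {"A", "B", "D"},
    "D": {"C"},
    "E": {"C", "D"},
}


def rarity_weights(links):
    """IDF-style weight per indicator item: rarer indicators count more.

    Assumed formula: log(1 + N / df), where N is the number of items and
    df is the number of items that list this indicator.
    """
    n = len(links)
    df = defaultdict(int)
    for indicators in links.values():
        for ind in indicators:
            df[ind] += 1
    return {ind: math.log(1 + n / count) for ind, count in df.items()}


def recommend(history, links, top_n=3):
    """Score each item not in the history by the summed rarity weights of
    the history items that appear among its indicators."""
    weights = rarity_weights(links)
    scores = {}
    for item, indicators in links.items():
        if item in history:
            continue  # do not recommend what the user already has
        score = sum(weights[h] for h in history if h in indicators)
        if score > 0:
            scores[item] = score
    # Highest score first; ties broken by item name for determinism.
    return sorted(scores, key=lambda i: (-scores[i], i))[:top_n]


if __name__ == "__main__":
    print(recommend({"A", "D"}, links))
```

In a real setup each item would be one Lucene document whose indexed field
holds its indicator items, and the user's history would be issued as an OR
query over that field.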

The result is quite good and very fast.  The reason is that what Lucene
does *is* weighted matrix multiplication of just the right sort.  This is
what I was going to talk about in detail at ApacheCon.
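
One way to write down the matrix view, as a rough sketch.  The symbols here
are assumptions for illustration: B is the binary item-item link matrix
(B_ij = 1 if item j is an indicator for item i), h is the binary vector of
the user's recent items, and w_j is a rarity weight for item j (for example
an IDF-style weight).  This ignores Lucene's other scoring factors.

```latex
\[
  s = B \, W \, h, \qquad W = \operatorname{diag}(w_1, \dots, w_n),
  \qquad s_i = \sum_{j} B_{ij} \, w_j \, h_j
\]
```

The recommended items are those with the largest scores s_i.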

On Mon, Jul 13, 2009 at 4:16 AM, Grant Ingersoll wrote:

> I think Ted's suggestion is you'll find Lucene will be _a lot faster_ for
> this task as you don't need all the other trappings of a DB.
>
> On Jul 13, 2009, at 4:36 AM, Sean Owen wrote:
>
>> How does Lucene go from item-item links to recommendations? I'm
>> missing where the notion of user ratings, or even users, comes into
>> play, or the strength of the association.
>>
>> If the issue is really just storing the item-item links efficiently in
>> a way that isn't in memory, how about I cook up a JDBC-based
>> implementation? Seems more direct.
