mahout-user mailing list archives

From Sean Owen <sro...@gmail.com>
Subject Re: Using Taste to recommend documents
Date Fri, 03 Apr 2009 04:54:40 GMT
You could do that. But then, the system would be recommending words to
documents! Not quite what you want. I assume you still want to
recommend documents to (real) users.

I would use other techniques to determine document similarity. Others
on this list can suggest ideas, but simple metrics based on word
frequency should do well. Then use that logic to create an
implementation of ItemSimilarity. Next, build a DataModel, perhaps a
FileDataModel, from a file containing user IDs, document IDs,
and preference values. Then try a GenericItemBasedRecommender based on
these components. We can discuss these in more detail later.
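
As a rough illustration of the "simple metrics based on word frequency"
idea, here is a minimal sketch of cosine similarity over word counts. It
is plain Java with no Mahout dependency; the class and method names
(TermFrequencySimilarity, termFrequencies, cosine) are hypothetical and
not part of Taste. An ItemSimilarity implementation could delegate to
something like this, but the wiring is left out because it depends on
the Taste version in use.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper for illustration only; not part of Taste/Mahout.
final class TermFrequencySimilarity {

    // Count how often each lowercased word occurs in the text.
    static Map<String, Integer> termFrequencies(String text) {
        Map<String, Integer> counts = new HashMap<>();
        // Split on runs of characters that are neither letters nor digits.
        for (String token : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{Nd}]+")) {
            if (!token.isEmpty()) {
                counts.merge(token, 1, Integer::sum);
            }
        }
        return counts;
    }

    // Cosine similarity between two word-count vectors.
    // Returns 0.0 if either document has no words.
    static double cosine(Map<String, Integer> a, Map<String, Integer> b) {
        if (a.isEmpty() || b.isEmpty()) {
            return 0.0;
        }
        // Only words present in both documents contribute to the dot
        // product, so iterate over the smaller map.
        Map<String, Integer> small = a.size() <= b.size() ? a : b;
        Map<String, Integer> large = (small == a) ? b : a;
        double dot = 0.0;
        for (Map.Entry<String, Integer> e : small.entrySet()) {
            Integer other = large.get(e.getKey());
            if (other != null) {
                dot += (double) e.getValue() * other;
            }
        }
        return dot / (norm(a) * norm(b));
    }

    // Euclidean length of a word-count vector.
    static double norm(Map<String, Integer> v) {
        double sumOfSquares = 0.0;
        for (int count : v.values()) {
            sumOfSquares += (double) count * count;
        }
        return Math.sqrt(sumOfSquares);
    }
}
```

Raw counts favor common words; weighting schemes such as TF-IDF are a
common refinement, but that goes beyond this sketch.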

Assuming you go this way, a couple thousand documents (and a couple
thousand users?) should be no problem to process in memory. It should
be fast. I would, perhaps, make sure that your ItemSimilarity caches
results, or is based on pre-computed values, since it would be slow
to recompute them over and over at runtime.
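
The caching idea can be sketched as a thin wrapper that remembers each
pairwise result. This is plain Java under stated assumptions, not a
Taste class: the names CachingPairSimilarity and PairSimilarity are
hypothetical, items are assumed to be identified by long IDs, and the
underlying similarity is assumed to be symmetric so that (a, b) and
(b, a) can share one cache entry.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical caching wrapper for illustration; not part of Taste/Mahout.
final class CachingPairSimilarity {

    // Assumed shape of the expensive similarity computation.
    interface PairSimilarity {
        double similarity(long a, long b);
    }

    private final PairSimilarity delegate;
    // Unbounded cache keyed by the ordered pair "lo:hi".
    private final Map<String, Double> cache = new ConcurrentHashMap<>();

    CachingPairSimilarity(PairSimilarity delegate) {
        this.delegate = delegate;
    }

    // Returns the cached value if present; otherwise computes it once
    // through the delegate and stores it.
    double similarity(long a, long b) {
        // Order the IDs so both argument orders map to the same key.
        long lo = Math.min(a, b);
        long hi = Math.max(a, b);
        return cache.computeIfAbsent(lo + ":" + hi,
                key -> delegate.similarity(lo, hi));
    }

    // Number of distinct pairs computed so far.
    int cachedPairs() {
        return cache.size();
    }
}
```

The cache here never evicts anything. With a couple thousand documents
there are at most a few million distinct pairs, which should still fit
in memory, but a bounded cache or a fully pre-computed table may be a
better fit for larger collections.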

Sean

On Apr 3, 2009 7:14 AM, "Vinicius Carvalho" <viniciusccarvalho@gmail.com> wrote:

Hi there! I would like to build a document recommendation system, and one of
the approaches I wish to experiment with is to use Taste for that task. One
idea I had was to model documents as users, words as items, and word
frequencies in documents as preferences.

Am I going in the right direction here?

Also, I'm a bit worried about memory consumption. So far we only have 6k
documents (each of which may have a few hundred words). But would Taste
scale to, let's say, 100k documents with a few hundred words each?

Best regards

--
The intuitive mind is a sacred gift and the
rational mind is a faithful servant. We have
created a society that honors the servant and
has forgotten the gift.
