mahout-user mailing list archives

From Sean Owen <>
Subject Re: Would like some recommendation, need advice
Date Mon, 22 Jun 2009 20:20:17 GMT
It sounds like you want to precompute, and then save, the similarity
between each pair of items and each pair of users? Yes, you can do
that, but you don't have to. You are already using things like
TanimotoCoefficientSimilarity, which compute similarity on the fly
from the data tables.
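
As a rough illustration of what computing similarity on the fly means, here
is a minimal, self-contained sketch of a Tanimoto (Jaccard) coefficient over
two sets of IDs. It is not Mahout's TanimotoCoefficientSimilarity; the class
name TanimotoSketch, the method tanimoto, and the choice to return 0.0 for
two empty sets are all assumptions made for the example.

```java
import java.util.HashSet;
import java.util.Set;

// Illustrative only: a hand-rolled Tanimoto (Jaccard) coefficient over ID sets.
// This is NOT Mahout's TanimotoCoefficientSimilarity, just the underlying idea.
final class TanimotoSketch {

    // Returns |A intersect B| / |A union B| for two sets of IDs.
    static double tanimoto(Set<Long> a, Set<Long> b) {
        if (a.isEmpty() && b.isEmpty()) {
            return 0.0; // assumption: two empty sets are treated as not similar
        }
        Set<Long> intersection = new HashSet<>(a);
        intersection.retainAll(b);
        int unionSize = a.size() + b.size() - intersection.size();
        return (double) intersection.size() / unionSize;
    }

    public static void main(String[] args) {
        Set<Long> userA = Set.of(1L, 2L, 3L); // e.g. subject IDs linked to user A
        Set<Long> userB = Set.of(2L, 3L, 4L); // e.g. subject IDs linked to user B
        System.out.println(tanimoto(userA, userB)); // 2 shared / 4 distinct = 0.5
    }
}
```

Nothing is stored here: the value is derived from the two sets each time it is
asked for, which is the trade-off against a precomputed table.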

If you did want to make your own table to store these things, you
would also have to write a custom UserSimilarity or ItemSimilarity
class to read from that table. That is fairly easy.

But I think your table would be more like this:

user_a_id, user_b_id, similarity
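
A minimal sketch of the lookup side, assuming rows shaped like the table
above have already been loaded into memory. The class below is hypothetical
and does not implement Mahout's UserSimilarity interface; a real version
would do so and would read the rows over JDBC. Returning NaN for an unknown
pair and 1.0 for a user compared with itself are assumptions of this sketch.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: holds precomputed similarities from a table shaped like
// (user_a_id, user_b_id, similarity). Not a Mahout class.
final class PrecomputedUserSimilarity {

    // Keyed by the smaller ID, then the larger ID, so that (a, b) and (b, a)
    // resolve to the same stored row.
    private final Map<Long, Map<Long, Double>> table = new HashMap<>();

    // Stores one row of the similarity table.
    void put(long userA, long userB, double similarity) {
        long low = Math.min(userA, userB);
        long high = Math.max(userA, userB);
        table.computeIfAbsent(low, k -> new HashMap<>()).put(high, similarity);
    }

    // Looks up the stored similarity, regardless of argument order.
    double userSimilarity(long userA, long userB) {
        if (userA == userB) {
            return 1.0; // assumption: a user is fully similar to itself
        }
        long low = Math.min(userA, userB);
        long high = Math.max(userA, userB);
        Map<Long, Double> row = table.get(low);
        Double value = (row == null) ? null : row.get(high);
        return (value == null) ? Double.NaN : value; // assumption: NaN means "no row"
    }

    public static void main(String[] args) {
        PrecomputedUserSimilarity sim = new PrecomputedUserSimilarity();
        sim.put(1L, 2L, 0.75);
        System.out.println(sim.userSimilarity(2L, 1L)); // 0.75
        System.out.println(sim.userSimilarity(1L, 3L)); // NaN
    }
}
```

Storing each pair once under an ordered key halves the number of rows
compared with storing both (a, b) and (b, a).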


I may be misunderstanding what you are trying to do, since it seems
like you are doing something a little non-standard. Normally you have
one data table, like:

user_id, item_id, preference

You have this extra notion of 'subject'. If you explain how this fits
in, maybe I can provide some better advice.


On Mon, Jun 22, 2009 at 4:15 PM, charlysf<> wrote:
> Hello,
> I would like some advice. I currently have these tables in MySQL:
> User_subject
> user_id, subject_id, relevance
> Item_subject
> item_id, subject_id
> I would like some advice on how to produce recommendations.
> So far, to compute the user similarity, I made a JDBCDataModel for the table
> User_subject.
> To compute the item similarity, I did the same for the table Item_subject.
> Now I have my similarity between users, and between items.
> Do I need to make a table like this:
> user_item
> user_id, item_id, relevance
> I will have millions of rows, and I think it could be very slow, no?
> Thank you very much,
