mahout-user mailing list archives

From Jake Mannix
Subject Re: Generating a Document Similarity Matrix
Date Tue, 08 Jun 2010 23:53:22 GMT
On Tue, Jun 8, 2010 at 4:45 PM, Sebastian Schelter wrote:

> The relation between these two problems (document similarity and item
> similarity in CF) is exactly like Sean pointed out: In the paper a document
> is a vector of term frequencies and the paper shows how to compute the
> pairwise similarities between those. To use this for collaborative
> filtering you actually just have to replace the document with an item
> which is a vector of user preferences.

Yep, a vector is a vector is a vector.  (And when you're me, even if you
are *not* a vector, you might be a vector. ;) )
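
To illustrate the point, here is a minimal in-memory sketch, not the
distributed approach from the paper and not Mahout code. It assumes cosine
similarity as the measure (the thread does not fix one) and sparse vectors
stored as plain dicts; the names cosine, pairwise_similarities, and the toy
data are all made up for the example. The same function handles documents
(term -> frequency) and items (user -> preference) unchanged.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity of two sparse vectors stored as {index: weight} dicts."""
    # Only indices present in both vectors contribute to the dot product.
    dot = sum(w * v[k] for k, w in u.items() if k in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def pairwise_similarities(rows):
    """Return {(a, b): similarity} for every unordered pair of row keys."""
    return {
        (a, b): cosine(rows[a], rows[b])
        for a, b in combinations(sorted(rows), 2)
    }


# Documents as term-frequency vectors (toy data).
docs = {
    "doc1": {"mahout": 2, "matrix": 1},
    "doc2": {"mahout": 1, "matrix": 1, "hadoop": 3},
}

# Items as user-preference vectors (toy data).
items = {
    "itemA": {"user1": 5.0, "user2": 3.0},
    "itemB": {"user1": 4.0, "user3": 1.0},
}

doc_sims = pairwise_similarities(docs)
item_sims = pairwise_similarities(items)
```

This brute-force version compares every pair of rows, so it only makes
sense for small inputs; the point is just that nothing in it cares whether
a row is a document or an item.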

> It shouldn't be too hard to make this work on a DistributedRowMatrix too, I
> think. You already mentioned you wanna have it that way some time
> in MAHOUT-362 :)

Well indeed I did!

