mahout-user mailing list archives

From Sebastian Schelter <...@apache.org>
Subject Re: How about a LSH recommender ?
Date Wed, 13 Apr 2011 09:57:06 GMT
They are using PLSI, which we already tried to implement in
https://issues.apache.org/jira/browse/MAHOUT-106. We didn't get it to
scale. As far as I remember the paper, they use a nasty trick when
sending data to the reducers in a certain step, so that they only have
to load a certain portion of the data into memory. I'm not sure this
can be replicated in Hadoop (would love to be proven wrong, though).

They are also using LSH to cluster users by Jaccard coefficient. Don't
we already have code for this in org.apache.mahout.clustering.minhash?
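
For anyone following along, here is a rough sketch of the idea behind
MinHash-based clustering by Jaccard coefficient. It is not the code in
org.apache.mahout.clustering.minhash and not the method from the paper.
All function names and parameters (minhash_signature, cluster_key,
num_hashes, band_size) are made up for illustration, and the seeded MD5
hash is just an assumed stand-in for a family of hash functions.

```python
import hashlib


def _hash(seed, item):
    # Deterministic 64-bit hash of (seed, item); the seed selects one
    # member of an assumed hash family. MD5 is used only for convenience.
    data = f"{seed}:{item}".encode("utf-8")
    return int.from_bytes(hashlib.md5(data).digest()[:8], "big")


def minhash_signature(items, num_hashes=64):
    # One minimum per hash function. `items` must be non-empty.
    items = set(items)
    if not items:
        raise ValueError("items must be non-empty")
    return [min(_hash(seed, it) for it in items) for seed in range(num_hashes)]


def estimate_jaccard(sig_a, sig_b):
    # The fraction of matching positions estimates the Jaccard coefficient.
    matches = sum(1 for a, b in zip(sig_a, sig_b) if a == b)
    return matches / len(sig_a)


def jaccard(a, b):
    # Exact Jaccard coefficient, for comparison with the estimate.
    a, b = set(a), set(b)
    return len(a & b) / len(a | b)


def cluster_key(signature, band_size=2):
    # Users whose first `band_size` minima agree share a cluster key.
    return tuple(signature[:band_size])


if __name__ == "__main__":
    alice = {"item1", "item2", "item3", "item4"}
    bob = {"item2", "item3", "item4", "item5"}
    sig_a = minhash_signature(alice)
    sig_b = minhash_signature(bob)
    print("exact:", jaccard(alice, bob))
    print("estimate:", estimate_jaccard(sig_a, sig_b))
    print("same cluster:", cluster_key(sig_a) == cluster_key(sig_b))
```

Two users land under the same key only when their first few minima
agree, which becomes more likely as their item sets overlap more. A
larger band_size makes the clusters stricter; more hash functions make
the similarity estimate less noisy.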

--sebastian

On 13.04.2011 10:49, Sean Owen wrote:
> One of the three approaches that they combine is latent semantic indexing --
> that is what I was referring to.
>
> On Wed, Apr 13, 2011 at 8:33 AM, Ted Dunning <ted.dunning@gmail.com> wrote:
>
>> Sean,
>>
>> Do you mean LSI (latent semantic indexing)?  Or LSH (locality sensitive
>> hashing)?
>>
>> (are you a victim of aggressive error correction?)
>>
>> (or am I the victim of too little?)
>>
>>

