Many thanks, Owen, for the prompt replies.
I will post the results on the quality of the recommendations here.
-----Original Message-----
From: Sean Owen [mailto:srowen@gmail.com]
Sent: 18 October 2012 18:01
To: user@mahout.apache.org
Subject: Re: PseudoInverse map reduce implementation
So you have a factorization like A = X * Y' and you are looking for the right inverse of Y'
(where Y is the item-feature matrix)?
This is just Y * pinv(Y' * Y). Y' * Y takes a little work to compute, but can be done in one
pass over the matrix. Y' * Y is just a 1000x1000 matrix, which you can invert in memory
quickly. Then it's another multiply. It shouldn't take 40 seconds, but it is also something
you need not compute at request time every time.
It's not going to affect things much to just recompute it periodically, rather than insisting
on a completely up-to-date right-inverse, because Y won't change rapidly.
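
The steps above can be sketched roughly as follows. This is a minimal pure-Python illustration, not Mahout code: the function names (`gram_one_pass`, `invert`, `right_inverse`) and the toy dimensions are assumptions made for the example, and it uses a plain inverse of Y' * Y in place of pinv, which is only valid when Y has full column rank.

```python
import random

def gram_one_pass(rows, k):
    """Accumulate G = Y' * Y as a sum of row outer products, one row at a time."""
    G = [[0.0] * k for _ in range(k)]
    for y in rows:                      # a single pass over the rows of Y
        for i in range(k):
            yi = y[i]
            for j in range(k):
                G[i][j] += yi * y[j]
    return G

def invert(M):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(M)
    # Augment M with the identity: [M | I] is reduced to [I | inv(M)].
    A = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
         for i, row in enumerate(M)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(A[r][c]))
        if abs(A[p][c]) < 1e-12:
            raise ValueError("Y' * Y is singular; a true pseudo-inverse is needed")
        A[c], A[p] = A[p], A[c]
        pivot = A[c][c]
        A[c] = [v / pivot for v in A[c]]
        for r in range(n):
            if r != c:
                f = A[r][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [row[n:] for row in A]

def right_inverse(Y):
    """Return R = Y * inv(Y' * Y), so that Y' * R is the identity."""
    k = len(Y[0])
    G_inv = invert(gram_one_pass(Y, k))
    return [[sum(y[t] * G_inv[t][j] for t in range(k)) for j in range(k)]
            for y in Y]

# Toy sizes only; the thread discusses roughly 10000 items x 1000 features.
random.seed(0)
n_items, n_features = 50, 4
Y = [[random.gauss(0.0, 1.0) for _ in range(n_features)] for _ in range(n_items)]
R = right_inverse(Y)

# Check: Y' * R should be the n_features x n_features identity.
check = [[sum(Y[r][i] * R[r][j] for r in range(n_items))
          for j in range(n_features)]
         for i in range(n_features)]
```

Because Y' * Y is a sum over rows, the accumulation step is the part that could be split across mappers and summed. Once R is cached, a user's row of X is just that user's row of A times R, since A * R = X * Y' * R = X.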
Sean
On Thu, Oct 18, 2012 at 1:21 PM, Ranjith Uthaman <ranjith.uthaman@flytxt.com> wrote:
> The final goal is to build a content-based recommender of items for each user.
The user-based and item-based recommenders of Mahout, as discussed in the Mahout in Action
book, don't fare very well considering the data available. A content-based recommender
approach is also hinted at in the book.
> Hence, we intend to use a linear-regression kind of model to achieve better recommendations.
The confidential nature of the data does not allow it to be discussed here :( , but the scale
at which this needs to be performed is as follows:
> Number of users: 510 million
> Number of items: ~10000 [which might increase to a million in future]
> Item feature vector length: 1000 [which might increase to 10000 features in future]
>
> We need to find the weight vector using the pseudo-inverse of the item matrix; essentially,
for each user the matrix dimensions are 10000 x 1000. But the number of users is large,
and this needs to be done frequently.
> On a single 2-core desktop machine with an average configuration, pinv of a matrix of
these dimensions takes around 40 seconds.
> This is too long for customers using mobile web portals whose index page is completely
customised using the recommendation results obtained above. Not to mention that rendering
the results to create the page will take further computation time.
>
> Kindly guide.
>
> Thanks & Regards,
> Ranjith
>
>
> -----Original Message-----
> From: Sean Owen [mailto:srowen@gmail.com]
> Sent: 18 October 2012 12:48
> To: user@mahout.apache.org
> Subject: Re: PseudoInverse map reduce implementation
>
> I asked in reply on Quora: what exactly are you computing? What is the size of the input,
and are you talking about a generalized inverse?
> Depending on this, there are easier ways than an SVD.
>
> On Thu, Oct 18, 2012 at 6:42 AM, Ranjith Uthaman <ranjith.uthaman@flytxt.com> wrote:
>> Hi,
>>
>> Does a MapReduce implementation of the pseudo-inverse of a matrix exist in the current
Mahout framework? What are the various ways to achieve it?
>>
>> Thanks & Regards,
>> RANJITH P UTHAMAN
