spark-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Matei Zaharia <matei.zaha...@gmail.com>
Subject Re: QR decomposition in Spark ALS
Date Thu, 06 Mar 2014 18:51:44 GMT
But Sean, because that matrix is not invertible, you can’t solve it. That’s why I’m saying,
as long as it is solvable, it will be positive definite too, and in that case solvePositive
is optimized for this use case (I believe it does Cholesky decomposition).

Matei


On Mar 6, 2014, at 9:58 AM, Sean Owen <sowen@cloudera.com> wrote:

> Agree. For example you could have a user-feature matrix X =
> 
> 1 0
> 1 0
> 
> and X' * X =
> 
> 2 0
> 0 0
> 
> is not positive definite but is semidefinite.
> 
> So I think the code should be calling solveSymmetric not
> solvePositive? the latter requires positive definite.
> 
> 
> --
> Sean Owen | Director, Data Science | London
> 
> 
> On Thu, Mar 6, 2014 at 5:39 PM, Matei Zaharia <matei.zaharia@gmail.com> wrote:
>> Xt*X should mathematically always be positive semi-definite, so the only way this
might be bad is if it’s not invertible due to linearly dependent rows. This might happen
due to the initialization or possibly due to numerical issues, though it seems unlikely. Maybe
it also happens if some users rated no products, or something like that.
>> 
>> Matei
>> 
>> On Mar 6, 2014, at 7:59 AM, Sean Owen <sowen@cloudera.com> wrote:
>> 
>>> Hmm, Will Xt*X be positive definite in all cases? For example it's not
>>> if X has linearly independent rows? (I'm not going to guarantee 100%
>>> that I haven't missed something there.)
>>> 
>>> Even though your data is huge, if it was generated by some synthetic
>>> process, maybe it is very low rank?
>>> 
>>> QR decomposition is pretty good here, yes.
>>> --
>>> Sean Owen | Director, Data Science | London
>>> 
>>> 
>>> On Thu, Mar 6, 2014 at 3:05 PM, Debasish Das <debasish.das83@gmail.com>
wrote:
>>>> Hi Sebastian,
>>>> 
>>>> Yes Mahout ALS and Oryx runs fine on the same matrix because Sean calls QR
>>>> decomposition.
>>>> 
>>>> But the ALS objective should give us strictly positive definite matrix..I
>>>> am thinking more on it..
>>>> 
>>>> There are some random factor assignment step but that also initializes
>>>> factors with normal(0,1)...which I think is not a big deal...
>>>> 
>>>> About QR decomposition, jblas has Solve.solve and
>>>> Solve.solvePositive...solve should also run fine but it does a LU
>>>> factorization and better way will be to do a QR decomposition.
>>>> 
>>>> Seems breeze has QR decomposition and we can make use of that...But QR by
>>>> default is also not correct since if the matrix is positive definite BLAS
>>>> psov (Solve.solvePositive) is much faster due to cholesky computation..
>>>> 
>>>> I believe we need a singular value check and based on that we should call
>>>> solvePositive/solve or qr from breeze..
>>>> 
>>>> There is also a specialized version of TSQR (tall and skinny QR
>>>> decomposition) from Chris which might be good to evaluate as well:
>>>> 
>>>> https://github.com/ccsevers/scalding-linalg
>>>> 
>>>> I am going to debug it further and try to understand why I am getting non
>>>> positive definite matrix and publish the findings...
>>>> 
>>>> Any suggestions on how to proceed further on this ? Should I ask for a PR
>>>> and we can discuss more on it ?
>>>> 
>>>> Thanks.
>>>> Deb
>>>> 
>>>> On Thu, Mar 6, 2014 at 6:47 AM, Sebastian Schelter <ssc@apache.org>
wrote:
>>>> 
>>>>> I'm not sure about the mathematical details, but I found in some
>>>>> experiments with Mahout that the matrix there was also not positive
>>>>> definite. Therefore, we chose QR decomposition to solve the linear system.
>>>>> 
>>>>> 
>>>>> --sebastian
>>>>> 
>>>>> 
>>>>> On 03/06/2014 03:44 PM, Debasish Das wrote:
>>>>> 
>>>>>> Hi,
>>>>>> 
>>>>>> I am running ALS on a sparse problem (10M x 1M) and I am getting
the
>>>>>> following error:
>>>>>> 
>>>>>> org.jblas.exceptions.LapackArgumentException: LAPACK DPOSV: Leading
minor
>>>>>> of order i of A is not positive definite.
>>>>>> at org.jblas.SimpleBlas.posv(SimpleBlas.java:373)
>>>>>> at org.jblas.Solve.solvePositive(Solve.java:68)
>>>>>> 
>>>>>> This error from blas shows up if the hessian matrix is not positive
>>>>>> definite...
>>>>>> 
>>>>>> I checked that rating matrix is all > 0 but of course like netflix
they
>>>>>> are
>>>>>> not bounded within 1 and 5....
>>>>>> 
>>>>>> Is there some sort of specific initialization of factor matrices
done
>>>>>> which
>>>>>> can make the hessian matrix non positive definite ?
>>>>>> 
>>>>>> I am printing out the eigen vectors and fullXtX matrix to understand
it
>>>>>> more but any help will be appreciated.
>>>>>> 
>>>>>> Thanks.
>>>>>> Deb
>>>>>> 
>>>>>> 
>>>>> 
>> 


Mime
View raw message