mahout-user mailing list archives

From Dmitriy Lyubimov <dlie...@gmail.com>
Subject Re: NaN produced by SSVD ?
Date Mon, 03 Nov 2014 22:18:00 GMT
OK, so that's what I suspected.

The method is generally not intended to run on inputs whose rank is smaller
than k+p. The MR version doesn't even check for it.

However, as I mentioned in the manual, I did run tests with -q=0, in which
case the corresponding right singular vectors should be reset to 0.0, not
NaN. It is possible that with -q=1 the power iterations do something
inadmissible in that situation.

Just for the record, what -q setting did you use?
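
The k+p constraint above can be checked before launching a job. What
follows is only a rough sketch in plain Python, not Mahout code: the names
numerical_rank and check_ssvd_params and the tolerance 1e-9 are hypothetical
choices made for illustration. It estimates the rank by Gaussian elimination
with partial pivoting, which is a crude estimator; an SVD-based estimate
would be more robust for matrices that are only nearly rank-deficient.

```python
def numerical_rank(rows, tol=1e-9):
    """Estimate the rank of a dense matrix given as a list of rows.

    Uses Gaussian elimination with partial pivoting. A pivot is treated
    as zero when it is at most tol times the largest absolute entry of
    the input (tol is an illustrative assumption, not a Mahout setting).
    """
    a = [[float(x) for x in row] for row in rows]
    if not a or not a[0]:
        return 0
    n_rows, n_cols = len(a), len(a[0])
    scale = max(abs(x) for row in a for x in row)
    if scale == 0.0:
        return 0  # the all-zero matrix has rank 0

    rank = 0
    for col in range(n_cols):
        if rank == n_rows:
            break
        # Choose the remaining row with the largest entry in this column.
        pivot = max(range(rank, n_rows), key=lambda i: abs(a[i][col]))
        if abs(a[pivot][col]) <= tol * scale:
            continue  # no usable pivot: this column adds no new rank
        a[rank], a[pivot] = a[pivot], a[rank]
        # Eliminate this column from the rows below the pivot row.
        for i in range(rank + 1, n_rows):
            factor = a[i][col] / a[rank][col]
            for j in range(col, n_cols):
                a[i][j] -= factor * a[rank][j]
        rank += 1
    return rank


def check_ssvd_params(rows, k, p):
    """Return (estimated rank, whether k + p fits within that rank)."""
    r = numerical_rank(rows)
    return r, k + p <= r


if __name__ == "__main__":
    # The 2x2 zero matrix from the Spark shell example, with k=2 and p=0.
    print(check_ssvd_params([[0, 0], [0, 0]], 2, 0))  # (0, False)
```

With the settings reported in this thread (N = 64, k = 58, p = 5), k+p is
63, so the check would only pass if the estimated rank were 63 or 64.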

On Mon, Nov 3, 2014 at 2:00 PM, Yang <teddyyyy123@gmail.com> wrote:

> It does have something to do with k. Previously I used a formula to
> determine the rank to use:
>
> rank = N - p - 1 = 64 - 5 - 1 = 58, where N is the number of columns of
> the original matrix.
>
> Then I tried rank = 50, and it worked.
>
> Well... as I write this email, I realize that the actual rank R of the
> original matrix may be much smaller than N, and that could be the reason.
> But it is a bit difficult to figure out R beforehand.
>
>
> thanks
> Yang
>
> On Fri, Oct 31, 2014 at 5:01 PM, Dmitriy Lyubimov <dlieu.7@gmail.com>
> wrote:
>
> > Is the matrix by any chance constructed so that it may have rank < k? I
> > think the MR code is not checking for that.
> >
> > In the Spark shell I have:
> >
> > mahout> val a = dense( (0,0),(0,0) )
> > a: org.apache.mahout.math.DenseMatrix =
> > {
> >   0  => {}
> >   1  => {}
> > }
> > mahout> svd(a)
> > res0: (org.apache.mahout.math.Matrix, org.apache.mahout.math.Matrix,
> > org.apache.mahout.math.DenseVector) =
> > ({
> >   0  => {0:1.0}
> >   1  => {1:1.0}
> > },{
> >   0  => {0:-1.0}
> >   1  => {1:-1.0}
> > },{})
> >
> > But:
> >
> > mahout> ssvd(a,2,0)
> >
> > java.lang.AssertionError: assertion failed: Rank-deficiency detected during s-SVD
> >
> > or
> > mahout> val drmA = drmParallelize(a,2)
> > mahout> dssvd(drmA, k=2)
> > java.lang.IllegalArgumentException: R is rank-deficient.
> >
> >
> > The MR version doesn't check for these effects and it may create some
> > degenerate results, although I thought those should be 0s, at least when
> > -q=0. I am not sure about -q=1,2...
> >
> >
> >
> >
> > On Thu, Oct 30, 2014 at 10:35 PM, Yang <teddyyyy123@gmail.com> wrote:
> >
> > > I am talking about the MR one.
> > >
> > > thanks
> > > yang
> > > On Oct 30, 2014 8:16 PM, "Dmitriy Lyubimov" <dlieu.7@gmail.com> wrote:
> > >
> > > > This is not a known problem...
> > > >
> > > > There are a few SSVD implementations here: sequential, MR and Spark.
> > > > For the record, which one are you running?
> > > >
> > > >
> > > >
> > > > On Thu, Oct 30, 2014 at 4:37 PM, Yang <teddyyyy123@gmail.com> wrote:
> > > >
> > > > > We are running SSVD on a dataset (this one is relatively small, with
> > > > > 8000 rows and 64 columns). We ran it with rank = 58, since the
> > > > > oversampling parameter is p=5.
> > > > >
> > > > > The result had NaN in multiple columns.
> > > > >
> > > > > Why would this appear?
> > > > >
> > > > > I am now running with a lower rank=20 to see if it goes away.
> > > > >
> > > > >
> > > > > Thanks
> > > > > Yang
> > > > >
> > > >
> > >
> >
>
