mahout-user mailing list archives

From Ted Dunning <ted.dunn...@gmail.com>
Subject Re: Errors in SSVD
Date Wed, 17 Aug 2011 22:34:06 GMT
out-of-core = not-in-memory.

The origins are *very* old.  Nobody I know has used core memory since the
'70s.

http://en.wikipedia.org/wiki/Out-of-core_algorithm
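
For intuition, a minimal out-of-core pattern sketched in Java (a hypothetical
illustration, not Mahout's code): stream a tall m x n matrix A from disk one
row at a time and keep only a small n x n accumulator -- here the Gram matrix
A'A -- in memory, so memory use never depends on the row count.

  import java.io.BufferedReader;
  import java.io.FileReader;
  import java.io.IOException;

  // Out-of-core Gram matrix: reads one whitespace-delimited row of A at a
  // time; memory use is O(n^2) regardless of how many rows the file has.
  public class OutOfCoreGram {
    public static double[][] gram(String path, int n) throws IOException {
      double[][] ata = new double[n][n];   // the only state kept in memory
      BufferedReader in = new BufferedReader(new FileReader(path));
      String line;
      while ((line = in.readLine()) != null) {
        String[] tok = line.trim().split("\\s+");
        double[] row = new double[n];
        for (int j = 0; j < n; j++) row[j] = Double.parseDouble(tok[j]);
        for (int i = 0; i < n; i++)        // rank-1 update: ata += row * row'
          for (int j = i; j < n; j++) ata[i][j] += row[i] * row[j];
      }
      in.close();
      for (int i = 1; i < n; i++)          // mirror upper triangle down
        for (int j = 0; j < i; j++) ata[i][j] = ata[j][i];
      return ata;
    }
  }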

On Wed, Aug 17, 2011 at 3:00 PM, Dmitriy Lyubimov <dlieu.7@gmail.com> wrote:

> Ted,
> sorry for my stupid question of the day: what does the "out-of-core" term
> mean?
>  On Aug 16, 2011 2:18 PM, "Ted Dunning" <ted.dunning@gmail.com> wrote:
> > I have several in-memory implementations almost ready to publish.
> >
> > These provide a straightforward implementation of the original SSVD
> > algorithm from the Martinsson and Halko paper, a version that avoids QR
> > and LQ decompositions, and an out-of-core version that only keeps a
> > moderately sized amount of data in memory at any time.
> >
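> > For reference, the core of that original algorithm is only a few lines when
> > everything fits in memory. A sketch using Mahout's in-memory Colt-migrated
> > classes (an illustration assuming these APIs match your version, not one of
> > the implementations mentioned above):
> >
> >   import java.util.Random;
> >   import org.apache.mahout.math.DenseMatrix;
> >   import org.apache.mahout.math.Matrix;
> >   import org.apache.mahout.math.QRDecomposition;
> >   import org.apache.mahout.math.SingularValueDecomposition;
> >
> >   // Basic SSVD: project A onto a random subspace, orthonormalize,
> >   // then take the exact SVD of the small projected matrix.
> >   static Matrix ssvdU(Matrix a, int k, int p, long seed) {
> >     Random rnd = new Random(seed);
> >     Matrix omega = new DenseMatrix(a.columnSize(), k + p);
> >     for (int i = 0; i < omega.rowSize(); i++)
> >       for (int j = 0; j < k + p; j++)
> >         omega.setQuick(i, j, rnd.nextGaussian());  // Gaussian test matrix
> >     Matrix y = a.times(omega);                     // m x (k+p)
> >     Matrix q = new QRDecomposition(y).getQ();      // basis of range(Y)
> >     Matrix b = q.transpose().times(a);             // small (k+p) x n
> >     SingularValueDecomposition svd = new SingularValueDecomposition(b);
> >     return q.times(svd.getU());  // left singular vectors; keep first k cols
> >   }
> >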
> > My hangup at this point is getting my Cholesky decomposition reliable for
> > rank-deficient inputs.
> >
> > On Tue, Aug 16, 2011 at 1:57 PM, Eshwaran Vijaya Kumar <evijayakumar@mozilla.com> wrote:
> >
> >> I have decided to do something similar: do the pipeline in memory and
> >> not invoke map-reduce for small datasets, which I think will handle the
> >> issue. Thanks again for clearing that up.
> >> Esh
> >>
> >> On Aug 16, 2011, at 1:45 PM, Dmitriy Lyubimov wrote:
> >>
> >> > PPS Mahout also has an in-memory Colt-migrated SVD solver, which is
> >> > BTW what i am using in local tests to assert SSVD results. Although it
> >> > starts to feel slow pretty quickly and sometimes produces errors (i
> >> > think it starts feeling slow at 10k x 1k inputs).
> >> >
> >> > On Tue, Aug 16, 2011 at 12:52 PM, Dmitriy Lyubimov <dlieu.7@gmail.com> wrote:
> >> >
> >> >> also, with data as small as this, the stochastic noise ratio would be
> >> >> significant (as in the law of large numbers), so if you really think
> >> >> you might need to handle inputs that small, you'd better write a
> >> >> pipeline that detects this as a corner case and just runs an in-memory
> >> >> decomposition -- see the sketch below. In fact, i think dense matrices
> >> >> up to 100,000 rows can be quite comfortably computed in-memory (Ted
> >> >> knows much more on the practical limits of tools like R or even
> >> >> something as simple as apache.math).
> >> >>
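> >> >> A sketch of that corner-case dispatch (the threshold and method names
> >> >> are made up for illustration):
> >> >>
> >> >>   // Route tiny inputs to an in-memory solver; big ones to map-reduce.
> >> >>   static final long IN_MEMORY_MAX_ROWS = 100000;  // hypothetical cutoff
> >> >>
> >> >>   static void solve(long numRows, Runnable inMemory, Runnable mapReduce) {
> >> >>     if (numRows <= IN_MEMORY_MAX_ROWS) {
> >> >>       inMemory.run();    // e.g. the Colt-migrated in-memory solver
> >> >>     } else {
> >> >>       mapReduce.run();   // the distributed SSVD pipeline
> >> >>     }
> >> >>   }
> >> >>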
> >> >> -d
> >> >>
> >> >>
> >> >> On Tue, Aug 16, 2011 at 12:46 PM, Dmitriy Lyubimov <dlieu.7@gmail.com> wrote:
> >> >>
> >> >>> yep that's what i figured. you have 193 rows or so, but distributed
> >> >>> between 7 files, so they are small and would generate several mappers,
> >> >>> and there are probably some among them with a small row count.
> >> >>>
> >> >>> See my other email. This method is for big data, big files. If you
> >> >>> want to automate handling of small files, you can probably do some
> >> >>> intermediate step with some heuristic that merges together all files,
> >> >>> say, shorter than 1 MB -- a sketch follows.
> >> >>>
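> >> >>> For instance, a sketch with the old Hadoop API, assuming the usual DRM
> >> >>> layout of IntWritable keys and VectorWritable values (names and the
> >> >>> 1 MB threshold are illustrative):
> >> >>>
> >> >>>   import org.apache.hadoop.conf.Configuration;
> >> >>>   import org.apache.hadoop.fs.FileStatus;
> >> >>>   import org.apache.hadoop.fs.FileSystem;
> >> >>>   import org.apache.hadoop.fs.Path;
> >> >>>   import org.apache.hadoop.io.IntWritable;
> >> >>>   import org.apache.hadoop.io.SequenceFile;
> >> >>>   import org.apache.mahout.math.VectorWritable;
> >> >>>
> >> >>>   // Merge every sequence file under 1 MB into one file (written
> >> >>>   // outside dir) so no map split ends up with fewer than k+p rows.
> >> >>>   void mergeSmall(Configuration conf, Path dir, Path merged) throws Exception {
> >> >>>     FileSystem fs = FileSystem.get(conf);
> >> >>>     SequenceFile.Writer w = SequenceFile.createWriter(
> >> >>>         fs, conf, merged, IntWritable.class, VectorWritable.class);
> >> >>>     IntWritable key = new IntWritable();
> >> >>>     VectorWritable val = new VectorWritable();
> >> >>>     for (FileStatus st : fs.listStatus(dir)) {
> >> >>>       if (st.isDir() || st.getLen() >= 1024 * 1024) continue;
> >> >>>       SequenceFile.Reader r = new SequenceFile.Reader(fs, st.getPath(), conf);
> >> >>>       while (r.next(key, val)) w.append(key, val);
> >> >>>       r.close();
> >> >>>     }
> >> >>>     w.close();
> >> >>>   }
> >> >>>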
> >> >>> -d
> >> >>>
> >> >>>
> >> >>>
> >> >>> On Tue, Aug 16, 2011 at 12:43 PM, Eshwaran Vijaya Kumar <evijayakumar@mozilla.com> wrote:
> >> >>>
> >> >>>> Number of mappers is 7. DFS block size is 128 MB. The reason I think
> >> >>>> there are 7 mappers being used is that I am using a Pig script to
> >> >>>> generate the sequence file of Vectors, and that script generates 7
> >> >>>> reducers. I am not setting minSplitSize though.
> >> >>>>
> >> >>>> On Aug 16, 2011, at 12:15 PM, Dmitriy Lyubimov wrote:
> >> >>>>
> >> >>>>> Hm. This is not common at all.
> >> >>>>>
> >> >>>>> This error would surface if a map split can't accumulate at least
> >> >>>>> k+p rows.
> >> >>>>>
> >> >>>>> That's another requirement which usually is a non-issue -- any
> >> >>>>> precomputed split must contain at least k+p rows, which would
> >> >>>>> normally fail to hold only if the matrix is extra wide and dense, in
> >> >>>>> which case --minSplitSize must be used to avoid this.
> >> >>>>>
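> >> >>>>> In raw Hadoop terms, raising the split floor is one line of job
> >> >>>>> configuration (old mapred API; presumably the knob behind the
> >> >>>>> --minSplitSize option):
> >> >>>>>
> >> >>>>>   // e.g. force splits of at least 128 MB so that each mapper
> >> >>>>>   // accumulates enough rows
> >> >>>>>   conf.setLong("mapred.min.split.size", 128L * 1024 * 1024);
> >> >>>>>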
> >> >>>>> But in your case, the matrix is so small it must fit in one split.
> >> >>>>> Can you please verify how many mappers the job generates?
> >> >>>>>
> >> >>>>> if it's more than 1, then something fishy is going on with hadoop.
> >> >>>>> Otherwise, something is fishy with the input (it's either not 293
> >> >>>>> rows, or k+p is more than 293).
> >> >>>>>
> >> >>>>> -d
> >> >>>>>
> >> >>>>> On Tue, Aug 16, 2011 at 11:39 AM, Eshwaran Vijaya Kumar <evijayakumar@mozilla.com> wrote:
> >> >>>>>
> >> >>>>>>
> >> >>>>>> On Aug 16, 2011, at 10:35 AM, Dmitriy Lyubimov wrote:
> >> >>>>>>
> >> >>>>>>> This is unusually small input. What's the block size? Use large
> >> >>>>>>> blocks (such as 30,000). Block size can't be less than k+p.
> >> >>>>>>>
> >> >>>>>>
> >> >>>>>> I did set blockSize to 30,000 (as recommended in the PDF that you
> >> >>>>>> wrote up). As for the input size, the reason for it is that it is
> >> >>>>>> easier to test and verify the map-reduce pipeline against my
> >> >>>>>> in-memory implementation of the algorithm.
> >> >>>>>>
> >> >>>>>>> Can you please cut and paste the actual log of the qjob tasks
> >> >>>>>>> that failed? This is a front-end error, but the actual problem is
> >> >>>>>>> in the backend, ranging anywhere from hadoop problems to algorithm
> >> >>>>>>> problems.
> >> >>>>>> Sure. Refer to http://esh.pastebin.mozilla.org/1302059
> >> >>>>>> Input is a DistributedRowMatrix 293 X 236, k = 4, p = 40,
> >> >>>>>> numReduceTasks = 1, blockHeight = 30,000. Reducing p to 20 ensures
> >> >>>>>> the job goes through...
> >> >>>>>>
> >> >>>>>> Thanks again
> >> >>>>>> Esh
> >> >>>>>>
> >> >>>>>>
> >> >>>>>>> On Aug 16, 2011 9:44 AM, "Eshwaran Vijaya Kumar" <evijayakumar@mozilla.com> wrote:
> >> >>>>>>>> Thanks again. I am using 0.5 right now. We will try to patch it
> >> >>>>>>>> up and see how it performs. In the meantime, I am having another
> >> >>>>>>>> (possibly user?) error: I have a 260 X 230 matrix. I set k+p =
> >> >>>>>>>> 40, and it fails with
> >> >>>>>>>>
> >> >>>>>>>> Exception in thread "main" java.io.IOException: Q job unsuccessful.
> >> >>>>>>>>   at org.apache.mahout.math.hadoop.stochasticsvd.QJob.run(QJob.java:349)
> >> >>>>>>>>   at org.apache.mahout.math.hadoop.stochasticsvd.SSVDSolver.run(SSVDSolver.java:262)
> >> >>>>>>>>   at org.apache.mahout.math.hadoop.stochasticsvd.SSVDCli.run(SSVDCli.java:91)
> >> >>>>>>>>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
> >> >>>>>>>>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
> >> >>>>>>>>   at org.apache.mahout.math.hadoop.stochasticsvd.SSVDCli.main(SSVDCli.java:131)
> >> >>>>>>>>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> >> >>>>>>>>   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> >> >>>>>>>>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> >> >>>>>>>>   at java.lang.reflect.Method.invoke(Method.java:597)
> >> >>>>>>>>   at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
> >> >>>>>>>>   at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
> >> >>>>>>>>   at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:187)
> >> >>>>>>>>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> >> >>>>>>>>   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> >> >>>>>>>>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> >> >>>>>>>>   at java.lang.reflect.Method.invoke(Method.java:597)
> >> >>>>>>>>   at org.apache.hadoop.util.RunJar.main(RunJar.java:186)
> >> >>>>>>>>
> >> >>>>>>>>
> >> >>>>>>>> Suppose I set k+p to be much lower, say around 20: it works
> >> >>>>>>>> fine. Is it just that my dataset is of low rank, or is there
> >> >>>>>>>> something else going on here?
> >> >>>>>>>>
> >> >>>>>>>> Thanks
> >> >>>>>>>> Esh
> >> >>>>>>>>
> >> >>>>>>>>
> >> >>>>>>>>
> >> >>>>>>>> On Aug 14, 2011, at 1:47 PM, Dmitriy Lyubimov wrote:
> >> >>>>>>>>
> >> >>>>>>>>> ... i need to allow some time for review before pushing to the
> >> >>>>>>>>> ASF repo )..
> >> >>>>>>>>>
> >> >>>>>>>>>
> >> >>>>>>>>> On Sun, Aug 14, 2011 at 1:47 PM, Dmitriy Lyubimov <dlieu.7@gmail.com> wrote:
> >> >>>>>>>>>
> >> >>>>>>>>>> patch is posted as MAHOUT-786.
> >> >>>>>>>>>>
> >> >>>>>>>>>> also, 0.6 trunk with the patch applied is here:
> >> >>>>>>>>>> https://github.com/dlyubimov/mahout-commits/tree/MAHOUT-786
> >> >>>>>>>>>> I will commit to the ASF repo tomorrow night (even though it is
> >> >>>>>>>>>> extremely simple, i need
> >> >>>>>>>>>>
> >> >>>>>>>>>>
> >> >>>>>>>>>> On Sat, Aug 13, 2011 at 1:48 PM, Eshwaran Vijaya Kumar <evijayakumar@mozilla.com> wrote:
> >> >>>>>>>>>>
> >> >>>>>>>>>>> Dmitriy,
> >> >>>>>>>>>>> That sounds great. I eagerly await the patch.
> >> >>>>>>>>>>> Thanks
> >> >>>>>>>>>>> Esh
> >> >>>>>>>>>>> On Aug 13, 2011, at 1:37 PM, Dmitriy Lyubimov wrote:
> >> >>>>>>>>>>>
> >> >>>>>>>>>>>> Ok, i got u0 working.
> >> >>>>>>>>>>>>
> >> >>>>>>>>>>>> The problem is of course that something called the BBt job
> >> >>>>>>>>>>>> is to be coerced to have 1 reducer. (It's fine: every mapper
> >> >>>>>>>>>>>> won't yield more than an upper-triangular matrix of k+p x k+p
> >> >>>>>>>>>>>> geometry, so even if you end up having thousands of them, the
> >> >>>>>>>>>>>> reducer would sum them up just fine.)
> >> >>>>>>>>>>>>
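> >> >>>>>>>>>>>> That reduce side is just elementwise addition -- a sketch of
> >> >>>>>>>>>>>> the idea in plain Java (an illustration, not the actual
> >> >>>>>>>>>>>> Mahout reducer):
> >> >>>>>>>>>>>>
> >> >>>>>>>>>>>>   // Each mapper emits its (k+p) x (k+p) upper-triangular
> >> >>>>>>>>>>>>   // accumulator packed row-major; the lone reducer sums the
> >> >>>>>>>>>>>>   // packed arrays elementwise.
> >> >>>>>>>>>>>>   static double[] sumPartials(Iterable<double[]> partials) {
> >> >>>>>>>>>>>>     double[] sum = null;
> >> >>>>>>>>>>>>     for (double[] partial : partials) {      // one per mapper
> >> >>>>>>>>>>>>       if (sum == null) sum = partial.clone();
> >> >>>>>>>>>>>>       else for (int i = 0; i < sum.length; i++) sum[i] += partial[i];
> >> >>>>>>>>>>>>     }
> >> >>>>>>>>>>>>     return sum;
> >> >>>>>>>>>>>>   }
> >> >>>>>>>>>>>>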
> >> >>>>>>>>>>>> it worked before apparently because the configuration holds
> >> >>>>>>>>>>>> 1 reducer by default if not set explicitly; i am not quite
> >> >>>>>>>>>>>> sure if it's something in the hadoop mr client or a mahout
> >> >>>>>>>>>>>> change that now precludes it from working.
> >> >>>>>>>>>>>>
> >> >>>>>>>>>>>> anyway, i got a patch (really a one-liner) and an example
> >> >>>>>>>>>>>> equivalent to yours worked fine for me with 3 reducers.
> >> >>>>>>>>>>>>
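> >> >>>>>>>>>>>> In the old mapred API, pinning a job's reducer count is
> >> >>>>>>>>>>>> literally one line -- a sketch of the kind of change
> >> >>>>>>>>>>>> involved, not the actual MAHOUT-786 diff:
> >> >>>>>>>>>>>>
> >> >>>>>>>>>>>>   // the BBt partial products must be summed by one reducer
> >> >>>>>>>>>>>>   job.setNumReduceTasks(1);
> >> >>>>>>>>>>>>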
> >> >>>>>>>>>>>> Also, the tests request 3 reducers too, but the reason it
> >> >>>>>>>>>>>> works in tests and not in distributed mapred is that local
> >> >>>>>>>>>>>> mapred doesn't support multiple reducers. I investigated this
> >> >>>>>>>>>>>> issue before, and apparently there were a couple of patches
> >> >>>>>>>>>>>> floating around, but for some reason those changes did not
> >> >>>>>>>>>>>> take hold in cdh3u0.
> >> >>>>>>>>>>>>
> >> >>>>>>>>>>>> I will publish the patch in a jira shortly and will commit
> >> >>>>>>>>>>>> it Sunday-ish.
> >> >>>>>>>>>>>>
> >> >>>>>>>>>>>> Thanks.
> >> >>>>>>>>>>>> -d
> >> >>>>>>>>>>>>
> >> >>>>>>>>>>>>
> >> >>>>>>>>>>>> On Fri, Aug 5, 2011 at 7:06 PM, Eshwaran Vijaya Kumar <evijayakumar@mozilla.com> wrote:
> >> >>>>>>>>>>>>
> >> >>>>>>>>>>>>> OK. So to add more info to this: I tried setting the number
> >> >>>>>>>>>>>>> of reducers to 1, and now I don't get that particular error.
> >> >>>>>>>>>>>>> The singular values and the left and right singular vectors
> >> >>>>>>>>>>>>> appear to be correct, though (verified using Matlab).
> >> >>>>>>>>>>>>>
> >> >>>>>>>>>>>>> On Aug 5, 2011, at 1:55 PM, Eshwaran Vijaya Kumar wrote:
> >> >>>>>>>>>>>>>
> >> >>>>>>>>>>>>>> All,
> >> >>>>>>>>>>>>>> I am trying to test Stochastic SVD and am facing some
> >> >>>>>>>>>>>>>> errors where it would be great if someone could clarify
> >> >>>>>>>>>>>>>> what is going on. I am trying to feed the solver a
> >> >>>>>>>>>>>>>> DistributedRowMatrix with the exact same parameters that
> >> >>>>>>>>>>>>>> the test in LocalSSVDSolverSparseSequentialTest uses, i.e.,
> >> >>>>>>>>>>>>>> generate a 1000 X 100 DRM with SequentialSparseVectors and
> >> >>>>>>>>>>>>>> then ask for blockHeight 251, p (oversampling) = 60, k
> >> >>>>>>>>>>>>>> (rank) = 40. I get the following error:
> >> >>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>> Exception in thread "main" java.io.IOException: Unexpected
> >> >>>>>>>>>>>>>> overrun in upper triangular matrix files
> >> >>>>>>>>>>>>>>   at org.apache.mahout.math.hadoop.stochasticsvd.SSVDSolver.loadUpperTriangularMatrix(SSVDSolver.java:471)
> >> >>>>>>>>>>>>>>   at org.apache.mahout.math.hadoop.stochasticsvd.SSVDSolver.run(SSVDSolver.java:268)
> >> >>>>>>>>>>>>>>   at com.mozilla.SSVDCli.run(SSVDCli.java:89)
> >> >>>>>>>>>>>>>>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
> >> >>>>>>>>>>>>>>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
> >> >>>>>>>>>>>>>>   at com.mozilla.SSVDCli.main(SSVDCli.java:129)
> >> >>>>>>>>>>>>>>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> >> >>>>>>>>>>>>>>   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> >> >>>>>>>>>>>>>>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> >> >>>>>>>>>>>>>>   at java.lang.reflect.Method.invoke(Method.java:597)
> >> >>>>>>>>>>>>>>   at org.apache.hadoop.util.RunJar.main(RunJar.java:186)
> >> >>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>> Also, I am using CDH3 with Mahout recompiled to work with
> >> >>>>>>>>>>>>>> CDH3 jars.
> >> >>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>> Thanks
> >> >>>>>>>>>>>>>> Esh
> >> >>>>>>>>>>>>>>
