mahout-dev mailing list archives

From Ted Dunning
Subject Re: Problem of dimensions
Date Mon, 14 Jul 2014 17:58:40 GMT
On Mon, Jul 14, 2014 at 9:47 AM, Pat Ferrel wrote:

> BTW that requires that drm.nrow be mutable. That is defined as immutable
> in the DSL and so will require a change to several traits. I’ve done this
> but am still trying to decide the cleanest.

Hmmm.... immutability has lots of virtues.  And changing nrows is just the
tip of the iceberg.  You also have to shuffle the rows to match the row
partitioning between the two matrices.
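
To make the immutability point concrete, here is a minimal sketch in plain
Python (not Mahout code; `SparseRows` and `with_nrow` are hypothetical names
made up for illustration).  It derives a taller copy of a matrix instead of
mutating nrow in place, and it does nothing about row partitioning, which
is the harder part.

```python
from dataclasses import dataclass, replace
from typing import Mapping, Tuple


@dataclass(frozen=True)
class SparseRows:
    """Hypothetical stand-in for a distributed row matrix (not a Mahout type)."""
    nrow: int
    ncol: int
    # row index -> tuple of (column index, value) pairs; absent rows are empty
    rows: Mapping[int, Tuple[Tuple[int, float], ...]]


def with_nrow(m: SparseRows, nrow: int) -> SparseRows:
    """Return a new matrix with a larger row count, sharing the same row data."""
    if nrow < m.nrow:
        raise ValueError("cannot shrink below the existing row count")
    # replace() builds a new frozen instance; the original is left untouched.
    return replace(m, nrow=nrow)


a = SparseRows(nrow=3, ncol=2, rows={0: ((0, 1.0),), 2: ((1, 1.0),)})
b = SparseRows(nrow=5, ncol=4, rows={4: ((3, 1.0),)})

# Consensus row count across both inputs.
consensus = max(a.nrow, b.nrow)
a_padded = with_nrow(a, consensus)
b_padded = with_nrow(b, consensus)
```

The original `a` still reports 3 rows, so anything holding a reference to
it sees no change; only code that asks for the padded copy sees 5.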

Or it requires more than one pass through the data.  Since you have to read
both matrices before you can deal with either, and since one matrix is
likely to be shuffled relative to the other, might it be better to either
do two read passes or pay the cost of shuffling the matrices after getting
a consensus view?  Note that the second read pass will have to do a shuffle
anyway, so the only saving from doing two passes is lower memory usage.
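
A rough sketch of the two-pass option in plain Python, assuming each input
can be re-read on demand.  The names `consensus_nrow` and `partition_rows`,
and the choice of range partitioning, are assumptions for illustration and
not anything in the Mahout DSL.

```python
from typing import Callable, Iterable, List, Tuple

Row = Tuple[int, list]                 # (row index, row vector)
Source = Callable[[], Iterable[Row]]   # calling it re-reads the input


def consensus_nrow(*sources: Source) -> int:
    """Pass 1: scan every input, keeping only the largest row index seen."""
    return max((r for src in sources for r, _ in src()), default=-1) + 1


def partition_rows(source: Source, nrow: int, nparts: int) -> List[List[Row]]:
    """Pass 2: re-read one input and place rows by a shared range partitioner."""
    size = max(1, -(-nrow // nparts))  # ceiling division, at least 1
    parts: List[List[Row]] = [[] for _ in range(nparts)]
    for r, vec in source():
        parts[r // size].append((r, vec))
    return parts


def read_a() -> Iterable[Row]:
    return iter([(0, [1.0]), (2, [1.0])])


def read_b() -> Iterable[Row]:
    return iter([(4, [1.0]), (1, [1.0])])


nrow = consensus_nrow(read_a, read_b)
parts_a = partition_rows(read_a, nrow, nparts=2)
parts_b = partition_rows(read_b, nrow, nparts=2)
```

Because both inputs go through the same partitioner with the same nrow, a
given row index lands in the same partition number for both matrices.
Pass 1 holds only a single integer, which is where the memory saving comes
from; the placement in pass 2 is still a shuffle.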


I think I remember you were addressing a shuffle problem in some of your
earlier work.  What did you conclude?
