Also, note that parallelizeEmpty() creates nothing but a standard
Int-keyed matrix with all rows indexed accordingly. That means it cannot be
rbind-ed with something that is not Int-keyed (though perhaps it could be
bound after an intermediate mapBlock that re-keys the rows).
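
To make the key-type constraint concrete, here is a minimal sketch in plain Python. It is a toy model, not Mahout code: ToyDrm, parallelize_empty, rbind and rekey_to_int are hypothetical names invented for illustration, and rekey_to_int only stands in for the intermediate re-keying step mentioned above. It also shows the rbind idea quoted below: binding an empty Int-keyed block grows nrow without touching the stored rows.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyDrm:
    """Toy stand-in for a distributed row matrix (hypothetical, not Mahout)."""
    rows: dict  # row key -> {column index: value}; missing keys are not stored
    nrow: int   # declared row cardinality, may exceed len(rows)
    ncol: int


def parallelize_empty(nrow, ncol):
    # An empty Int-keyed matrix: declared shape, no stored rows.
    return ToyDrm(rows={}, nrow=nrow, ncol=ncol)


def is_int_keyed(m):
    return all(isinstance(k, int) for k in m.rows)


def rbind(a, b):
    # Row-bind is only defined here for Int-keyed operands, because the
    # keys of b have to be shifted past the last row of a.
    if not (is_int_keyed(a) and is_int_keyed(b)):
        raise TypeError("rbind in this sketch needs Int-keyed operands")
    if a.ncol != b.ncol:
        raise ValueError("column counts differ")
    shifted = {k + a.nrow: row for k, row in b.rows.items()}
    return ToyDrm({**a.rows, **shifted}, a.nrow + b.nrow, a.ncol)


def rekey_to_int(m):
    # Assumed re-keying step: replace arbitrary keys with 0..n-1 in sorted order.
    ordered = sorted(m.rows)
    return ToyDrm({i: m.rows[k] for i, k in enumerate(ordered)}, m.nrow, m.ncol)


a = ToyDrm({0: {0: 1.0}, 2: {1: 2.0}}, nrow=3, ncol=2)
grown = rbind(a, parallelize_empty(2, 2))   # nrow becomes 5, rows unchanged

s = ToyDrm({"u1": {0: 1.0}}, nrow=1, ncol=2)
bound = rbind(rekey_to_int(s), parallelize_empty(1, 2))  # works after re-keying
```

Calling rbind(s, parallelize_empty(1, 2)) directly raises TypeError in this model, which mirrors the restriction described above.
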
On Mon, Jul 21, 2014 at 1:42 PM, Pat Ferrel <pat.ferrel@gmail.com> wrote:
> Thank you! This is what I understood and I’m doing a little dance for joy
> (in my mind). This makes sparseness all-encompassing, at least for
> sequential Int keys.
>
> However Anand has found several math ops that don’t work.
>
> I’ll write up a few tests for transpose and multiply at least since these
> are used in cooccurrence. And I’ll be happy to implement something that
> changes nrow in an immutable R-like way. Anand and Ted suggested rbind
> of drmParallelizeEmpty with added row cardinality. This would really only
> change nrow of the resulting CheckPointedDrm, it would not alter the rdd.
>
>
>
>
> On Jul 21, 2014, at 1:12 PM, Dmitriy Lyubimov <dlieu.7@gmail.com> wrote:
>
> "missing" rows are only valid in the context of Int-keyed matrices and
> physical transposition operations. These are the only ones that may depend
> on it, since obviously one can't define "missingness" for something that is
> String-keyed.
>
> So the only thing that may fail because of the "missingness" effect is
> probably the physical transposition operator (we don't have a test for such
> a case, so maybe there's a bug there). Everything else should work.
>
> And no, I suppose it is ok to have "missing" rows even in the case of
> Int-keyed matrices.
>
> There's one thing that you should probably be aware of in this context
> though: many algorithms don't survive empty (rowless) partitions, in
> whatever way they may come to be. Other than that, I don't feel every row
> must be present, even if there's an implied order of the rows.
>
>
> On Mon, Jul 21, 2014 at 12:22 PM, Pat Ferrel <pat.ferrel@gmail.com> wrote:
>
>> I appreciate that you can’t read all the back and forth Dmitriy hence the
>> private email. Please disregard all other code or talk in the thread for
>> the moment.
>>
>> Does a DRM need to have a row for every sequential row key from 0 to
>> nrow-1? Can there be missing row keys in the sequence and will they be
>> treated as {}, an all-zero row? In terms of the rdd in the CheckpointedDrm
>> these “missing” rows will not have a corresponding n => {}, they will just
>> not exist in the rdd. This will happen when a row is “missing” from the DRM
>> but the true cardinality is known and passed in to the CheckpointedDrm
>> constructor.
>>
>> Will R-like operations on these matrices work correctly? Will A.t %*% A
>> and A + 1 work correctly?
>>
>> The answer is no, but _should_ they work correctly?
>>
>>
>
>
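
The question at the bottom of the thread (do A.t %*% A and A + 1 behave when some Int row keys are simply absent?) can be explored with a toy model in plain Python. This is only a sketch under stated assumptions: the matrix is a dict of stored rows plus a declared nrow, and gram, plus_scalar_stored_only, plus_scalar_filled and transpose are hypothetical helpers, not Mahout operators. It says nothing about what the real operators do; it only shows why the three operations differ in how much they depend on the declared cardinality.

```python
def gram(rows, ncol):
    # A' A as a sum of outer products of stored rows; an absent (all-zero)
    # row would contribute nothing, so skipping it is harmless.
    out = [[0.0] * ncol for _ in range(ncol)]
    for row in rows.values():
        for i, x in row.items():
            for j, y in row.items():
                out[i][j] += x * y
    return out


def plus_scalar_stored_only(rows, ncol, s):
    # Naive A + s that visits only stored rows: absent rows never get s added.
    return {k: [row.get(j, 0.0) + s for j in range(ncol)]
            for k, row in rows.items()}


def plus_scalar_filled(rows, nrow, ncol, s):
    # A + s that treats every key in 0..nrow-1 as present (absent = zero row).
    return {k: [rows.get(k, {}).get(j, 0.0) + s for j in range(ncol)]
            for k in range(nrow)}


def transpose(rows, nrow, ncol):
    # Physical transpose; the new column count must come from the declared
    # nrow, because the largest stored key can be smaller than nrow - 1.
    t = {}
    for k, row in rows.items():
        for j, x in row.items():
            t.setdefault(j, {})[k] = x
    return t, ncol, nrow


# Declared shape 4 x 2, but rows 1 and 3 are missing from storage.
A = {0: {0: 1.0, 1: 2.0}, 2: {1: 3.0}}
NROW, NCOL = 4, 2

ata = gram(A, NCOL)
naive = plus_scalar_stored_only(A, NCOL, 1.0)
filled = plus_scalar_filled(A, NROW, NCOL, 1.0)
At, at_nrow, at_ncol = transpose(A, NROW, NCOL)
```

In this model the Gram matrix is unaffected by the missing rows, the naive A + 1 silently returns two rows instead of four, and the transpose is only shaped correctly because the declared nrow is passed in rather than inferred from the stored keys.
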
