Yes, but Y probably has about as much data as A had. A is sparse, Y is
dense, but they have about the same number of rows. Y will have dozens to
hundreds of nonzero elements per row, which is probably comparable to A.
This means that dealing with Y'Y implicitly, at the cost of an extra pass
over Y, is a major loss. Better to go ahead and form Y'Y on the fly as Y is
formed, do an in-memory eigenvector computation, and then pass over Y to
get Q (in the original algorithm notation). Q is also quite large.
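A rough sketch of that flow, in numpy rather than MapReduce (the random Y here is just a stand-in for whatever rows the first pass emits; forming Q as Y V diag(λ)^(-1/2) from the eigendecomposition Y'Y = V diag(λ) V' is one standard way to orthonormalize on the second pass, and is an assumption on my part, not something spelled out in the thread):

```python
import numpy as np

rng = np.random.default_rng(0)
m, k = 1000, 10          # many rows, few columns: Y is tall and skinny

# First pass: accumulate Y'Y as a running outer-product sum while Y is formed.
yty = np.zeros((k, k))
rows = []
for _ in range(m):
    y_i = rng.standard_normal(k)   # stand-in for a freshly computed row of Y
    yty += np.outer(y_i, y_i)      # Y'Y = sum_i y_i y_i'
    rows.append(y_i)
Y = np.array(rows)

# In-memory eigenvector computation on the small k x k matrix.
evals, V = np.linalg.eigh(yty)

# Second pass over Y: Q = Y V diag(evals)^{-1/2} has orthonormal columns.
Q = Y @ V / np.sqrt(evals)

print(np.allclose(Q.T @ Q, np.eye(k)))
```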
Forming Q'A can be done in a merge step similar to the production of AR, to
get S, which is wide rather than tall like Y. As such, it is probably better
to compute S' and proceed as before. It isn't clear to me just offhand how
this might be done with row-wise access to A, but that would be a key step.
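One observation that may help (my sketch, assuming the rows of Q and A share the same row index): Q'A decomposes into a sum of per-row outer products, Q'A = sum_i q_i a_i', so each aligned row pair can be handled independently and the partial sums merged, much like the AR product:

```python
import numpy as np

rng = np.random.default_rng(1)
m, k, n = 50, 5, 8
Q = rng.standard_normal((m, k))
A = rng.standard_normal((m, n))   # dense stand-in for the sparse rows of A

# Q'A as a sum of outer products over aligned rows; each (q_i, a_i) pair
# contributes independently, so partial sums could be merged downstream.
S = np.zeros((k, n))
for q_i, a_i in zip(Q, A):
    S += np.outer(q_i, a_i)

print(np.allclose(S, Q.T @ A))
```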
On Mon, Sep 28, 2009 at 1:05 PM, Jake Mannix <jake.mannix@gmail.com> wrote:
> When decomposing y'y, I tend to use a similar trick to this, which is
> probably equivalent: to multiply matrix by vector (y'y)*u, just do the
> vector sum of y_i * (y_i.dot(u)) over the rows y_i of y (in which case
> you store y, but never bother to store y'y -- and just like the
> outer-product sum, the vector sum can be done partially in the combiners
> as well). Of course, y'y is much smaller than y, so avoiding saving y'y
> is not really that much of a win -- but for this algorithm, it seems you
> do probably need both y and y'y, in which case it might be nice to not
> have to ever really deal with y'y explicitly.
>
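For the record, the matrix-free product described in the quote looks like this in numpy (a random Y again stands in for the stored rows):

```python
import numpy as np

rng = np.random.default_rng(2)
m, k = 200, 6
Y = rng.standard_normal((m, k))
u = rng.standard_normal(k)

# (Y'Y)*u as a vector sum over rows: sum_i y_i * (y_i . u), never
# materializing Y'Y.  Each term depends on one row only, so the partial
# sums can be computed in combiners.
v = np.zeros(k)
for y_i in Y:
    v += y_i * y_i.dot(u)

print(np.allclose(v, Y.T @ (Y @ u)))
```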

Ted Dunning, CTO
DeepDyve
