mahout-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Dmitriy Lyubimov <>
Subject Re: Structure-based a %*% b optimization results.
Date Sat, 18 Apr 2015 00:31:22 GMT
i mean, before we consider hardware based implementations for bigger
matrices, this change seems like a very easy win

On Fri, Apr 17, 2015 at 5:26 PM, Dmitriy Lyubimov <> wrote:

> Spent an hour on this today.
> What i am doing: simply reimplementing pairwise dot-product algorithm in
> stock dense matrix times().
> However, equipping every matrix with structure "flavor" (i.e. dense(...)
> reports row-wise ,  and dense(...).t reports column wise, dense().t.t
> reports row-wise again, etc.)
> Next, wrote a binary operator that switches on combination of operand
> orientation and flips misaligned operand(s) (if any) to match most "speedy"
> orientation RW-CW. here are result for 300x300 dense matrix pairs:
> Ad %*% Bd: (107.125,46.375)
> Ad' %*% Bd: (206.475,39.325)
> Ad %*% Bd': (37.2,42.65)
> Ad' %*% Bd': (100.95,38.025)
> Ad'' %*% Bd'': (120.125,43.3)
> these results are for transpose combinations of original 300x300 dense
> random matrices, averaged over 40 runs (so standard error should be well
> controlled), in ms. First number is stock times() application (i.e. what
> we'd do with %*% operator now), and second number is ms with rewriting
> matrices into RW-CW orientation.
> For example, AB reorients B only, just like A''B'', AB' reorients nothing,
> and worst case A'B re-orients both (I also tried to run sum of outer
> products for A'B case without re-orientation -- apparently L1 misses far
> outweigh costs of reorientation there, i got very bad results there for
> outer product sum).
> as we can see, stock times() version does pretty bad for even dense
> operands for any orientation except for the optimal.
> Given that, i am inclined just to add orientation-driven structure
> optimization here and replace all stock calls with just orientation
> adjustment.
> Of course i will need to append this matrix with sparse and sparse row
> matrix combination (quite a bit of those i guess) and see what happens
> compared to stock sparse multiplications.
> But even that seems like a big win to me (basically, just doing
> reorientation optimization seems to give 3x speed up on average in
> matrix-matrix multiplication in 3 cases out of 4, and ties in 1 case).

  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message