[ https://issues.apache.org/jira/browse/MAHOUT1780?page=com.atlassian.jira.plugin.system.issuetabpanels:alltabpanel
]
Suneel Marthi updated MAHOUT-1780:

Description:
Capturing here the conversation on this subject:
{code}
Turns out that matrix view traversal (of dense matrices, anyway) is 4 times slower than regular
matrix traversal in the same direction, i.e.
Ad %*% Bd: (106.33333333333333,85.0)
Ad(r,::) %*% Bd: (356.0,328.0)
where r = 0 until Ad.nrow.
Investigated MatrixView: it reports the correct matrix flavor (the owner's), and the correct
algorithm is selected (the same as for the row above). Sure, MatrixView adds an indirection
(sometimes even a double indirection), but 4x?? It should not be that much different from
transpose view overhead, and transpose view overhead is very small in the tests (compared
to the rest of the cost).
The main difference seems to be that the algorithm over matrices ends up doing a dot over a
DenseVector and a DenseVector (even though the wrapper object is created inside the row
iterations), whereas the inefficient algorithm does the same over VectorView wrappers. I wonder
whether VectorView has not been equipped to pass on the flavors of its backing vector to the
vector-vector optimization.
Apparently the dot algorithm on a vector view goes to the in-core vector-vector optimization
framework (calls aggregate()), but DenseVector applies a custom iteration. Hence it may boil
down to experiments of avec dot bvec vs. avec(::) dot bvec(::).
{code}
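The suspected per-element indirection cost can be illustrated outside of Mahout with a minimal, self-contained sketch. The `DirectVector`/`ViewVector` classes below are hypothetical stand-ins, not Mahout's actual `DenseVector`/`VectorView`: the same dot product is computed once over a directly backed vector (analogous to DenseVector's custom iteration) and once through an index-remapping view wrapper (analogous to the row-view path).

```java
// Sketch of the suspected overhead: a dot product over a flat array vs.
// the same dot product routed through an index-remapping view wrapper.
// DirectVector/ViewVector are illustrative stand-ins, NOT Mahout classes.
public class ViewDotSketch {
    interface Vec {
        double get(int i);
        int size();
    }

    // Backing store accessed directly, like a dense vector's own iteration.
    static final class DirectVector implements Vec {
        final double[] data;
        DirectVector(double[] data) { this.data = data; }
        public double get(int i) { return data[i]; }
        public int size() { return data.length; }
    }

    // A window over another vector: every get() pays one extra level of
    // indirection, like a row view taken over a matrix.
    static final class ViewVector implements Vec {
        final Vec backing;
        final int offset, len;
        ViewVector(Vec backing, int offset, int len) {
            this.backing = backing; this.offset = offset; this.len = len;
        }
        public double get(int i) { return backing.get(offset + i); }
        public int size() { return len; }
    }

    // One generic dot loop; only the element-access path differs.
    static double dot(Vec a, Vec b) {
        double s = 0.0;
        for (int i = 0; i < a.size(); i++) s += a.get(i) * b.get(i);
        return s;
    }

    public static void main(String[] args) {
        int n = 1_000_000;
        double[] xs = new double[n], ys = new double[n];
        for (int i = 0; i < n; i++) { xs[i] = i; ys[i] = 1.0; }
        Vec ax = new DirectVector(xs), ay = new DirectVector(ys);
        // Same data, but each access goes through one more virtual call.
        Vec vx = new ViewVector(ax, 0, n), vy = new ViewVector(ay, 0, n);

        // Both paths must agree on the result; only the cost differs.
        System.out.println(dot(ax, ay) == dot(vx, vy));

        long t0 = System.nanoTime(); dot(ax, ay);
        long t1 = System.nanoTime(); dot(vx, vy);
        long t2 = System.nanoTime();
        System.out.println("direct ns: " + (t1 - t0) + ", view ns: " + (t2 - t1));
    }
}
```

This is only a cost model of the avec dot bvec vs. avec(::) dot bvec(::) experiment proposed above; a real measurement should use a proper harness with warmup so the JIT has a chance to devirtualize the calls.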
was:
Capturing here the conversation on this subject:
{quote}
Turns out that matrix view traversal (of dense matrices, anyway) is 4 times slower than regular
matrix traversal in the same direction, i.e.
Ad %*% Bd: (106.33333333333333,85.0)
Ad(r,::) %*% Bd: (356.0,328.0)
where r = 0 until Ad.nrow.
Investigated MatrixView: it reports the correct matrix flavor (the owner's), and the correct
algorithm is selected (the same as for the row above). Sure, MatrixView adds an indirection
(sometimes even a double indirection), but 4x?? It should not be that much different from
transpose view overhead, and transpose view overhead is very small in the tests (compared
to the rest of the cost).
The main difference seems to be that the algorithm over matrices ends up doing a dot over a
DenseVector and a DenseVector (even though the wrapper object is created inside the row
iterations), whereas the inefficient algorithm does the same over VectorView wrappers. I wonder
whether VectorView has not been equipped to pass on the flavors of its backing vector to the
vector-vector optimization.
Apparently the dot algorithm on a vector view goes to the in-core vector-vector optimization
framework (calls aggregate()), but DenseVector applies a custom iteration. Hence it may boil
down to experiments of avec dot bvec vs. avec(::) dot bvec(::).
{quote}
> Multithreaded Matrix Multiplication is slower than Singlethread variant
> 
>
> Key: MAHOUT-1780
> URL: https://issues.apache.org/jira/browse/MAHOUT1780
> Project: Mahout
> Issue Type: Bug
> Components: Math
> Affects Versions: 0.10.0, 0.10.1, 0.10.2, 0.11.0
> Reporter: Suneel Marthi
> Assignee: Dmitriy Lyubimov
> Priority: Critical
> Fix For: 0.12.0, 0.13.0
>
>
> Capturing here the conversation on this subject:
> {code}
> Turns out that matrix view traversal (of dense matrices, anyway) is 4 times slower than
> regular matrix traversal in the same direction, i.e.
> Ad %*% Bd: (106.33333333333333,85.0)
> Ad(r,::) %*% Bd: (356.0,328.0)
> where r = 0 until Ad.nrow.
> Investigated MatrixView: it reports the correct matrix flavor (the owner's), and the correct
> algorithm is selected (the same as for the row above). Sure, MatrixView adds an indirection
> (sometimes even a double indirection), but 4x?? It should not be that much different from
> transpose view overhead, and transpose view overhead is very small in the tests (compared
> to the rest of the cost).
> The main difference seems to be that the algorithm over matrices ends up doing a dot over a
> DenseVector and a DenseVector (even though the wrapper object is created inside the row
> iterations), whereas the inefficient algorithm does the same over VectorView wrappers. I
> wonder whether VectorView has not been equipped to pass on the flavors of its backing vector
> to the vector-vector optimization.
> Apparently the dot algorithm on a vector view goes to the in-core vector-vector optimization
> framework (calls aggregate()), but DenseVector applies a custom iteration. Hence it may boil
> down to experiments of avec dot bvec vs. avec(::) dot bvec(::).
> {code}

This message was sent by Atlassian JIRA
(v6.3.4#6332)
