mahout-issues mailing list archives

From "Hudson (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MAHOUT-2019) SparseRowMatrix assign ops user for loops instead of iterateNonZero and so can be optimized
Date Sat, 18 Nov 2017 20:52:00 GMT

    [ https://issues.apache.org/jira/browse/MAHOUT-2019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16258207#comment-16258207 ]

Hudson commented on MAHOUT-2019:
--------------------------------

FAILURE: Integrated in Jenkins build Mahout-Quality #3511 (See [https://builds.apache.org/job/Mahout-Quality/3511/])
MAHOUT-2019 SparkRow Matrix Speedup and fixing change to scala 2.11 made (pat: rev 800a9ed6d7e015aa82b9eb7624bb441b71a8f397)
* (edit) math/src/main/java/org/apache/mahout/math/SparseRowMatrix.java


> SparseRowMatrix assign ops user for loops instead of iterateNonZero and so can be optimized
> -------------------------------------------------------------------------------------------
>
>                 Key: MAHOUT-2019
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-2019
>             Project: Mahout
>          Issue Type: Bug
>          Components: Math
>    Affects Versions: 0.13.0
>            Reporter: Pat Ferrel
>            Assignee: Pat Ferrel
>             Fix For: 0.13.1
>
>
> DRMs get blockified into SparseRowMatrix instances if the density is low. But SRM inherits
> the implementation of methods like "assign" from AbstractMatrix, which uses nested for loops
> to traverse rows. For multiplying two matrices that are extremely sparse, the kind of data you
> see in collaborative filtering, this is extremely wasteful of execution time. Better to use
> a sparse vector's iterateNonZero Iterator for some function types.
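
As a rough illustration of the point made in the description, and not the actual Mahout patch, the sketch below contrasts a nested-loop "assign" with one that visits only stored non-zero entries. Everything in it is hypothetical: SparseAssignSketch, SparseRow, assignDense and assignNonZeros are stand-ins written for this example and are not Mahout classes or methods. It also assumes that the "function types" that qualify are those mapping zero to zero, since skipping zero entries is only safe in that case.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.function.DoubleUnaryOperator;

public class SparseAssignSketch {

    /** Hypothetical sparse row: column index -> stored value, plus the logical width. */
    static final class SparseRow {
        final int size;
        final Map<Integer, Double> nonZeros = new TreeMap<>();

        SparseRow(int size) { this.size = size; }

        double get(int col) { return nonZeros.getOrDefault(col, 0.0); }

        void set(int col, double value) {
            if (value == 0.0) {
                nonZeros.remove(col);   // do not store explicit zeros
            } else {
                nonZeros.put(col, value);
            }
        }
    }

    /** Dense-style traversal: applies f to every cell. Returns the number of calls to f. */
    static long assignDense(SparseRow[] rows, DoubleUnaryOperator f) {
        long calls = 0;
        for (SparseRow row : rows) {
            for (int col = 0; col < row.size; col++) {
                row.set(col, f.applyAsDouble(row.get(col)));
                calls++;
            }
        }
        return calls;
    }

    /**
     * Sparse traversal: applies f only to stored entries.
     * Assumption: f(0) == 0, otherwise the skipped cells would end up wrong.
     */
    static long assignNonZeros(SparseRow[] rows, DoubleUnaryOperator f) {
        long calls = 0;
        for (SparseRow row : rows) {
            for (Map.Entry<Integer, Double> entry : row.nonZeros.entrySet()) {
                entry.setValue(f.applyAsDouble(entry.getValue()));
                calls++;
            }
        }
        return calls;
    }

    /** Two rows of width 1000 holding three non-zero values in total. */
    static SparseRow[] example() {
        SparseRow[] rows = { new SparseRow(1000), new SparseRow(1000) };
        rows[0].set(3, 1.5);
        rows[0].set(700, 2.0);
        rows[1].set(42, 4.0);
        return rows;
    }

    public static void main(String[] args) {
        DoubleUnaryOperator timesTwo = x -> x * 2.0; // maps 0 to 0, so skipping zeros is safe
        System.out.println("dense calls:  " + assignDense(example(), timesTwo));
        System.out.println("sparse calls: " + assignNonZeros(example(), timesTwo));
    }
}
```

With this toy data the dense version evaluates the function 2000 times and the sparse version 3 times, while both leave the rows with the same values. A function such as x -> x + 1 would not qualify under the stated assumption, because it turns every zero into a non-zero.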



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
