mahout-issues mailing list archives

From "Pat Ferrel (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MAHOUT-2019) SparseRowMatrix assign ops user for loops instead of iterateNonZero and so can be optimized
Date Tue, 03 Oct 2017 20:57:00 GMT

    [ https://issues.apache.org/jira/browse/MAHOUT-2019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16190338#comment-16190338 ]

Pat Ferrel commented on MAHOUT-2019:
------------------------------------

This may be a non-issue: 

Trevor said in email:

{quote}Spark is included via a Maven classifier.

The sbt line should be

libraryDependencies += "org.apache.mahout" % "mahout-spark_2.11" % "0.13.1-SNAPSHOT" classifier "spark_2.1"


{quote}

> SparseRowMatrix assign ops user for loops instead of iterateNonZero and so can be optimized
> -------------------------------------------------------------------------------------------
>
>                 Key: MAHOUT-2019
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-2019
>             Project: Mahout
>          Issue Type: Bug
>          Components: Math
>    Affects Versions: 0.13.0
>            Reporter: Pat Ferrel
>            Assignee: Pat Ferrel
>             Fix For: 0.13.1
>
>
> DRMs get blockified into SparseRowMatrix instances if the density is low. But SRM inherits
> the implementation of methods like "assign" from AbstractMatrix, which uses nested for loops
> to traverse rows. For multiplying 2 matrices that are extremely sparse, the kind of data you
> see in collaborative filtering, this is extremely wasteful of execution time. Better to use
> a sparse vector's iterateNonZero Iterator for some function types.
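
The difference described above can be illustrated with a minimal, self-contained Java sketch. It does not use Mahout's classes: `SparseRow`, `assignDense`, `assignNonZero`, and `example` are hypothetical names invented for this illustration, and the `f(0) == 0` check is an assumed reading of "some function types" (functions that leave zero cells at zero). Each method returns the number of cells it visited, so the cost of the two traversals can be compared directly.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.function.DoubleUnaryOperator;

/** Illustration only: a hypothetical stand-in, not Mahout's SparseRowMatrix or Vector API. */
public class SparseAssignSketch {

    /** One sparse row: only non-zero cells are stored, keyed by column index. */
    static final class SparseRow {
        final int size;
        final Map<Integer, Double> nonZeros = new TreeMap<>();

        SparseRow(int size) { this.size = size; }

        double get(int col) { return nonZeros.getOrDefault(col, 0.0); }

        void set(int col, double v) {
            if (v == 0.0) nonZeros.remove(col); else nonZeros.put(col, v);
        }
    }

    /** Dense-style assign: nested loops over every row and every column. */
    static long assignDense(SparseRow[] rows, DoubleUnaryOperator f) {
        long visited = 0;
        for (SparseRow row : rows) {
            for (int col = 0; col < row.size; col++) {
                row.set(col, f.applyAsDouble(row.get(col)));
                visited++;
            }
        }
        return visited;
    }

    /** Sparse-style assign: touches stored cells only; falls back when f(0) != 0. */
    static long assignNonZero(SparseRow[] rows, DoubleUnaryOperator f) {
        if (f.applyAsDouble(0.0) != 0.0) {
            return assignDense(rows, f); // zero cells would change, so all must be visited
        }
        long visited = 0;
        for (SparseRow row : rows) {
            row.nonZeros.replaceAll((col, v) -> f.applyAsDouble(v));
            visited += row.nonZeros.size();
            row.nonZeros.values().removeIf(v -> v == 0.0); // drop cells that became zero
        }
        return visited;
    }

    /** Two rows of 1000 columns holding three non-zero cells in total. */
    static SparseRow[] example() {
        SparseRow[] rows = { new SparseRow(1000), new SparseRow(1000) };
        rows[0].set(3, 1.5);
        rows[0].set(700, 2.0);
        rows[1].set(42, 4.0);
        return rows;
    }

    public static void main(String[] args) {
        DoubleUnaryOperator twice = x -> 2 * x;
        System.out.println(assignDense(example(), twice));   // prints 2000
        System.out.println(assignNonZero(example(), twice)); // prints 3
    }
}
```

Both calls leave the rows with the same values; only the number of visited cells differs. A function such as `x -> x + 1` changes zero cells, so the sketch falls back to the dense traversal for it.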



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
