Date: Tue, 3 Oct 2017 20:57:00 +0000 (UTC)
From: "Pat Ferrel (JIRA)"
Reply-To: dev@mahout.apache.org
To: issues@mahout.apache.org
Subject: [jira] [Commented] (MAHOUT-2019) SparseRowMatrix assign ops use for loops instead of iterateNonZero and so can be optimized

    [ https://issues.apache.org/jira/browse/MAHOUT-2019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16190338#comment-16190338 ]

Pat Ferrel commented on MAHOUT-2019:
------------------------------------

This may be a non-issue. Trevor said in email:
{quote}The spark is included via maven classifier- the sbt line should be
libraryDependencies += "org.apache.mahout" % "mahout-spark_2.11" % "0.13.1-SNAPSHOT" classifier "spark_2.1"
{quote}

> SparseRowMatrix assign ops use for loops instead of iterateNonZero and so can be optimized
> ------------------------------------------------------------------------------------------
>
>                 Key: MAHOUT-2019
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-2019
>             Project: Mahout
>          Issue Type: Bug
>          Components: Math
>    Affects Versions: 0.13.0
>            Reporter: Pat Ferrel
>            Assignee: Pat Ferrel
>             Fix For: 0.13.1
>
>
> DRMs get blockified into SparseRowMatrix instances if the density is low.
> But SRM inherits the implementation of methods like "assign" from AbstractMatrix, which uses nested for loops to traverse rows. When multiplying two matrices that are extremely sparse, the kind of data you see in collaborative filtering, this is extremely wasteful of execution time. It is better to use a sparse vector's iterateNonZero Iterator for some function types.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
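
To illustrate the point in the issue description, here is a minimal Java sketch contrasting a nested-loop assign with one that walks only the stored non-zero entries. It does not use Mahout's classes: SparseAssignSketch, assignDense, assignNonZero and cellVisits are hypothetical names made up for this example, and each row is modeled as a TreeMap from column index to value. The sparse path is only correct for functions where f(0, y) == 0 for every y (element-wise multiplication, for instance), which is presumably why the description says "some function types".

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.DoubleBinaryOperator;

// Hypothetical stand-in for a sparse row matrix; NOT Mahout's SparseRowMatrix.
final class SparseAssignSketch {
    private final int numRows;
    private final int numCols;
    // One map per row: column index -> stored non-zero value.
    private final List<Map<Integer, Double>> rowData = new ArrayList<>();
    // Counts how many cells an assign call touched, for comparison only.
    long cellVisits = 0;

    SparseAssignSketch(int numRows, int numCols) {
        this.numRows = numRows;
        this.numCols = numCols;
        for (int r = 0; r < numRows; r++) {
            rowData.add(new TreeMap<>());
        }
    }

    double get(int r, int c) {
        return rowData.get(r).getOrDefault(c, 0.0);
    }

    void set(int r, int c, double v) {
        if (v == 0.0) {
            rowData.get(r).remove(c); // zeros are never stored
        } else {
            rowData.get(r).put(c, v);
        }
    }

    // Nested for loops: visits every cell, whether or not it holds a value.
    void assignDense(SparseAssignSketch other, DoubleBinaryOperator f) {
        for (int r = 0; r < numRows; r++) {
            for (int c = 0; c < numCols; c++) {
                cellVisits++;
                set(r, c, f.applyAsDouble(get(r, c), other.get(r, c)));
            }
        }
    }

    // Visits only the stored non-zeros of this matrix.
    // Assumption: f(0, y) == 0 for all y, so skipped cells stay zero.
    void assignNonZero(SparseAssignSketch other, DoubleBinaryOperator f) {
        for (int r = 0; r < numRows; r++) {
            Iterator<Map.Entry<Integer, Double>> it = rowData.get(r).entrySet().iterator();
            while (it.hasNext()) {
                Map.Entry<Integer, Double> e = it.next();
                cellVisits++;
                double v = f.applyAsDouble(e.getValue(), other.get(r, e.getKey()));
                if (v == 0.0) {
                    it.remove(); // result became zero, drop the entry
                } else {
                    e.setValue(v);
                }
            }
        }
    }

    public static void main(String[] args) {
        SparseAssignSketch a = new SparseAssignSketch(3, 4);
        a.set(0, 1, 2.0);
        a.set(2, 3, 3.0);
        SparseAssignSketch b = new SparseAssignSketch(3, 4);
        b.set(0, 1, 5.0);
        a.assignNonZero(b, (x, y) -> x * y);
        System.out.println("cells visited: " + a.cellVisits);
    }
}
```

In this toy model the dense path touches numRows * numCols cells per call, while the sparse path touches one cell per stored entry. For functions that can turn a zero into a non-zero (addition, for example), the non-zero walk over one operand alone is not sufficient.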