Date: Mon, 28 Mar 2011 20:29:05 +0000 (UTC)
From: "Ted Dunning (JIRA)"
To: dev@mahout.apache.org
Message-ID: <1116838891.17449.1301344145983.JavaMail.tomcat@hel.zones.apache.org>
In-Reply-To: <2035372073.17043.1301332145857.JavaMail.tomcat@hel.zones.apache.org>
Subject: [jira] [Commented] (MAHOUT-638) Stochastic svd's is not handling well all cases of sparse vectors

[
https://issues.apache.org/jira/browse/MAHOUT-638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13012205#comment-13012205 ]

Ted Dunning commented on MAHOUT-638:
------------------------------------

I would drop most of the properties that specify versions. Defining a symbol that is used in one place doesn't really even save much.

> Stochastic svd's is not handling well all cases of sparse vectors
> ------------------------------------------------------------------
>
>                 Key: MAHOUT-638
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-638
>             Project: Mahout
>          Issue Type: Bug
>          Components: Math
>    Affects Versions: 0.5
>            Reporter: Dmitriy Lyubimov
>            Assignee: Dmitriy Lyubimov
>             Fix For: 0.5
>
>         Attachments: MAHOUT-638.patch
>
>
> The Mahout patch of the algorithm is not handling all types of sparse input efficiently. BtJob doesn't handle SequentialSparseVector in a way to pick only non-zero elements from initial input and QJob doesn't iterate over RandomAccessSparseVector correctly. With extremely sparse inputs (0.05% non-zero elements) that leads to a terrible inefficiency in the aforementioned jobs (QJob, BtJob).
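
For readers following the issue description above, here is a minimal, self-contained Java sketch of why scanning every index of a very sparse row is so much more expensive than visiting only the stored non-zero entries. It does not use Mahout's classes and is not the code from MAHOUT-638.patch: `SparseRow`, `dotDense`, and `dotNonZero` are hypothetical names invented for illustration, and the 100,000-column row with 50 non-zeros is an assumed input chosen to match the 0.05% density mentioned in the report.

```java
import java.util.Map;
import java.util.TreeMap;

public class SparseIterationSketch {

    /** Hypothetical stand-in for a sparse vector: stores only index -> non-zero value. */
    static final class SparseRow {
        final int cardinality;
        final TreeMap<Integer, Double> nonZeros = new TreeMap<>();

        SparseRow(int cardinality) {
            this.cardinality = cardinality;
        }

        void set(int index, double value) {
            if (value != 0.0) {
                nonZeros.put(index, value);
            } else {
                nonZeros.remove(index);
            }
        }

        double get(int index) {
            return nonZeros.getOrDefault(index, 0.0);
        }
    }

    /** Dot product that walks every index, zero or not; visits[0] counts elements touched. */
    static double dotDense(SparseRow row, double[] w, long[] visits) {
        double sum = 0.0;
        for (int i = 0; i < row.cardinality; i++) {
            visits[0]++;
            sum += row.get(i) * w[i];
        }
        return sum;
    }

    /** Dot product that walks only the stored non-zero entries. */
    static double dotNonZero(SparseRow row, double[] w, long[] visits) {
        double sum = 0.0;
        for (Map.Entry<Integer, Double> e : row.nonZeros.entrySet()) {
            visits[0]++;
            sum += e.getValue() * w[e.getKey()];
        }
        return sum;
    }

    public static void main(String[] args) {
        int cardinality = 100_000;      // assumed row width
        int nonZeroCount = 50;          // 0.05% of 100,000
        SparseRow row = new SparseRow(cardinality);
        for (int i = 0; i < nonZeroCount; i++) {
            row.set(i * (cardinality / nonZeroCount), 1.0);
        }
        double[] w = new double[cardinality];
        java.util.Arrays.fill(w, 2.0);

        long[] denseVisits = {0};
        long[] sparseVisits = {0};
        double dense = dotDense(row, w, denseVisits);
        double sparse = dotNonZero(row, w, sparseVisits);

        System.out.println("dense result  = " + dense + ", elements visited = " + denseVisits[0]);
        System.out.println("sparse result = " + sparse + ", elements visited = " + sparseVisits[0]);
    }
}
```

Both methods return the same dot product, but for this assumed input the dense-style loop touches 100,000 elements while the non-zero loop touches 50, a factor of 2,000 per row.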