Mailing-List: contact dev-help@mahout.apache.org; run by ezmlm
Reply-To: dev@mahout.apache.org
Date: Tue, 1 Mar 2011 22:53:37 +0000 (UTC)
From: "Sean Owen (JIRA)"
To: dev@mahout.apache.org
Message-ID: <1289712606.6358.1299020017073.JavaMail.tomcat@hel.zones.apache.org>
In-Reply-To: <9737737.176311295917424725.JavaMail.jira@thor>
Subject: [jira] Commented: (MAHOUT-593) Backport of Stochastic SVD patch (Mahout-376) to hadoop 0.20 to ensure compatibility with current Mahout dependencies.

    [ https://issues.apache.org/jira/browse/MAHOUT-593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13001165#comment-13001165 ]

Sean Owen commented on MAHOUT-593:
----------------------------------

You can't recover from failing to close a stream, but if you want your processing to fail when a stream can't be closed, OK. Is that wise?

I don't understand the point about releasing a reference or about ordering -- the new method is no different from the existing one in this respect. It iterates in order and keeps no references. Nothing calls close() more than once in either version, and neither version prevents a caller from doing so.

For those reasons I'm still puzzled by this piece of code, but I think it's minor enough that I won't push further on it. If it makes sense to you and nobody else has thoughts, it's fine.

> Backport of Stochastic SVD patch (Mahout-376) to hadoop 0.20 to ensure compatibility with current Mahout dependencies.
> ----------------------------------------------------------------------------------------------------------------------
>
> Key: MAHOUT-593
> URL: https://issues.apache.org/jira/browse/MAHOUT-593
> Project: Mahout
> Issue Type: New Feature
> Components: Math
> Affects Versions: 0.4
> Reporter: Dmitriy Lyubimov
> Fix For: 0.5
>
> Attachments: MAHOUT-593.patch.gz, MAHOUT-593.patch.gz, MAHOUT-593.patch.gz, SSVD-givens-CLI.pdf, ssvdclassdiag.png
>
>
> The current Mahout-376 patch requires the 'new' hadoop API. Certain elements of that API (namely, multiple outputs) are not available in the standard hadoop 0.20.2 release. As such, it may work only with either CDH or 0.21 distributions.
> In order to bring it into sync with current Mahout dependencies, a backport of the patch to the 'old' API is needed.
> Also, some work is needed to resolve math dependencies.
> The existing patch relies on apache commons-math 2.1 for eigen decomposition of small matrices. This dependency is not currently set up in the mahout core. So, certain snippets of code either have to move to mahout-math or use the Colt eigen decomposition (last time I tried, my results with that one were mixed. It seems to produce results inconsistent with those from the mahout-math eigensolver; at the very least, it doesn't produce singular values in sorted order).
> So this patch is mainly moving some Mahout-376 code around.
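
To make the trade-off in the comment concrete (swallowing a failed close() versus failing the processing because of it), here is a minimal Java sketch. It is not the code from the MAHOUT-593 patch: the class and method names (CloseAllSketch, closeQuietly, closeOrFail, Recording) are hypothetical and exist only for illustration. Both variants iterate in order, keep no references to the resources, and attempt every close() even after one of them fails.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; not taken from the MAHOUT-593 patch.
final class CloseAllSketch {

  /** Closes each resource in iteration order, swallowing failures. Returns the failure count. */
  static int closeQuietly(Iterable<? extends Closeable> resources) {
    int failures = 0;
    for (Closeable c : resources) {
      try {
        c.close();
      } catch (IOException e) {
        failures++; // nothing to recover; a real caller would probably just log this
      }
    }
    return failures;
  }

  /** Closes each resource in iteration order, then rethrows the first failure, if any. */
  static void closeOrFail(Iterable<? extends Closeable> resources) throws IOException {
    IOException first = null;
    for (Closeable c : resources) {
      try {
        c.close();
      } catch (IOException e) {
        if (first == null) {
          first = e; // remember only the first failure, keep closing the rest
        }
      }
    }
    if (first != null) {
      throw first;
    }
  }

  /** Hypothetical test resource that records the order in which close() is called. */
  static final class Recording implements Closeable {
    private final String name;
    private final boolean failOnClose;
    private final List<String> log;

    Recording(String name, boolean failOnClose, List<String> log) {
      this.name = name;
      this.failOnClose = failOnClose;
      this.log = log;
    }

    @Override
    public void close() throws IOException {
      log.add(name);
      if (failOnClose) {
        throw new IOException("cannot close " + name);
      }
    }
  }

  public static void main(String[] args) {
    List<String> log = new ArrayList<String>();
    List<Recording> resources = Arrays.asList(
        new Recording("a", false, log),
        new Recording("b", true, log),
        new Recording("c", false, log));

    System.out.println("failures swallowed: " + closeQuietly(resources));
    try {
      closeOrFail(resources);
    } catch (IOException e) {
      System.out.println("processing failed: " + e.getMessage());
    }
    System.out.println("close order: " + log);
  }
}
```

Neither variant guards against a caller invoking it twice on the same resources, which matches the observation in the comment that neither version prevents that.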