mahout-dev mailing list archives

From "Sean Owen (JIRA)" <j...@apache.org>
Subject [jira] Commented: (MAHOUT-593) Backport of Stochastic SVD patch (Mahout-376) to hadoop 0.20 to ensure compatibility with current Mahout dependencies.
Date Tue, 01 Mar 2011 22:53:37 GMT

    [ https://issues.apache.org/jira/browse/MAHOUT-593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13001165#comment-13001165
] 

Sean Owen commented on MAHOUT-593:
----------------------------------

You can't recover from failing to close a stream, but if you want to fail your processing
if a stream can't be closed, OK. Is that wise?

I don't understand the point about releasing a ref or ordering -- it is not different in this
respect from the existing method. It iterates in order and keeps no references. There is also
nothing that close()es more than once in either version -- and neither version prevents a
caller from doing something like that.
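
For illustration, the two policies under discussion (ignore a failed close() versus fail the processing) can be sketched as below. This is only a minimal sketch, not the code in the patch: the class and method names (CloseAllSketch, closeAllQuietly, closeAllOrFail) are hypothetical. Both variants iterate in order, keep no references to the streams, and call close() once per element.

```java
import java.io.Closeable;
import java.io.IOException;

// Hypothetical helper for illustration only; not taken from the MAHOUT-593 patch.
final class CloseAllSketch {

  private CloseAllSketch() {
  }

  /**
   * Closes every element in iteration order and swallows failures.
   * Returns the number of close() calls that threw, so a caller can still log it.
   */
  static int closeAllQuietly(Iterable<? extends Closeable> closeables) {
    int failures = 0;
    for (Closeable c : closeables) {
      if (c == null) {
        continue;
      }
      try {
        c.close();
      } catch (IOException e) {
        failures++; // nothing to recover here; just count it
      }
    }
    return failures;
  }

  /**
   * Closes every element in iteration order, attempting all of them,
   * then rethrows the first IOException so the surrounding processing fails.
   */
  static void closeAllOrFail(Iterable<? extends Closeable> closeables) throws IOException {
    IOException first = null;
    for (Closeable c : closeables) {
      if (c == null) {
        continue;
      }
      try {
        c.close();
      } catch (IOException e) {
        if (first == null) {
          first = e; // remember the first failure, keep closing the rest
        }
      }
    }
    if (first != null) {
      throw first;
    }
  }
}
```

Neither variant guards against a caller closing the same stream again afterwards, which matches the point above that neither version prevents that.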

For those reasons I'm still puzzled by this piece of code, but I think it's minor enough that I won't
push further on it. If it makes sense to you and nobody else has thoughts, it's fine.

> Backport of Stochastic SVD patch (Mahout-376) to hadoop 0.20 to ensure compatibility
with current Mahout dependencies.
> ----------------------------------------------------------------------------------------------------------------------
>
>                 Key: MAHOUT-593
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-593
>             Project: Mahout
>          Issue Type: New Feature
>          Components: Math
>    Affects Versions: 0.4
>            Reporter: Dmitriy Lyubimov
>             Fix For: 0.5
>
>         Attachments: MAHOUT-593.patch.gz, MAHOUT-593.patch.gz, MAHOUT-593.patch.gz, SSVD-givens-CLI.pdf,
ssvdclassdiag.png
>
>
> Current Mahout-376 patch requires 'new' hadoop API.  Certain elements of that API (namely,
multiple outputs) are not available in standard hadoop 0.20.2 release. As such, that may work
only with either CDH or 0.21 distributions. 
>  In order to bring it into sync with current Mahout dependencies, a backport of the patch
to 'old' API is needed. 
> Also, some work is needed to resolve math dependencies. Existing patch relies on apache
commons-math 2.1 for eigen decomposition of small matrices. This dependency is not currently
set up in the mahout core. So, certain snippets of code are either required to go to mahout-math
or use Colt eigen decomposition (last time I tried, my results were mixed with that one. It
seems to produce results inconsistent with those from mahout-math eigensolver, at the very
least, it doesn't produce singular values in sorted order).
> So this patch is mainly moving some Mahout-376 code around.


        
