mahout-dev mailing list archives

From "Dmitriy Lyubimov (JIRA)" <j...@apache.org>
Subject [jira] Issue Comment Edited: (MAHOUT-593) Backport of Stochastic SVD patch (Mahout-376) to hadoop 0.20 to ensure compatibility with current Mahout dependencies.
Date Wed, 02 Mar 2011 01:06:37 GMT

    [ https://issues.apache.org/jira/browse/MAHOUT-593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13001215#comment-13001215 ]

Dmitriy Lyubimov edited comment on MAHOUT-593 at 3/2/11 1:05 AM:
-----------------------------------------------------------------

It's just housekeeping of the resources. It's not so much the case here, but in other situations
you might have a resource tree open, e.g. something representing a filesystem
(on top), a bunch of files underneath it, and potentially compressors attached
to the files.

If you decide to collapse (commit) that system, then you need to flush (close) the compressors
first, then close the files, and then potentially close the filesystem. Hence
the order.

The reference release has to do with the fact that you might have flushed and closed some or all
of the resources, so you want to release the reference from that resource collection (list) as
well in order for the GC to collect it. This helper does that automatically.

Finally, when you commit the resource tree you do want to receive the "noise" and promote it up
if one of the resources failed to commit (close). The only exception, when you probably don't
care about the noise, is if you are already on an error path (aka "second chance exception").

Again, I see it is probably not so much the case here, but it is often the case.
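
To make the three points above concrete (close order, reference release, and promoting the "noise"), here is a minimal sketch. It is not the helper from the patch: the class name CloseOrderSketch, the methods closeAll and named, and the suppressErrors flag are all hypothetical names made up for illustration. It assumes resources are pushed onto a Deque as they are opened, so the most recently opened one (the compressor) gets closed first.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical illustration only; not the helper from the MAHOUT-593 patch.
final class CloseOrderSketch {

  /**
   * Closes resources most-recently-opened first. Each resource is removed
   * from the deque before it is closed, so the collection no longer holds
   * a reference to it and the GC is free to collect it.
   *
   * On the commit path (suppressErrors == false) the first failure is
   * rethrown after every resource has been attempted. On an error path
   * (suppressErrors == true) failures from close() are swallowed.
   */
  static void closeAll(Deque<? extends Closeable> resources, boolean suppressErrors)
      throws IOException {
    IOException first = null;
    while (!resources.isEmpty()) {
      Closeable c = resources.removeFirst(); // releases the reference
      try {
        c.close();
      } catch (IOException e) {
        if (first == null) {
          first = e; // remember the first failure, keep closing the rest
        }
      }
    }
    if (first != null && !suppressErrors) {
      throw first; // promote the "noise" to the caller
    }
  }

  /** Hypothetical stand-in resource that records its close attempt in a log. */
  static Closeable named(String name, List<String> log, boolean failOnClose) {
    return () -> {
      log.add(name);
      if (failOnClose) {
        throw new IOException("failed to close " + name);
      }
    };
  }

  public static void main(String[] args) {
    List<String> log = new ArrayList<>();
    Deque<Closeable> open = new ArrayDeque<>();
    // push() adds at the head, so iteration order is the reverse of open order
    open.push(named("filesystem", log, false));
    open.push(named("file", log, true));
    open.push(named("compressor", log, false));
    try {
      closeAll(open, false); // commit path: failures are reported
    } catch (IOException e) {
      System.out.println("commit failed: " + e.getMessage());
    }
    System.out.println(log + " remaining=" + open.size());
  }
}
```

On an error path you would call closeAll(open, true) from the catch block instead, so that a failure in close() does not mask the exception that put you on the error path in the first place.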

> Backport of Stochastic SVD patch (Mahout-376) to hadoop 0.20 to ensure compatibility
with current Mahout dependencies.
> ----------------------------------------------------------------------------------------------------------------------
>
>                 Key: MAHOUT-593
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-593
>             Project: Mahout
>          Issue Type: New Feature
>          Components: Math
>    Affects Versions: 0.4
>            Reporter: Dmitriy Lyubimov
>             Fix For: 0.5
>
>         Attachments: MAHOUT-593.patch.gz, MAHOUT-593.patch.gz, MAHOUT-593.patch.gz, SSVD-givens-CLI.pdf,
ssvdclassdiag.png
>
>
> The current Mahout-376 patch requires the 'new' Hadoop API. Certain elements of that API (namely,
multiple outputs) are not available in the standard Hadoop 0.20.2 release. As such, it may work
only with either the CDH or 0.21 distributions.
> In order to bring it into sync with current Mahout dependencies, a backport of the patch
to the 'old' API is needed.
> Also, some work is needed to resolve math dependencies. The existing patch relies on Apache
commons-math 2.1 for eigen decomposition of small matrices. This dependency is not currently
set up in the Mahout core. So certain snippets of code are either required to go to mahout-math
or to use the Colt eigen decomposition (last time I tried, my results with that one were mixed. It
seems to produce results inconsistent with those from the mahout-math eigensolver; at the very
least, it doesn't produce singular values in sorted order).
> So this patch is mainly moving some Mahout-376 code around.

-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira
