mahout-dev mailing list archives

From "Ted Dunning (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MAHOUT-1273) Single Pass Algorithm for Penalized Linear Regression on MapReduce
Date Sun, 21 Jul 2013 20:52:49 GMT

    [ https://issues.apache.org/jira/browse/MAHOUT-1273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13714803#comment-13714803 ]

Ted Dunning commented on MAHOUT-1273:
-------------------------------------



Should the document be updated to describe what you intend to do?

                
> Single Pass Algorithm for Penalized Linear Regression on MapReduce
> ------------------------------------------------------------------
>
>                 Key: MAHOUT-1273
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-1273
>             Project: Mahout
>          Issue Type: New Feature
>            Reporter: Kun Yang
>         Attachments: PenalizedLinear.pdf
>
>   Original Estimate: 720h
>  Remaining Estimate: 720h
>
> Penalized linear regression methods such as Lasso and Elastic-net are widely used in machine
> learning, but there are no efficient, scalable implementations on MapReduce.
> The published distributed algorithms for solving this problem are either iterative (which
> is not a good fit for MapReduce; see Stephen Boyd's paper) or approximate (a problem when
> exact solutions are needed; see parallelized stochastic gradient descent). Another
> disadvantage of these algorithms is that they cannot do cross validation in the training
> phase, so the user must specify the penalty parameter in advance.
> My approach can train the model with cross validation in a single pass. It is based on
> some simple observations.
> I have implemented a primitive version of this algorithm at Alpine Data Labs. Advanced
> features such as an in-mapper combiner are employed to reduce the network traffic in the
> shuffle phase.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
