hadoop-common-issues mailing list archives

From "Ahmed Radwan (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-8705) Add JSR 107 Caching support
Date Thu, 23 Aug 2012 22:25:43 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-8705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13440723#comment-13440723 ]

Ahmed Radwan commented on HADOOP-8705:

Thanks Dhruv, this is interesting. I think the work on a pluggable MapOutputBuffer and
shuffle could greatly facilitate such an effort.
> Add JSR 107 Caching support 
> ----------------------------
>                 Key: HADOOP-8705
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8705
>             Project: Hadoop Common
>          Issue Type: Improvement
>            Reporter: Dhruv Kumar
> Having a cache on mappers and reducers could be very useful for some use cases, including
> but not limited to:
> 1. Iterative MapReduce programs: some machine learning algorithms (see Mahout) frequently
> need access to invariant data on each iteration of MapReduce until convergence. A cache
> on such nodes could allow easy access to the hot set of data without going all the way to
> the distributed cache.
> 2. Storing intermediate map and reduce outputs in memory to reduce shuffling time.
> This optimization has been discussed at length in HaLoop (http://www.ics.uci.edu/~yingyib/papers/HaLoop_camera_ready.pdf).
> There are some other scenarios as well where having a cache could come in handy.
> It would be nice to have some sort of pluggable support for JSR 107 compliant caches.
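
To make the first use case in the description above more concrete, here is a minimal Java sketch of a task-local, read-through cache for invariant data. It is only an illustration under stated assumptions, not a proposal for the actual patch: TaskCache, LruTaskCache and InvariantDataLoader are hypothetical names that appear nowhere in Hadoop or in JSR 107, and the "slow loader" stands in for whatever a task would otherwise do to fetch the data (for example, reading it from the distributed cache). A JSR 107 compliant cache could presumably be plugged in behind the same small interface.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical minimal cache contract; a JSR 107 cache could be adapted to it. */
interface TaskCache<K, V> {
    V get(K key);
    void put(K key, V value);
}

/** Simple in-memory LRU cache built on an access-ordered LinkedHashMap. */
final class LruTaskCache<K, V> implements TaskCache<K, V> {
    private final Map<K, V> entries;

    LruTaskCache(final int maxEntries) {
        // accessOrder=true keeps the least recently used entry first.
        this.entries = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > maxEntries;
            }
        };
    }

    @Override
    public synchronized V get(K key) {
        return entries.get(key);
    }

    @Override
    public synchronized void put(K key, V value) {
        entries.put(key, value);
    }
}

/** Read-through helper: consults the cache first, falls back to the slow loader on a miss. */
final class InvariantDataLoader<K, V> {
    private final TaskCache<K, V> cache;
    private final Function<K, V> slowLoader; // assumed stand-in for the expensive fetch
    private int loads = 0;

    InvariantDataLoader(TaskCache<K, V> cache, Function<K, V> slowLoader) {
        this.cache = cache;
        this.slowLoader = slowLoader;
    }

    V lookup(K key) {
        V value = cache.get(key);
        if (value == null) {
            value = slowLoader.apply(key);
            loads++;
            cache.put(key, value);
        }
        return value;
    }

    /** Number of times the slow loader was actually invoked. */
    int loadCount() {
        return loads;
    }
}
```

In an iterative job, a task would call lookup() for the same keys on every pass; only the first call per key (or the first call after an eviction) would pay the cost of the slow loader. Because the cache sits behind an interface, swapping the LRU map for another implementation would not change the calling code.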

This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

