hadoop-mapreduce-issues mailing list archives

From "Azuryy(Chijiong) (Updated) (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (MAPREDUCE-3323) Distributed Cache for Map or Reduce or Both
Date Tue, 01 Nov 2011 08:46:32 GMT

     [ https://issues.apache.org/jira/browse/MAPREDUCE-3323?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Azuryy(Chijiong) updated MAPREDUCE-3323:
----------------------------------------

    Attachment: dc.patch
    
> Distributed Cache for Map or Reduce or Both
> -------------------------------------------
>
>                 Key: MAPREDUCE-3323
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3323
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>            Reporter: Azuryy(Chijiong)
>         Attachments: dc.patch
>
>
> We sometimes put files into the Distributed Cache that are used only by map tasks or only by reduce tasks, not by both. However, the TaskTracker always downloads the cached files from HDFS regardless of which task type needs them, which is time-consuming when the cache contains large files.
> This patch therefore adds the following new APIs to DistributedCache.java:
> addArchiveToClassPathForMap
> addArchiveToClassPathForReduce
> addFileToClassPathForMap
> addFileToClassPathForReduce
> addCacheFileForMap
> addCacheFileForReduce
> addCacheArchiveForMap
> addCacheArchiveForReduce
> The new APIs do not affect the original interface; each one applies to map tasks only or to reduce tasks only, not to both.
> If you need a cached file during both map and reduce, use the original interface.
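
The scoping idea described above can be illustrated with a minimal, self-contained sketch. This is not the code in dc.patch and it does not use the Hadoop API: the class ScopedCacheSketch, the TaskType enum, the String-based method signatures, and the HDFS paths are all hypothetical. Only the method names addCacheFileForMap and addCacheFileForReduce are taken from the list in the description.

```java
import java.util.ArrayList;
import java.util.EnumSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical model of per-task-type cache registration (not Hadoop code).
public class ScopedCacheSketch {

    /** Task types that may consume a cached file. */
    public enum TaskType { MAP, REDUCE }

    // Insertion-ordered map from cache URI to the task types that need it.
    private final Map<String, Set<TaskType>> entries = new LinkedHashMap<>();

    // Records that the given task types need the file; repeated calls merge scopes.
    private void register(String uri, Set<TaskType> scope) {
        entries.computeIfAbsent(uri, k -> EnumSet.noneOf(TaskType.class)).addAll(scope);
    }

    /** Models the original interface: the file is needed by both task types. */
    public void addCacheFile(String uri) {
        register(uri, EnumSet.allOf(TaskType.class));
    }

    /** Models a map-only registration (signature is an assumption). */
    public void addCacheFileForMap(String uri) {
        register(uri, EnumSet.of(TaskType.MAP));
    }

    /** Models a reduce-only registration (signature is an assumption). */
    public void addCacheFileForReduce(String uri) {
        register(uri, EnumSet.of(TaskType.REDUCE));
    }

    /** Returns the files a task of the given type would have to download. */
    public List<String> filesFor(TaskType type) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Set<TaskType>> e : entries.entrySet()) {
            if (e.getValue().contains(type)) {
                result.add(e.getKey());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        ScopedCacheSketch cache = new ScopedCacheSketch();
        cache.addCacheFile("hdfs:///shared/lookup.dat");           // hypothetical path
        cache.addCacheFileForMap("hdfs:///map/dictionary.dat");    // hypothetical path
        cache.addCacheFileForReduce("hdfs:///reduce/weights.dat"); // hypothetical path
        System.out.println("map downloads:    " + cache.filesFor(TaskType.MAP));
        System.out.println("reduce downloads: " + cache.filesFor(TaskType.REDUCE));
    }
}
```

In this model a map task skips the reduce-only file and a reduce task skips the map-only file, which is the download saving the description is after. A file registered through the original-style method is still returned for both task types.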

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
