hadoop-common-dev mailing list archives

From "Philip Zeyliger (JIRA)" <j...@apache.org>
Subject [jira] Updated: (HADOOP-2914) extend DistributedCache to work locally (LocalJobRunner)
Date Sun, 14 Jun 2009 22:23:07 GMT

     [ https://issues.apache.org/jira/browse/HADOOP-2914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Philip Zeyliger updated HADOOP-2914:

    Attachment: HADOOP-2914-v2.patch

bq. In DistributedCacheHandle the class doc should go before the class declaration, not at
the beginning of the file. Also need to add Apache license.


bq. Use an enum rather than a boolean for isArchive in CacheFile.
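To make the suggestion concrete, here is a minimal sketch of an enum replacing a boolean isArchive flag. The field names and the FileType values are assumptions made for illustration; the CacheFile class in the attached patch may look different.

```java
import java.net.URI;

// Hypothetical sketch only: not the CacheFile class from the patch.
final class CacheFile {
    /** Replaces a boolean isArchive flag and leaves room for more kinds later. */
    enum FileType { REGULAR, ARCHIVE }

    final URI uri;
    final FileType type;

    CacheFile(URI uri, FileType type) {
        this.uri = uri;
        this.type = type;
    }

    /** Only archives are unpacked after being copied locally. */
    boolean shouldUnpack() {
        return type == FileType.ARCHIVE;
    }

    public static void main(String[] args) {
        CacheFile f = new CacheFile(URI.create("hdfs:///libs/dict.zip"), FileType.ARCHIVE);
        System.out.println(f.uri + " unpack=" + f.shouldUnpack());
    }
}
```

A call site then reads new CacheFile(uri, FileType.ARCHIVE) rather than new CacheFile(uri, true), so the meaning of the second argument is visible without looking up the constructor.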


bq. We shouldn't remove public methods from DistributedCache, but rather deprecate them and
remove them in a future release. Can DistributedCache delegate to DistributedCacheManager?
I like the fact you have documented the intended audience for each public method of DistributedCache.
(This paves the way to separating the public and private interfaces in future.)
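The deprecate-and-delegate pattern described above could look roughly like the sketch below. LegacyCacheSketch and CacheManagerSketch are hypothetical stand-ins for DistributedCache and DistributedCacheManager; their methods are invented for illustration and do not match the real signatures.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for the existing public class whose API must not break.
final class LegacyCacheSketch {
    private static final CacheManagerSketch MANAGER = new CacheManagerSketch();

    /**
     * @deprecated kept so existing callers still compile; forwards to the manager.
     */
    @Deprecated
    static void addCacheFile(URI uri) {
        MANAGER.addFile(uri);
    }

    static List<URI> cachedFiles() {
        return MANAGER.files();
    }

    public static void main(String[] args) {
        addCacheFile(URI.create("hdfs:///data/lookup.txt"));
        System.out.println(cachedFiles());
    }
}

// Hypothetical stand-in for the new internal class that owns the logic.
final class CacheManagerSketch {
    private final List<URI> files = new ArrayList<>();

    void addFile(URI uri) {
        files.add(uri);
    }

    List<URI> files() {
        return Collections.unmodifiableList(new ArrayList<>(files));
    }
}
```

Old callers keep working and get a deprecation warning at compile time, while the logic lives in one place and the public method can be dropped in a later release.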


My current thinking on APIs (for a future JIRA) is that users should access DistributedCache
through Job.addToCache(URI, flags) and Context.getCachedFiles().  But there's some more work
to get there.
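As a rough illustration of that shape, and nothing more: the sketch below is entirely hypothetical. The CacheFlag values, the use of an EnumSet for "flags", and the two class names are assumptions; neither method exists in Hadoop in this form.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.Collections;
import java.util.EnumSet;
import java.util.List;

// Hypothetical job-side API: files are registered with per-file flags.
final class JobSketch {
    enum CacheFlag { ARCHIVE, ADD_TO_CLASSPATH, SYMLINK }

    final List<URI> uris = new ArrayList<>();
    final List<EnumSet<CacheFlag>> flags = new ArrayList<>();

    void addToCache(URI uri, EnumSet<CacheFlag> fileFlags) {
        uris.add(uri);
        flags.add(EnumSet.copyOf(fileFlags)); // defensive copy
    }

    public static void main(String[] args) {
        JobSketch job = new JobSketch();
        job.addToCache(URI.create("hdfs:///libs/udfs.jar"), EnumSet.of(CacheFlag.ADD_TO_CLASSPATH));
        System.out.println(new ContextSketch(job).getCachedFiles());
    }
}

// Hypothetical task-side view: tasks only read what the job registered.
final class ContextSketch {
    private final JobSketch job;

    ContextSketch(JobSketch job) {
        this.job = job;
    }

    List<URI> getCachedFiles() {
        return Collections.unmodifiableList(new ArrayList<>(job.uris));
    }
}
```

The point of the split is that job setup code writes to the cache list and task code only reads from it, so neither needs to touch DistributedCache directly.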

bq. Is there duplication between TestMRWithDistributedCache and tests that use MRCaching that
could be avoided?

Probably, but it's hard to tease out.  MRCaching is more complicated than the test I'm adding,
and does, I believe, test some things that I don't.  On the other hand, TestMRWithDistributedCache
tests the classpath stuff.  I'm loath to delete tests too eagerly.

bq. Could TestMRWithDistributedCache also test symlinking?

It does now test symlinking.  However, I couldn't (easily) get LocalJobRunner to do symlinks
appropriately.  LocalJobRunner doesn't currently have a notion of task directory, and I think
this patch is already quite large.
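For context, symlinking needs a per-task directory to put the links in, which is the piece LocalJobRunner lacks. A minimal sketch of the idea, assuming the link name comes from the URI fragment (file#name), is below. SymlinkSketch and linkIntoTaskDir are hypothetical names, not code from the patch, and the example relies on the platform allowing symbolic links.

```java
import java.io.IOException;
import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper: not part of the patch or of LocalJobRunner.
final class SymlinkSketch {
    /** Links taskDir/<fragment> to the localized file; falls back to the file's own name. */
    static Path linkIntoTaskDir(URI cacheUri, Path localizedFile, Path taskDir) throws IOException {
        String name = cacheUri.getFragment() != null
                ? cacheUri.getFragment()
                : localizedFile.getFileName().toString();
        Path link = taskDir.resolve(name);
        return Files.createSymbolicLink(link, localizedFile.toAbsolutePath());
    }

    public static void main(String[] args) throws IOException {
        Path taskDir = Files.createTempDirectory("task");     // stands in for a task directory
        Path localized = Files.createTempFile("dict", ".txt"); // stands in for the localized cache file
        Path link = linkIntoTaskDir(URI.create("hdfs:///data/dict.txt#dict"), localized, taskDir);
        System.out.println(link + " -> " + Files.readSymbolicLink(link));
    }
}
```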

> extend DistributedCache to work locally (LocalJobRunner)
> --------------------------------------------------------
>                 Key: HADOOP-2914
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2914
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>            Reporter: sam rash
>            Assignee: Philip Zeyliger
>            Priority: Minor
>         Attachments: HADOOP-2914-v1-full.patch, HADOOP-2914-v1-since-4041.patch, HADOOP-2914-v2.patch
> The DistributedCache does not work locally when using the outlined recipe at http://hadoop.apache.org/core/docs/r0.16.0/api/org/apache/hadoop/filecache/DistributedCache.html

> Ideally, LocalJobRunner would take care of populating the JobConf and copying remote
files to the local file system (http, assume hdfs = default fs = local fs when doing local

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
