hadoop-common-dev mailing list archives

From Gang Luo <lgpub...@yahoo.com.cn>
Subject Re: where distributed cache start working
Date Thu, 26 Aug 2010 04:59:10 GMT
Thanks Arun. Changing the mtime is a good idea. However, given a file (with the
path A/B/C/D/file) distributed to all the nodes, if I just change the mtime of
the file to an earlier timestamp, it is not replaced the next time. Should I
also change the mtime of all the directories along the path (A, B, C and D)?
Whose timestamp is used by DistributedCache?

Thanks.
-Gang




----- Original Message ----
From: Arun C Murthy <acm@yahoo-inc.com>
To: mapreduce-user@hadoop.apache.org
Sent: 2010/8/22 (Sun) 9:38:02 PM
Subject: Re: where distributed cache start working

Moving to mapreduce-user@, bcc common-dev@. Please use the project-specific
lists.

DistributedCache.purgeCache isn't a public API. You shouldn't be calling it
from the task.

A simple way of doing what you want is to change the mtime of the cache files
on HDFS.

Arun
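
The mtime-based approach above can be illustrated with a minimal, self-contained sketch. It uses local files and java.nio instead of HDFS so that it runs without a cluster; on HDFS the analogous call would be FileSystem.setTimes(Path, long mtime, long atime). The class CacheFreshness and its methods isFresh and setMtime are hypothetical names, and the inequality check is an assumption made for illustration: this thread does not establish exactly how DistributedCache compares timestamps.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.FileTime;

public class CacheFreshness {
    /**
     * Hypothetical freshness check: the cached copy counts as fresh only if
     * the mtime recorded at localization time equals the source's current mtime.
     * (Assumption for illustration, not Hadoop's actual implementation.)
     */
    public static boolean isFresh(Path source, long recordedMtimeMillis) throws IOException {
        return Files.getLastModifiedTime(source).toMillis() == recordedMtimeMillis;
    }

    /** Sets the mtime of the file itself; parent directories are not touched. */
    public static void setMtime(Path file, long mtimeMillis) throws IOException {
        Files.setLastModifiedTime(file, FileTime.fromMillis(mtimeMillis));
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("cache-demo", ".txt");
        try {
            // Pin the mtime to a whole-second value to avoid granularity issues.
            setMtime(file, 1_000_000_000_000L);
            long recorded = Files.getLastModifiedTime(file).toMillis();
            System.out.println("fresh before mtime change: " + isFresh(file, recorded));

            // "Touch" the source: move its mtime forward by one minute.
            setMtime(file, recorded + 60_000L);
            System.out.println("fresh after mtime change: " + isFresh(file, recorded));
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

In this sketch the copy is reported as fresh before the timestamp changes and as stale afterwards. Because the check is an inequality, an earlier timestamp would also count as stale here; a cache that compares with "newer than" would behave differently.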

On Aug 22, 2010, at 9:48 AM, Gang Luo wrote:

> Thanks Jeff.
> 
> However, are you sure TaskRunner.run() is also used in the new API? I used
> btrace to trace the function calls but didn't find this function being
> called anywhere.
> 
> 
> One more question about the distributed cache. After I call
> DistributedCache.purgeCache, I think the locally cached files should be
> deleted or invalidated. However, when I run the same job with the purge
> operation at the end multiple times, I find the local files are never
> deleted and their modification time is that of the first job run. How can I
> ask my job to redistribute the cache anyway?
> 
> Thanks,
> -Gang
> 
> 
> 
> 
> ----- Original Message ----
> From: Jeff Zhang <zjffdu@gmail.com>
> To: common-dev@hadoop.apache.org
> Sent: 2010/8/20 (Fri) 11:22:49 AM
> Subject: Re: where distributed cache start working
> 
> Hi Gang,
> 
> In the TaskRunner's run() method, Hadoop downloads the cache files
> that you set on the client side to the local node; the forked child JVM
> can then use these cache files locally.
> 
> 
> 
> On Fri, Aug 20, 2010 at 8:08 AM, Gang Luo <lgpublic@yahoo.com.cn> wrote:
>> Hi all,
>> I went through the code, but couldn't find the place where the distributed
>> cache starts working. I want to know, between
>> DistributedCache.addCacheFile at the master node and
>> DistributedCache.getLocalCacheFiles at the client side, when and where the
>> files get distributed.
>> 
>> 
>> Thanks,
>> -Gang
>> 
>> 
>> 
>> 
>> 
> 
> 
> 
> --Best Regards
> 
> Jeff Zhang
> 
> 
> 
>



      
