hadoop-common-user mailing list archives

From Amareshwari Sriramadasu <amar...@yahoo-inc.com>
Subject Re: Task tracker archive contains too many files
Date Wed, 04 Feb 2009 11:25:44 GMT
Andrew wrote:
> I've noticed that the task tracker moves all unpacked jars into
> ${hadoop.tmp.dir}/mapred/local/taskTracker.
>
> We are using a lot of external libraries that are deployed via the "-libjars"
> option. The total number of files after unpacking is about 20 thousand.
>
> After running a number of jobs, tasks start to be killed with a timeout
> ("Task attempt_200901281518_0011_m_000173_2 failed to report status for 601
> seconds. Killing!"). All killed tasks are in the "initializing" state. I've
> looked through the tasktracker logs and found messages like this:
>
>
> Thread 20926 (Thread-10368):
>   State: BLOCKED
>   Blocked count: 3611
>   Waited count: 24
>   Blocked on java.lang.ref.Reference$Lock@e48ed6
>   Blocked by 20882 (Thread-10341)
>   Stack:
>     java.lang.StringCoding$StringEncoder.encode(StringCoding.java:232)
>     java.lang.StringCoding.encode(StringCoding.java:272)
>     java.lang.String.getBytes(String.java:947)
>     java.io.UnixFileSystem.getBooleanAttributes0(Native Method)
>     java.io.UnixFileSystem.getBooleanAttributes(UnixFileSystem.java:228)
>     java.io.File.isDirectory(File.java:754)
>     org.apache.hadoop.fs.FileUtil.getDU(FileUtil.java:427)
>     org.apache.hadoop.fs.FileUtil.getDU(FileUtil.java:433)
>     org.apache.hadoop.fs.FileUtil.getDU(FileUtil.java:433)
>     org.apache.hadoop.fs.FileUtil.getDU(FileUtil.java:433)
>     org.apache.hadoop.fs.FileUtil.getDU(FileUtil.java:433)
>     org.apache.hadoop.fs.FileUtil.getDU(FileUtil.java:433)
>     org.apache.hadoop.fs.FileUtil.getDU(FileUtil.java:433)
>     org.apache.hadoop.fs.FileUtil.getDU(FileUtil.java:433)
>     org.apache.hadoop.fs.FileUtil.getDU(FileUtil.java:433)
>     org.apache.hadoop.fs.FileUtil.getDU(FileUtil.java:433)
>     org.apache.hadoop.fs.FileUtil.getDU(FileUtil.java:433)
>     org.apache.hadoop.fs.FileUtil.getDU(FileUtil.java:433)
>     org.apache.hadoop.fs.FileUtil.getDU(FileUtil.java:433)
>     org.apache.hadoop.fs.FileUtil.getDU(FileUtil.java:433)
>
>
> This is exactly as in HADOOP-4780.
> As I understand it, the patch adds code that stores a map of directories along
> with their disk usage (DU), thus reducing the number of DU calls. That should
> help, but deleting 20,000 files still takes too long. I manually deleted the
> archive after 10 jobs had run, and it took over 30 minutes on XFS. That is
> three times the default task timeout!
>
> Is there a way to prevent the unpacking of jars? Or at least a way not to keep
> the archive? Or any other, better way to solve this problem?
>
> Hadoop version: 0.19.0.
>
>
>   
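The repeated FileUtil.getDU frames in the stack trace above are the recursive
directory walk over every one of the unpacked files. The caching described in
HADOOP-4780 is roughly the idea below (a simplified sketch of the approach, not
the actual patch code; the class and cache here are hypothetical):

import java.io.File;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Sketch only: remember each directory's size instead of re-walking
// the whole tree on every disk-usage check.
public class CachedDiskUsage {
  // hypothetical cache: directory path -> last computed size in bytes
  private final Map<String, Long> duCache = new ConcurrentHashMap<String, Long>();

  /** Return the cached size of dir, computing it only on a cache miss. */
  public long getDU(File dir) {
    Long cached = duCache.get(dir.getPath());
    if (cached != null) {
      return cached.longValue();
    }
    long size = computeDU(dir);
    duCache.put(dir.getPath(), size);
    return size;
  }

  /** The expensive recursive walk seen in the stack trace. */
  private long computeDU(File dir) {
    if (!dir.isDirectory()) {
      return dir.length();
    }
    long size = 0;
    File[] children = dir.listFiles();
    if (children != null) {
      for (File child : children) {
        size += computeDU(child);
      }
    }
    return size;
  }

  /** Must be called whenever a cached directory is modified or deleted. */
  public void invalidate(File dir) {
    duCache.remove(dir.getPath());
  }
}

Even with such a cache, the initial walk and the eventual cleanup still have to
touch every extracted file, which is why a 20,000-file archive hurts so much.
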
Currently, there is no way to stop DistributedCache from unpacking jars. I
think it should have a configuration option controlling whether to unpack or
not.
Can you raise a JIRA for this?
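
If such an option existed, the check could sit in the localization path and be
as simple as the sketch below. This is purely illustrative: the property name
and the surrounding class are made up; only RunJar.unJar and
Configuration.getBoolean are existing calls.

import java.io.File;
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.util.RunJar;

public class JarLocalizer {
  /** Hypothetical property controlling whether localized jars are unpacked. */
  public static final String UNPACK_JARS_KEY = "mapred.cache.unpack.jars";

  /**
   * Localize a jar for a task: unpack it only if the (hypothetical)
   * flag is true, otherwise leave the jar as a single file.
   */
  public static void localizeJar(Configuration conf, File jar, File workDir)
      throws IOException {
    if (conf.getBoolean(UNPACK_JARS_KEY, true)) {
      RunJar.unJar(jar, workDir);   // current behaviour: explode the jar
    }
    // else: keep the jar intact and just add it to the task classpath
  }
}

Keeping the jar intact would also turn the cleanup into a single file delete
instead of removing thousands of extracted entries.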

Thanks
Amareshwari
