hadoop-yarn-issues mailing list archives

From "Omkar Vinit Joshi (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-467) Jobs fail during resource localization when public distributed-cache hits unix directory limits
Date Mon, 01 Apr 2013 23:19:16 GMT

    [ https://issues.apache.org/jira/browse/YARN-467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13619304#comment-13619304 ]

Omkar Vinit Joshi commented on YARN-467:
----------------------------------------

I ran the test on a Mac and got the results below. I think keeping a default of 8192 would be good.

||Total number of files||Total time taken (ms)||
|32|4|
|64|7|
|128|15|
|256|27|
|512|60|
|1024|120|
|2048|219|
|4096|524|
|8192|1845|
|16384|7332|
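
One way to read the table is to look at the average time per file and at how much the total time grows each time the file count doubles. The sketch below only does that arithmetic on the figures listed above; the class and method names are hypothetical and are not part of the patch or of the test that produced the numbers.

```java
public class LocalizationTimings {
    // Measurements copied from the table above: {number of files, total time in ms}.
    static final long[][] RESULTS = {
        {32, 4}, {64, 7}, {128, 15}, {256, 27}, {512, 60},
        {1024, 120}, {2048, 219}, {4096, 524}, {8192, 1845}, {16384, 7332}
    };

    /** Average time per file, in milliseconds, for table row i. */
    public static double millisPerFile(int i) {
        return (double) RESULTS[i][1] / RESULTS[i][0];
    }

    /** Factor by which the total time grew between row i-1 and row i (file count doubled). */
    public static double growthFactor(int i) {
        return (double) RESULTS[i][1] / RESULTS[i - 1][1];
    }

    public static void main(String[] args) {
        for (int i = 0; i < RESULTS.length; i++) {
            String growth = (i == 0) ? "n/a" : String.format("%.2f", growthFactor(i));
            System.out.printf("%6d files: %.3f ms/file, growth x%s%n",
                    RESULTS[i][0], millisPerFile(i), growth);
        }
    }
}
```

Up to 4096 files, each doubling of the file count multiplies the total time by roughly 2 (between 1.75 and 2.39). The last two doublings, to 8192 and to 16384 files, multiply it by about 3.5 and 4.0.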

I have incorporated all the comments in the latest patch.
                
> Jobs fail during resource localization when public distributed-cache hits unix directory limits
> -----------------------------------------------------------------------------------------------
>
>                 Key: YARN-467
>                 URL: https://issues.apache.org/jira/browse/YARN-467
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager
>    Affects Versions: 3.0.0, 2.0.0-alpha
>            Reporter: Omkar Vinit Joshi
>            Assignee: Omkar Vinit Joshi
>         Attachments: yarn-467-20130322.1.patch, yarn-467-20130322.2.patch, yarn-467-20130322.3.patch,
yarn-467-20130322.patch, yarn-467-20130325.1.patch, yarn-467-20130325.path, yarn-467-20130328.patch
>
>
> If multiple jobs use the distributed cache with small files, the directory limit is reached before the cache size limit, and no further directories can be created in the file cache (PUBLIC). The jobs then start failing with the exception below.
> java.io.IOException: mkdir of /tmp/nm-local-dir/filecache/3901886847734194975 failed
> 	at org.apache.hadoop.fs.FileSystem.primitiveMkdir(FileSystem.java:909)
> 	at org.apache.hadoop.fs.DelegateToFileSystem.mkdir(DelegateToFileSystem.java:143)
> 	at org.apache.hadoop.fs.FilterFs.mkdir(FilterFs.java:189)
> 	at org.apache.hadoop.fs.FileContext$4.next(FileContext.java:706)
> 	at org.apache.hadoop.fs.FileContext$4.next(FileContext.java:703)
> 	at org.apache.hadoop.fs.FileContext$FSLinkResolver.resolve(FileContext.java:2325)
> 	at org.apache.hadoop.fs.FileContext.mkdir(FileContext.java:703)
> 	at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:147)
> 	at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:49)
> 	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
> 	at java.util.concurrent.FutureTask.run(FutureTask.java:138)
> 	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
> 	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
> 	at java.util.concurrent.FutureTask.run(FutureTask.java:138)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> 	at java.lang.Thread.run(Thread.java:662)
> We need a mechanism that creates a directory hierarchy and limits the number of files per directory.
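
The mechanism requested in the description could be sketched as follows. This is a minimal illustration under stated assumptions, not the code from the attached patches: the class `HierarchicalCacheDirs`, its methods, and the two limits are hypothetical. Each directory receives at most `filesPerDir` resources and at most `subDirsPerDir` child directories, and directories are filled in breadth-first order.

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class HierarchicalCacheDirs {
    // Hypothetical limits for illustration; not values taken from the patch.
    private final int filesPerDir;
    private final int subDirsPerDir;
    private long nextSlot = 0;

    public HierarchicalCacheDirs(int filesPerDir, int subDirsPerDir) {
        if (filesPerDir < 1 || subDirsPerDir < 1) {
            throw new IllegalArgumentException("limits must be positive");
        }
        this.filesPerDir = filesPerDir;
        this.subDirsPerDir = subDirsPerDir;
    }

    /** Returns the relative directory ("" means the cache root) for the next resource. */
    public synchronized String nextRelativeDir() {
        long dirIndex = nextSlot / filesPerDir; // every directory takes filesPerDir resources
        nextSlot++;
        return pathOf(dirIndex, subDirsPerDir);
    }

    /**
     * Maps a breadth-first directory index to a path. Index 0 is the root,
     * and the children of directory k are k*fanout+1 .. k*fanout+fanout.
     */
    public static String pathOf(long dirIndex, int fanout) {
        Deque<String> parts = new ArrayDeque<>();
        while (dirIndex > 0) {
            long childPosition = (dirIndex - 1) % fanout; // position under the parent
            parts.addFirst(Long.toString(childPosition));
            dirIndex = (dirIndex - 1) / fanout;           // move up to the parent
        }
        return String.join("/", parts);
    }

    public static void main(String[] args) {
        HierarchicalCacheDirs dirs = new HierarchicalCacheDirs(2, 2);
        for (int i = 0; i < 8; i++) {
            System.out.println("resource " + i + " -> '" + dirs.nextRelativeDir() + "'");
        }
    }
}
```

With these assumptions, no directory ever holds more than `filesPerDir + subDirsPerDir` entries, so the per-directory limit of the underlying file system is not reached as long as that sum stays below it. The sketch only hands out paths; reusing slots after cached resources are deleted is left out.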

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
