hadoop-yarn-issues mailing list archives

From "Vinod Kumar Vavilapalli (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-527) Local filecache mkdir fails
Date Tue, 02 Apr 2013 20:25:16 GMT

    [ https://issues.apache.org/jira/browse/YARN-527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13620239#comment-13620239 ]

Vinod Kumar Vavilapalli commented on YARN-527:
----------------------------------------------

Is there any difference between how the NodeManager tried to create the directory and how
you created it manually, for example the user running the NM versus the user who created the
directory by hand? Can you reproduce this? If we can find out exactly why the NM could not
create it automatically, we can do something about it.
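
One way to compare the two cases is to run the same probe once as the account that runs the
NodeManager and once as the account that created the directory by hand, then diff the output.
The following is a minimal sketch under stated assumptions, not NodeManager code: the helper
names describe_dir and try_mkdir are hypothetical, and the probe only reports what the local
filesystem says about the parent directory (owner, mode, and whether the current user may add
entries) before attempting the mkdir itself.

```python
import os
import pwd
import stat


def describe_dir(path):
    """Return owner, mode string, and current-user access for a directory."""
    st = os.stat(path)
    try:
        owner = pwd.getpwuid(st.st_uid).pw_name
    except KeyError:
        # The uid has no passwd entry on this host; fall back to the number.
        owner = str(st.st_uid)
    return {
        "path": path,
        "owner": owner,
        "mode": stat.filemode(st.st_mode),
        # Creating an entry needs both write and search permission on the parent.
        "can_create_entries": os.access(path, os.W_OK | os.X_OK),
    }


def try_mkdir(parent, name):
    """Attempt a mkdir under parent; return (ok, error text or None)."""
    target = os.path.join(parent, name)
    try:
        os.mkdir(target)
    except OSError as exc:
        return False, f"{type(exc).__name__}: {exc.strerror} (errno {exc.errno})"
    os.rmdir(target)  # Clean up the probe directory on success.
    return True, None


if __name__ == "__main__":
    import sys

    # Pass the parent directory to probe as the first argument.
    parent = sys.argv[1]
    print(describe_dir(parent))
    print(try_mkdir(parent, "mkdir-probe"))
```

If the two runs disagree, the errno in the second tuple usually narrows the cause down
(permission denied, no space left, too many links, and so on) more than the bare
"mkdir of ... failed" message does.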
                
> Local filecache mkdir fails
> ---------------------------
>
>                 Key: YARN-527
>                 URL: https://issues.apache.org/jira/browse/YARN-527
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager
>    Affects Versions: 2.0.0-alpha
>         Environment: RHEL 6.3 with CDH4.1.3 Hadoop, HA with two name nodes and six worker nodes.
>            Reporter: Knut O. Hellan
>            Priority: Minor
>         Attachments: yarn-site.xml
>
>
> Jobs failed with no other explanation than this stack trace:
> 2013-03-29 16:46:02,671 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Diagnostics report from attempt_1364591875320_0017_m_000000_0: java.io.IOException: mkdir of /disk3/yarn/local/filecache/-4230789355400878397 failed
>         at org.apache.hadoop.fs.FileSystem.primitiveMkdir(FileSystem.java:932)
>         at org.apache.hadoop.fs.DelegateToFileSystem.mkdir(DelegateToFileSystem.java:143)
>         at org.apache.hadoop.fs.FilterFs.mkdir(FilterFs.java:189)
>         at org.apache.hadoop.fs.FileContext$4.next(FileContext.java:706)
>         at org.apache.hadoop.fs.FileContext$4.next(FileContext.java:703)
>         at org.apache.hadoop.fs.FileContext$FSLinkResolver.resolve(FileContext.java:2333)
>         at org.apache.hadoop.fs.FileContext.mkdir(FileContext.java:703)
>         at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:147)
>         at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:49)
>         at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
>         at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>         at java.lang.Thread.run(Thread.java:662)
> Manually creating the directory worked. This behavior was common to at least several nodes in the cluster.
> The situation was resolved by removing and recreating all /disk?/yarn/local/filecache directories on all nodes.
> It is unclear whether Yarn struggled with the number of files or if there were corrupt files in the caches. The situation was triggered by a node dying.
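
To test the "number of files" theory on a node that still shows the problem, it may help to
count the immediate entries in each filecache directory before deleting them. The following
is a rough sketch, assuming the /disk?/yarn/local/filecache layout quoted in the report; the
function names are hypothetical and nothing here comes from YARN itself.

```python
import glob
import os


def count_entries(directory):
    """Count the immediate children of a directory without recursing."""
    with os.scandir(directory) as entries:
        return sum(1 for _ in entries)


def filecache_sizes(pattern):
    """Map each directory matching the glob pattern to its entry count."""
    return {
        d: count_entries(d)
        for d in sorted(glob.glob(pattern))
        if os.path.isdir(d)
    }


if __name__ == "__main__":
    # Pattern taken from the issue description; adjust to the local-dirs in use.
    for path, n in filecache_sizes("/disk?/yarn/local/filecache").items():
        print(f"{path}: {n} entries")
```

A count that sits close to a filesystem limit on entries or subdirectories per directory
would support that theory; a small count would point toward something else, such as
permissions or damaged files.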
