hadoop-yarn-issues mailing list archives

From "Knut O. Hellan (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-527) Local filecache mkdir fails
Date Thu, 04 Apr 2013 07:13:15 GMT

    [ https://issues.apache.org/jira/browse/YARN-527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13621878#comment-13621878 ]

Knut O. Hellan commented on YARN-527:
-------------------------------------

Yes, this is a duplicate of YARN-467, so you may close it. We will add cron jobs to delete old
directories as a temporary workaround until we can test 2.0.5-beta. Thanks!
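
The cron-job workaround mentioned above could be sketched roughly as follows. This is a minimal illustration, not the script the reporter actually used: the function names, the age threshold, and the use of directory mtime as the measure of "old" are all assumptions. Removing cache directories while the NodeManager is still using them may break running containers, so the sketch defaults to a dry run.

```python
import shutil
import time
from pathlib import Path


def find_stale_dirs(cache_root, max_age_seconds, now=None):
    """Return immediate subdirectories of cache_root whose mtime is older
    than max_age_seconds. Using mtime as the age signal is an assumption."""
    now = time.time() if now is None else now
    root = Path(cache_root)
    stale = []
    for entry in root.iterdir():
        # Only real directories are candidates; symlinks are skipped.
        if entry.is_dir() and not entry.is_symlink():
            if now - entry.stat().st_mtime > max_age_seconds:
                stale.append(entry)
    return sorted(stale)


def purge_stale_dirs(cache_root, max_age_seconds, dry_run=True, now=None):
    """Delete stale subdirectories and return the paths selected.
    With dry_run=True (the default) nothing is removed."""
    selected = find_stale_dirs(cache_root, max_age_seconds, now=now)
    for path in selected:
        if not dry_run:
            shutil.rmtree(path, ignore_errors=True)
    return selected
```

A cron entry could call purge_stale_dirs once per /disk?/yarn/local/filecache directory, switching to dry_run=False only after the dry-run selection has been checked by hand.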
                
> Local filecache mkdir fails
> ---------------------------
>
>                 Key: YARN-527
>                 URL: https://issues.apache.org/jira/browse/YARN-527
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager
>    Affects Versions: 2.0.0-alpha
>         Environment: RHEL 6.3 with CDH4.1.3 Hadoop, HA with two name nodes and six worker nodes.
>            Reporter: Knut O. Hellan
>            Priority: Minor
>         Attachments: yarn-site.xml
>
>
> Jobs failed with no other explanation than this stack trace:
> 2013-03-29 16:46:02,671 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Diagnostics report from attempt_1364591875320_0017_m_000000_0: java.io.IOException: mkdir of /disk3/yarn/local/filecache/-4230789355400878397 failed
>         at org.apache.hadoop.fs.FileSystem.primitiveMkdir(FileSystem.java:932)
>         at org.apache.hadoop.fs.DelegateToFileSystem.mkdir(DelegateToFileSystem.java:143)
>         at org.apache.hadoop.fs.FilterFs.mkdir(FilterFs.java:189)
>         at org.apache.hadoop.fs.FileContext$4.next(FileContext.java:706)
>         at org.apache.hadoop.fs.FileContext$4.next(FileContext.java:703)
>         at org.apache.hadoop.fs.FileContext$FSLinkResolver.resolve(FileContext.java:2333)
>         at org.apache.hadoop.fs.FileContext.mkdir(FileContext.java:703)
>         at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:147)
>         at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:49)
>         at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
>         at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>         at java.lang.Thread.run(Thread.java:662)
> Manually creating the directory worked. The same failure occurred on several nodes in the cluster.
> The situation was resolved by removing and recreating all /disk?/yarn/local/filecache directories on all nodes.
> It is unclear whether YARN struggled with the number of files or whether there were corrupt files in the caches. The situation was triggered by a node dying.
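
One way to probe the "number of files" hypothesis raised in the description is to count the entries directly under each filecache directory and compare the count against whatever per-directory limit the local filesystem imposes. The sketch below is only an illustration: the function names, the `limit` value, and the 90% warning threshold are assumptions, and nothing in the report confirms that a limit was actually hit.

```python
import os


def count_entries(directory):
    """Count the immediate entries (files and subdirectories) in directory."""
    with os.scandir(directory) as entries:
        return sum(1 for _ in entries)


def dirs_near_limit(directories, limit, threshold=0.9):
    """Return {directory: entry_count} for directories whose entry count is
    at or above threshold * limit. The caller supplies limit; it depends on
    the filesystem and is not taken from the report."""
    report = {}
    for directory in directories:
        count = count_entries(directory)
        if count >= limit * threshold:
            report[directory] = count
    return report
```

Passing the list of /disk?/yarn/local/filecache directories on a node, together with the filesystem's own limit, would show whether any cache directory is close to full before mkdir starts failing.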

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
