hadoop-yarn-issues mailing list archives

From "Jason Lowe (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-3832) Resource Localization fails on a cluster due to existing cache directories
Date Fri, 19 Jun 2015 13:41:00 GMT

    [ https://issues.apache.org/jira/browse/YARN-3832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14593424#comment-14593424 ]

Jason Lowe commented on YARN-3832:

It looks like the state store became out of sync with the local filesystem state.  Can you
look back in the NM logs to see when /opt/hdfsdata/HA/nmlocal/usercache/root/filecache/39
was originally created?  Was the state store re-created, or was the disk declared bad or full,
between the creation of that directory and the error?  It seems something would have had to go
wrong with either storing the state or deleting the cache entry on the local disk for this to occur.
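
To help answer the filesystem side of that question, here is a minimal sketch, assuming plain
java.nio access to the NM local directory. It is not NodeManager code, and the class and method
names are hypothetical. It reports whether a cache directory exists and is non-empty, along with
the timestamps the filesystem records for it, which can then be compared against the NM logs.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.attribute.BasicFileAttributes;

// Hypothetical diagnostic helper; not part of Hadoop or the NodeManager.
public class CacheDirCheck {

    // True if dir exists, is a directory, and contains at least one entry.
    static boolean isNonEmptyDirectory(Path dir) throws IOException {
        if (!Files.isDirectory(dir)) {
            return false;
        }
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(dir)) {
            return entries.iterator().hasNext();
        }
    }

    public static void main(String[] args) throws IOException {
        // Default path is the one from the error in this issue; pass another as args[0].
        Path dir = Paths.get(args.length > 0 ? args[0]
            : "/opt/hdfsdata/HA/nmlocal/usercache/root/filecache/39");
        if (!Files.exists(dir)) {
            System.out.println(dir + " does not exist");
            return;
        }
        BasicFileAttributes attrs = Files.readAttributes(dir, BasicFileAttributes.class);
        // On filesystems that do not record a creation time, creationTime()
        // returns an implementation-specific default (often the last-modified time).
        System.out.println("created (as reported by the filesystem): " + attrs.creationTime());
        System.out.println("last modified: " + attrs.lastModifiedTime());
        System.out.println("non-empty directory: " + isNonEmptyDirectory(dir));
    }
}
```

The timestamps are only a cross-check; the NM logs remain the authoritative record of when the
directory was localized.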

> Resource Localization fails on a cluster due to existing cache directories
> --------------------------------------------------------------------------
>                 Key: YARN-3832
>                 URL: https://issues.apache.org/jira/browse/YARN-3832
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager
>    Affects Versions: 2.7.0
>            Reporter: Ranga Swamy
>            Assignee: Brahma Reddy Battula
>  *We have found that resource localization fails on a cluster with the following error.* 
> Got this error in the hadoop-2.7.0 release, although it was fixed in 2.6.0 (YARN-2624).
> {noformat}
> Application application_1434703279149_0057 failed 2 times due to AM Container for appattempt_1434703279149_0057_000002 exited with exitCode: -1000
> For more detailed output, check application tracking page:http://S0559LDPag68:45020/cluster/app/application_1434703279149_0057Then, click on links to logs of each attempt.
> Diagnostics: Rename cannot overwrite non empty destination directory /opt/hdfsdata/HA/nmlocal/usercache/root/filecache/39
> java.io.IOException: Rename cannot overwrite non empty destination directory /opt/hdfsdata/HA/nmlocal/usercache/root/filecache/39
> at org.apache.hadoop.fs.AbstractFileSystem.renameInternal(AbstractFileSystem.java:735)
> at org.apache.hadoop.fs.FilterFs.renameInternal(FilterFs.java:244)
> at org.apache.hadoop.fs.AbstractFileSystem.rename(AbstractFileSystem.java:678)
> at org.apache.hadoop.fs.FileContext.rename(FileContext.java:958)
> at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:366)
> at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:62)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> Failing this attempt. Failing the application.
> {noformat}
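
The failure mode in the trace above can be illustrated outside Hadoop. This is a minimal sketch,
assuming java.nio.file.Files.move as a stand-in for FileContext.rename; it is not the FSDownload
code path, and the class, method, and file names are hypothetical. Moving a directory onto a
destination that already exists and is non-empty is refused, much as the rename in the trace is.

```java
import java.io.IOException;
import java.nio.file.DirectoryNotEmptyException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical illustration; uses java.nio rather than Hadoop's FileContext.
public class RenameOntoNonEmptyDir {

    // Attempts to move src onto dst, replacing dst if allowed.
    // Returns true if the move was refused because dst is a non-empty directory.
    static boolean moveRefused(Path src, Path dst) throws IOException {
        try {
            Files.move(src, dst, StandardCopyOption.REPLACE_EXISTING);
            return false;
        } catch (DirectoryNotEmptyException e) {
            return true;
        }
    }

    public static void main(String[] args) throws IOException {
        Path root = Files.createTempDirectory("rename-demo");
        // Freshly downloaded resource directory (illustrative name).
        Path src = Files.createDirectory(root.resolve("incoming"));
        // Stale cache directory left on disk with content in it.
        Path dst = Files.createDirectory(root.resolve("39"));
        Files.createFile(dst.resolve("leftover.jar"));
        System.out.println("move refused: " + moveRefused(src, dst));
    }
}
```

If the destination were empty or absent, the same call would succeed, which is why a leftover
non-empty cache directory that the state store no longer tracks is enough to trigger the error.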
