hadoop-yarn-dev mailing list archives

From "sandflee (JIRA)" <j...@apache.org>
Subject [jira] [Created] (YARN-6854) many job failed if NM couldn't detect disk error
Date Fri, 21 Jul 2017 07:19:00 GMT
sandflee created YARN-6854:

             Summary: many job failed if NM couldn't detect disk error
                 Key: YARN-6854
                 URL: https://issues.apache.org/jira/browse/YARN-6854
             Project: Hadoop YARN
          Issue Type: Bug
            Reporter: sandflee
            Priority: Critical

checkDiskHealthy is enabled, but it did not detect this error. As a result, containers on
the node failed, and new containers assigned to the node then failed as well.
The disk error appears to be a filesystem error: every I/O operation (such as ls) failed on
$localdir/usercache/userFoo, while other directories were unaffected.
Any suggestions?
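
One way to confirm the symptom described above is to try listing each per-user directory directly, since the failure was limited to a single subdirectory. The following is a minimal sketch, not NodeManager code: the function name find_unreadable_subdirs is hypothetical, and it assumes that a listing failure (the same kind of operation as ls) is a good enough signal of the filesystem error.

```python
import os
import tempfile


def find_unreadable_subdirs(parent):
    """Return the subdirectories of `parent` that cannot be listed.

    Hypothetical helper for illustration only; it is not part of YARN.
    """
    try:
        entries = sorted(os.listdir(parent))
    except OSError:
        # The parent itself cannot be listed, so report it directly.
        return [parent]

    bad = []
    for name in entries:
        path = os.path.join(parent, name)
        try:
            os.listdir(path)  # same kind of operation as `ls`
        except NotADirectoryError:
            continue  # a regular file, not a directory: ignore it
        except OSError:
            bad.append(path)  # listing failed: possible filesystem error
    return bad


if __name__ == "__main__":
    # Demo on a healthy temporary tree standing in for $localdir/usercache.
    with tempfile.TemporaryDirectory() as usercache:
        os.mkdir(os.path.join(usercache, "userFoo"))
        os.mkdir(os.path.join(usercache, "userBar"))
        print(find_unreadable_subdirs(usercache))
```

Running it against the real $localdir/usercache on the affected node would list the directories that fail, which could help show whether a check on the top-level local directory alone would miss the problem.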

This message was sent by Atlassian JIRA
