hadoop-hdfs-dev mailing list archives

From "Eli Collins (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HDFS-2182) Exceptions in DataXceiver#run can result in a zombie datanode
Date Thu, 21 Jul 2011 20:14:58 GMT
Exceptions in DataXceiver#run can result in a zombie datanode 

                 Key: HDFS-2182
                 URL: https://issues.apache.org/jira/browse/HDFS-2182
             Project: Hadoop HDFS
          Issue Type: Bug
          Components: data-node
            Reporter: Eli Collins
             Fix For: 0.23.0

DataXceiver#run currently swallows all exceptions. It should instead propagate them up to
DataXceiverServer#run, which can then decide whether to tolerate the exception or exit the
daemon. An IOException should be tolerated, as it is today, because it likely reflects a problem
with one particular thread or an intermittent failure. A java.lang.Error, for example, should not be.

This came up in the following bug I'm seeing on a test cluster: if, for example, a NoClassDefFoundError
is thrown in DataXceiver#run (because the host jars were replaced out from underneath it, it
ran out of descriptors, etc.), we end up with a datanode that is alive but always fails
because it can't create any DataXceiver threads. In this case the datanode should shut itself
down rather than continue to run.
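
The proposed policy can be illustrated with a minimal sketch. This is not the actual
DataXceiverServer code: the class XceiverServerSketch and the methods isFatal and serve are
hypothetical names, and the tasks run inline rather than in their own threads, which sidesteps
the question of how a worker thread would report its failure back to the server. How a
RuntimeException should be treated is not specified above, so the sketch assumes it is tolerated.

```java
import java.io.IOException;
import java.util.List;
import java.util.concurrent.Callable;

// Hypothetical sketch of the policy described in this issue; not Hadoop code.
public class XceiverServerSketch {

    // Assumed policy: a java.lang.Error (e.g. NoClassDefFoundError) is fatal,
    // while an IOException or any other Exception is tolerated.
    static boolean isFatal(Throwable t) {
        return t instanceof Error;
    }

    // Stand-in for the server loop. Runs each task in order and stops at the
    // first fatal failure. Returns the number of tasks that were attempted.
    static int serve(List<Callable<Void>> tasks) {
        int attempted = 0;
        for (Callable<Void> task : tasks) {
            attempted++;
            try {
                task.call();
            } catch (Throwable t) {
                if (isFatal(t)) {
                    // A real daemon would log the error and shut itself down here.
                    return attempted;
                }
                // Tolerated: likely limited to this one request, so keep serving.
            }
        }
        return attempted;
    }

    public static void main(String[] args) {
        List<Callable<Void>> tasks = List.of(
            () -> null,
            () -> { throw new IOException("transient failure"); },
            () -> { throw new NoClassDefFoundError("jar replaced"); },
            () -> null);
        System.out.println("tasks attempted before stopping: " + serve(tasks));
    }
}
```

In this sketch the loop survives the IOException but stops at the NoClassDefFoundError, so the
last task is never attempted.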


