hadoop-hdfs-issues mailing list archives

From "sravankorumilli (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HDFS-1887) If DataNode gets killed after 'data.dir' is created, but before LAYOUTVERSION is written to the storage file. The further restarts of the DataNode, an EOFException will be thrown while reading the storage file.
Date Wed, 04 May 2011 16:22:03 GMT
If DataNode gets killed after 'data.dir' is created, but before LAYOUTVERSION is written to
the storage file. The further restarts of the DataNode, an EOFException will be thrown while
reading the storage file. 
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

                 Key: HDFS-1887
                 URL: https://issues.apache.org/jira/browse/HDFS-1887
             Project: Hadoop HDFS
          Issue Type: Bug
          Components: data-node
    Affects Versions: 0.21.0, 0.20.1, 0.23.0
         Environment: Linux
            Reporter: sravankorumilli
            Priority: Minor


Assume the DataNode is killed after 'data.dir' is created, but before LAYOUTVERSION is written
to the storage file. On every subsequent restart of the DataNode, an EOFException is thrown
while reading the storage file. The DataNode cannot be restarted successfully until 'data.dir'
is deleted.

These are the corresponding logs:
2011-05-02 19:12:19,389 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: java.io.EOFException
at java.io.RandomAccessFile.readInt(RandomAccessFile.java:725)
at org.apache.hadoop.hdfs.server.datanode.DataStorage.isConversionNeeded(DataStorage.java:203)
at org.apache.hadoop.hdfs.server.common.Storage.checkConversionNeeded(Storage.java:697)
at org.apache.hadoop.hdfs.server.common.Storage.access$000(Storage.java:62)
at org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.analyzeStorage(Storage.java:476)
at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:116)
at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:336)
at org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:260)
at org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:237)
at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1440)
at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1393)
at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1407)
at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:1552)
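
The top frame of the trace, RandomAccessFile.readInt, throws EOFException whenever fewer than
four bytes remain in the file. The following is a minimal standalone sketch of that failure
mode, not the HDFS code itself: the class name, both helper methods, and the placeholder
value written to the file are hypothetical and chosen only for illustration. It also shows one
possible guard, a length check before reading, as an assumption about how a caller could
detect an incompletely written storage file.

```java
import java.io.EOFException;
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;

// Standalone illustration; none of these names come from Hadoop.
class EmptyStorageFileDemo {

    // Hypothetical helper: reads a leading 4-byte int, as the top frame of the trace does.
    static int readLayoutVersion(File storageFile) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(storageFile, "r")) {
            // Throws EOFException if the file holds fewer than 4 bytes.
            return raf.readInt();
        }
    }

    // Hypothetical guard: true only if the file is long enough to hold the int.
    static boolean hasLayoutVersion(File storageFile) {
        return storageFile.isFile() && storageFile.length() >= Integer.BYTES;
    }

    public static void main(String[] args) throws IOException {
        // Simulate the crash window: the file exists but nothing was written to it.
        File storageFile = File.createTempFile("storage", ".tmp");
        storageFile.deleteOnExit();

        System.out.println("has layout version: " + hasLayoutVersion(storageFile));
        try {
            readLayoutVersion(storageFile);
        } catch (EOFException e) {
            System.out.println("EOFException while reading the empty file");
        }

        // Simulate a completed write with an arbitrary placeholder value.
        try (RandomAccessFile raf = new RandomAccessFile(storageFile, "rw")) {
            raf.writeInt(-18);
        }
        System.out.println("has layout version: " + hasLayoutVersion(storageFile));
        System.out.println("value read back: " + readLayoutVersion(storageFile));
    }
}
```

With a check of this kind, a zero-length file left behind by an interrupted first start could be
told apart from a valid one without relying on the exception. Whether such a file should then
be treated as unformatted storage is a design decision for the DataNode and is not settled by
this sketch.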

Our Hadoop cluster is managed by cluster management software that tries to eliminate any
manual intervention in setting up and managing the cluster. In the scenario described above,
however, manual intervention is required to recover the DataNode.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
