hadoop-hdfs-issues mailing list archives

From "sravankorumilli (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-1887) If DataNode gets killed after 'data.dir' is created, but before LAYOUTVERSION is written to the storage file. The further restarts of the DataNode, an EOFException will be thrown while reading the storage file.
Date Thu, 05 May 2011 15:07:03 GMT

    [ https://issues.apache.org/jira/browse/HDFS-1887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13029357#comment-13029357 ]

sravankorumilli commented on HDFS-1887:
---------------------------------------

Solution: I have fixed this by catching the EOFException in the method
DataStorage.isConversionNeeded, deleting the file, and returning false.
The DataNode then starts successfully, and there is no data loss. I have
tested this and it looks fine to me. I can provide the patch, or am I missing something?
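
A minimal sketch of the idea, not the actual patch: the class name, the method
signature (taking a File directly), and the threshold constant below are all
hypothetical stand-ins. It only illustrates treating a storage file that is too
short to hold a version integer as "no conversion needed" after deleting it.

```java
import java.io.EOFException;
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;

class StorageCheckSketch {
    // Hypothetical placeholder for the layout-version constant the real code compares against.
    static final int HYPOTHETICAL_LAYOUT_THRESHOLD = -3;

    /**
     * Returns false if the storage file is missing or truncated (the truncated
     * file is deleted first); otherwise decides from the stored version.
     */
    static boolean isConversionNeeded(File storageFile) throws IOException {
        if (!storageFile.exists()) {
            return false;
        }
        boolean truncated = false;
        int version = 0;
        try (RandomAccessFile raf = new RandomAccessFile(storageFile, "r")) {
            version = raf.readInt(); // throws EOFException if fewer than 4 bytes exist
        } catch (EOFException e) {
            truncated = true; // the file was created but the version was never written
        }
        if (truncated) {
            // The file handle is already closed here, so the delete can proceed.
            if (!storageFile.delete()) {
                throw new IOException("Could not delete truncated storage file: " + storageFile);
            }
            return false;
        }
        // Hypothetical comparison; the real rule lives in DataStorage.
        return version >= HYPOTHETICAL_LAYOUT_THRESHOLD;
    }

    public static void main(String[] args) throws IOException {
        File empty = File.createTempFile("storage", null); // zero-length file
        System.out.println("conversion needed: " + isConversionNeeded(empty));
        System.out.println("file still exists: " + empty.exists());
    }
}
```

Deleting the file only after the EOFException has been caught and the handle
closed keeps the recovery limited to the one case described above; any other
IOException still propagates to the caller.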

One more scenario: this problem can also occur during normal DataNode restarts. If the
storage file is not present, the DataNode tries to recreate it, so if the DataNode is
killed before LAYOUT_VERSION is written, further restarts fail in the same
way.
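
The failure mode common to both scenarios can be reproduced in isolation. The
sketch below uses a hypothetical helper class and a temporary file standing in
for the storage file; the version value written at the end is arbitrary. It
shows only that RandomAccessFile.readInt hits end-of-file on a file that was
created but never written to.

```java
import java.io.EOFException;
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;

class TruncatedStorageRepro {
    /** Returns true if reading a 4-byte int from the start of the file hits end-of-file. */
    static boolean readIntHitsEof(File f) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(f, "r")) {
            raf.readInt();
            return false;
        } catch (EOFException e) {
            return true;
        }
    }

    public static void main(String[] args) throws IOException {
        // Simulates a process killed right after the file was created.
        File storage = File.createTempFile("storage", null);
        storage.deleteOnExit();
        System.out.println("empty file hits EOF: " + readIntHitsEof(storage));

        // Once a version integer has been written, the read succeeds.
        try (RandomAccessFile raf = new RandomAccessFile(storage, "rw")) {
            raf.writeInt(-18); // arbitrary illustrative value
        }
        System.out.println("written file hits EOF: " + readIntHitsEof(storage));
    }
}
```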

> If DataNode gets killed after 'data.dir' is created, but before LAYOUTVERSION is written
> to the storage file. The further restarts of the DataNode, an EOFException will be thrown
> while reading the storage file.
> -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-1887
>                 URL: https://issues.apache.org/jira/browse/HDFS-1887
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: data-node
>    Affects Versions: 0.20.1, 0.21.0, 0.23.0
>         Environment: Linux
>            Reporter: sravankorumilli
>            Priority: Minor
>
> Assume the DataNode gets killed after 'data.dir' is created, but before LAYOUTVERSION is
> written to the storage file. On every subsequent restart of the DataNode, an EOFException is
> thrown while reading the storage file. The DataNode cannot be restarted successfully until
> the 'data.dir' is deleted.
> These are the corresponding logs:-
> 2011-05-02 19:12:19,389 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: java.io.EOFException
> at java.io.RandomAccessFile.readInt(RandomAccessFile.java:725)
> at org.apache.hadoop.hdfs.server.datanode.DataStorage.isConversionNeeded(DataStorage.java:203)
> at org.apache.hadoop.hdfs.server.common.Storage.checkConversionNeeded(Storage.java:697)
> at org.apache.hadoop.hdfs.server.common.Storage.access$000(Storage.java:62)
> at org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.analyzeStorage(Storage.java:476)
> at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:116)
> at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:336)
> at org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:260)
> at org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:237)
> at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1440)
> at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1393)
> at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1407)
> at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:1552)
> Our Hadoop cluster is managed by cluster management software that tries to eliminate
> any manual intervention in setting up and managing the cluster. But in the scenario
> described above, manual intervention is required to recover the DataNode.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
