hadoop-hdfs-dev mailing list archives

From "Thanh Do (JIRA)" <j...@apache.org>
Subject [jira] Created: (HDFS-1220) Namenode unable to start due to truncated fstime
Date Thu, 17 Jun 2010 03:50:23 GMT
Namenode unable to start due to truncated fstime

                 Key: HDFS-1220
                 URL: https://issues.apache.org/jira/browse/HDFS-1220
             Project: Hadoop HDFS
          Issue Type: Bug
          Components: name-node
    Affects Versions: 0.20.1
            Reporter: Thanh Do

- Summary: updating the fstime file on disk is not atomic. If a crash happens
in the middle of the update, the NameNode reads a truncated fstime file on its
next restart and is unable to start.
- Details:
Below is the code that updates the fstime file on disk:
  void writeCheckpointTime(StorageDirectory sd) throws IOException {
    if (checkpointTime < 0L)
      return; // do not write negative time
    File timeFile = getImageFile(sd, NameNodeFile.TIME);
    if (timeFile.exists()) { timeFile.delete(); }
    DataOutputStream out = new DataOutputStream(
        new FileOutputStream(timeFile));
    try {
      out.writeLong(checkpointTime);
    } finally {
      out.close();
    }
  }
Basically, this involves 3 steps:
1) delete fstime file (timeFile.delete())
2) truncate fstime file (new FileOutputStream(timeFile))
3) write new time to fstime file (out.writeLong(checkpointTime))
If a crash happens after step 2 and before step 3, then on the next reboot the
NameNode gets an exception when reading the time (8 bytes) from an empty fstime file.
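
One common way to close this window is to write the new value to a temporary
file, force it to disk, and then rename it over fstime. The sketch below
illustrates that pattern only; it is not the fix adopted for this issue. The
class AtomicTimeFile, its methods, and the ".tmp" suffix are hypothetical names,
and it assumes Java 7+ (java.nio.file) and a file system where the rename is
atomic. It does not sync the parent directory, which a production version would
likely need. readTime also shows the failure mode: DataInputStream.readLong
throws EOFException when the file holds fewer than 8 bytes.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.StandardCopyOption;

// Hypothetical helper for illustration; not part of Hadoop.
final class AtomicTimeFile {

  // Writes checkpointTime to a temporary sibling file, forces it to disk,
  // then renames it over timeFile. Assuming the rename is atomic, a reader
  // sees either the old 8 bytes or the new 8 bytes, never an empty file.
  static void writeTimeAtomically(File timeFile, long checkpointTime)
      throws IOException {
    if (checkpointTime < 0L) {
      return; // do not write negative time
    }
    File target = timeFile.getAbsoluteFile();
    File tmp = new File(target.getParentFile(), target.getName() + ".tmp");
    try (FileOutputStream fos = new FileOutputStream(tmp);
         DataOutputStream out = new DataOutputStream(fos)) {
      out.writeLong(checkpointTime);
      out.flush();
      fos.getFD().sync(); // push the 8 bytes to the device before renaming
    }
    // ATOMIC_MOVE maps to rename(2) on POSIX file systems, which replaces an
    // existing target; elsewhere, replacement is implementation specific.
    Files.move(tmp.toPath(), target.toPath(), StandardCopyOption.ATOMIC_MOVE);
  }

  // Reads the 8-byte time; throws EOFException if the file is shorter.
  static long readTime(File timeFile) throws IOException {
    try (DataInputStream in =
             new DataInputStream(new FileInputStream(timeFile))) {
      return in.readLong();
    }
  }

  public static void main(String[] args) throws IOException {
    File dir = Files.createTempDirectory("fstime-demo").toFile();
    File timeFile = new File(dir, "fstime");
    writeTimeAtomically(timeFile, 1276746623000L);
    System.out.println(readTime(timeFile));
  }
}
```

With this ordering, a crash before the rename leaves the previous fstime
untouched (plus a leftover temporary file), and a crash after it leaves the new
value in place.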

This bug was found by our Failure Testing Service framework:
For questions, please email us: Thanh Do (thanhdo@cs.wisc.edu) and
Haryadi Gunawi (haryadi@eecs.berkeley.edu)

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
