hadoop-common-dev mailing list archives

From "Philippe Gassmann (JIRA)" <j...@apache.org>
Subject [jira] Created: (HADOOP-760) HDFS edits log file corrupted can lead to a major loss of data.
Date Wed, 29 Nov 2006 09:47:21 GMT
HDFS edits log file corrupted can lead to a major loss of data.
---------------------------------------------------------------

                 Key: HADOOP-760
                 URL: http://issues.apache.org/jira/browse/HADOOP-760
             Project: Hadoop
          Issue Type: Bug
          Components: dfs
    Affects Versions: 0.6.1
            Reporter: Philippe Gassmann
            Priority: Critical


On one of our test systems, our HDFS became corrupted after the edits log file was corrupted
(I can explain how).

When we restarted HDFS, the namenode refused to start, with an exception in hadoop-namenode-xxx.out.

Unfortunately, an rm mistake was made, and I was not able to save this exception anywhere.


But it was an ArrayIndexOutOfBoundsException somewhere in a UTF8 method called from FSEditLog.loadFSEdits.

The result: the namenode was unable to start, and the only way to fix it was to remove
the edits log file.

As it was a test machine, we did not have any backup, so all files created in HDFS since
the last start of the namenode were lost.

Is there a way to periodically commit changes to the fsimage instead of keeping a
huge log file? (e.g. every 10 minutes or so.)
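
The periodic commit asked about above could look roughly like the following. This is only
a sketch of the general checkpointing idea, not Hadoop code: the class CheckpointSketch, its
methods, and the one-line-per-entry file format are all hypothetical, chosen for illustration.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration; none of these names come from Hadoop.
class CheckpointSketch {
    private final Path image;   // full snapshot of the state
    private final Path edits;   // changes made since the last snapshot
    private final List<String> state = new ArrayList<>();

    CheckpointSketch(Path dir) {
        this.image = dir.resolve("fsimage");
        this.edits = dir.resolve("edits");
    }

    /** Applies one change in memory and appends it to the edits log. */
    synchronized void logEdit(String entry) throws IOException {
        state.add(entry);
        Files.write(edits, (entry + "\n").getBytes(StandardCharsets.UTF_8),
                StandardOpenOption.CREATE, StandardOpenOption.APPEND);
    }

    /** Writes the full state to a new image, swaps it in, then empties the log. */
    synchronized void checkpoint() throws IOException {
        Path tmp = image.resolveSibling("fsimage.tmp");
        Files.write(tmp, state, StandardCharsets.UTF_8);
        // Rename so that readers see either the old image or the new one.
        Files.move(tmp, image, StandardCopyOption.ATOMIC_MOVE);
        // Default options truncate the file, leaving an empty edits log.
        Files.write(edits, new byte[0]);
    }
}
```

A timer (for example a java.util.concurrent.ScheduledExecutorService calling checkpoint()
every 10 minutes) would then bound the size of the edits file. A real implementation would
also have to cope with a crash between the rename and the truncation.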

Even if the namenode files are rsync'ed, what can be done in that particular case? (i.e. if we
periodically rsync the fsimage together with its corrupted edits file.)
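
For the rsync case, one possible mitigation is to checksum each log record so that a reader
can replay the undamaged prefix and stop at the first bad record instead of failing outright.
Again this is only a sketch and not a description of the HDFS edits format: the record layout
and the names ChecksummedLog, writeRecord, and readValidPrefix are assumptions made for this
example.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.CRC32;

// Hypothetical illustration; none of these names come from Hadoop.
class ChecksummedLog {
    /** Appends one record as: int length, payload bytes, long CRC32 of the payload. */
    static void writeRecord(DataOutputStream out, String entry) throws IOException {
        byte[] payload = entry.getBytes(StandardCharsets.UTF_8);
        CRC32 crc = new CRC32();
        crc.update(payload, 0, payload.length);
        out.writeInt(payload.length);
        out.write(payload);
        out.writeLong(crc.getValue());
    }

    /** Returns every record up to, but not including, the first damaged one. */
    static List<String> readValidPrefix(byte[] log) throws IOException {
        List<String> records = new ArrayList<>();
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(log));
        try {
            while (in.available() > 0) {
                int length = in.readInt();
                if (length < 0 || length > in.available()) {
                    break; // implausible length: treat as corruption
                }
                byte[] payload = new byte[length];
                in.readFully(payload);
                long stored = in.readLong();
                CRC32 crc = new CRC32();
                crc.update(payload, 0, payload.length);
                if (crc.getValue() != stored) {
                    break; // payload or checksum was damaged
                }
                records.add(new String(payload, StandardCharsets.UTF_8));
            }
        } catch (EOFException e) {
            // Truncated final record: keep what was read so far.
        }
        return records;
    }
}
```

With such a layout, a copy whose tail is damaged loses only the edits after the damaged
record, and the reader can report where it stopped.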

This issue affects HDFS version 0.6.1. After looking at the Hadoop trunk code, I am not
able to say whether this can still happen... (I would say yes, because the UTF8 class is
used in the same way as in 0.6.1.)




-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira
