hadoop-hdfs-dev mailing list archives

From "Dan Bradley (Created) (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HDFS-2632) existing in_use.lock file is removed after failing to lock this file
Date Mon, 05 Dec 2011 19:28:39 GMT
existing in_use.lock file is removed after failing to lock this file

                 Key: HDFS-2632
                 URL: https://issues.apache.org/jira/browse/HDFS-2632
             Project: Hadoop HDFS
          Issue Type: Bug
          Components: name-node
    Affects Versions: 0.21.0
         Environment: Scientific Linux 5.3
            Reporter: Dan Bradley

If an attempt is made to start the namenode when it is already running, an exception is generated
on failure to lock in_use.lock.  However, there is a bug: in_use.lock is deleted!  After that,
if another attempt is made to start the namenode, there is no in_use.lock file, so the new
instance goes ahead and starts modifying the namenode state files.  It eventually fails
to bind to the TCP port, but it has already done damage by that time.  Specifically, the 'edits'
file being written to by the running instance is moved to 'previous.checkpoint' so all further
transactions are lost when the HDFS service is next restarted.  We observed a case of data
loss because of this.
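
The intended behavior can be sketched roughly as follows. This is only an illustration
of the locking pattern, not the actual namenode storage code: the class name
StorageLockSketch and the method tryLock are hypothetical, and only the file name
in_use.lock comes from this report. The point it shows is that a failed lock attempt
closes its own file handle and leaves the existing in_use.lock file in place.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.channels.FileLock;
import java.nio.channels.OverlappingFileLockException;

// Hypothetical sketch; names are illustrative and not taken from Hadoop.
final class StorageLockSketch {

    /**
     * Tries to take an exclusive lock on dir/in_use.lock.
     * Returns the lock on success, or null if the file is already locked.
     * On failure the lock file is deliberately NOT deleted, because it
     * belongs to the instance that is still running.
     */
    static FileLock tryLock(File dir) throws IOException {
        File lockFile = new File(dir, "in_use.lock");
        RandomAccessFile raf = new RandomAccessFile(lockFile, "rws");
        FileLock lock = null;
        try {
            // Returns null when another process holds the lock.
            lock = raf.getChannel().tryLock();
        } catch (OverlappingFileLockException e) {
            // The lock is already held inside this same JVM.
        } finally {
            if (lock == null) {
                // Close only our own handle; never call lockFile.delete() here.
                // Assumption: the holder is a separate process, as in this report.
                raf.close();
            }
        }
        return lock;
    }
}
```

A caller would treat a null result as "another instance is running" and abort startup
before touching any state files in the storage directory.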

This issue relates to HDFS-1690, but the problem in HDFS-1690 was stated in a way that is
specific to -format.
