hadoop-hdfs-dev mailing list archives

From "Xiaoqiao He (Jira)" <j...@apache.org>
Subject [jira] [Created] (HDFS-15113) Missing IBR when NameNode restart if open processCommand async feature
Date Sat, 11 Jan 2020 11:50:00 GMT
Xiaoqiao He created HDFS-15113:

             Summary: Missing IBR when NameNode restart if open processCommand async feature
                 Key: HDFS-15113
                 URL: https://issues.apache.org/jira/browse/HDFS-15113
             Project: Hadoop HDFS
          Issue Type: Bug
          Components: datanode
            Reporter: Xiaoqiao He
            Assignee: Xiaoqiao He

Recently I hit a case where the NameNode was missing blocks after a restart. It is related to the
async processCommand feature and happens as follows:
a. While the NameNode is restarting, it returns the `DNA_REGISTER` command to a DataNode when it
receives certain RPC requests from that DataNode.
b. When the DataNode receives the `DNA_REGISTER` command, it runs #reRegister asynchronously (excerpt):
  void reRegister() throws IOException {
    if (shouldRun()) {
      // re-retrieve namespace info to make sure that, if the NN
      // was restarted, we still match its version (HDFS-2120)
      NamespaceInfo nsInfo = retrieveNamespaceInfo();
      // and re-register
      // HDFS-9917,Standby NN IBR can be very huge if standby namenode is down
      // for sometime.
      if (state == HAServiceState.STANDBY || state == HAServiceState.OBSERVER) {
c. As we know, #register triggers a block report (BR) immediately.
d. Because #reRegister runs asynchronously, there is no guarantee whether sending the full block
report (FBR) or clearing the incremental block reports (IBRs) happens first. If the IBRs are cleared
first, everything is fine. But if the FBR is sent first and the IBRs are cleared afterwards, the
blocks received between those two points in time stay missing on the NameNode until the next FBR.
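
The ordering problem in step d can be illustrated with a small toy model. This is only a sketch under stated assumptions: the class `IbrRaceSketch`, its methods, and the block names are hypothetical and are not Hadoop's DataNode code. The sketch replays the two possible orders deterministically instead of using real threads, with one block arriving between the two steps in each case.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Toy model only: every name here is hypothetical, not part of Hadoop.
class IbrRaceSketch {
  // Blocks stored on the toy DataNode.
  private final Set<String> stored = new LinkedHashSet<>();
  // Blocks queued for the next incremental block report (IBR).
  private final Set<String> pendingIbr = new LinkedHashSet<>();
  // Blocks the toy NameNode currently knows about.
  private final Set<String> knownToNameNode = new LinkedHashSet<>();

  // A newly received block is stored and queued for the next IBR.
  void receiveBlock(String id) {
    stored.add(id);
    pendingIbr.add(id);
  }

  // A full block report (FBR) replaces the NameNode's view with all stored blocks.
  void sendFbr() {
    knownToNameNode.clear();
    knownToNameNode.addAll(stored);
  }

  // Drop queued IBR entries without sending them.
  void clearIbrs() {
    pendingIbr.clear();
  }

  // Send whatever is still queued, then empty the queue.
  void sendIbr() {
    knownToNameNode.addAll(pendingIbr);
    pendingIbr.clear();
  }

  // Replays one ordering and returns the blocks the NameNode does not know about.
  static Set<String> missingBlocks(boolean clearBeforeFbr) {
    IbrRaceSketch dn = new IbrRaceSketch();
    dn.receiveBlock("blk_1");
    if (clearBeforeFbr) {
      dn.clearIbrs();
      dn.receiveBlock("blk_2"); // arrives between the two steps
      dn.sendFbr();             // the FBR still covers blk_2
    } else {
      dn.sendFbr();
      dn.receiveBlock("blk_2"); // arrives between the two steps
      dn.clearIbrs();           // the queued IBR entry for blk_2 is dropped
    }
    dn.sendIbr();
    Set<String> missing = new LinkedHashSet<>(dn.stored);
    missing.removeAll(dn.knownToNameNode);
    return missing;
  }

  public static void main(String[] args) {
    System.out.println("clear IBRs, then FBR: missing " + missingBlocks(true));
    System.out.println("FBR, then clear IBRs: missing " + missingBlocks(false));
  }
}
```

In this model the first order leaves nothing missing, while the second leaves `blk_2` unknown to the NameNode until a later FBR, which matches the behaviour described above.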

