hadoop-hdfs-issues mailing list archives

From "ChenFolin (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HDFS-10214) Checkpoint Can not be done by StandbyNameNode.Because checkpoint may cause DataNode blockReport.blockReceivedAndDeleted.heartbeat rpc timeout when the object num > 100000000.
Date Fri, 25 Mar 2016 05:15:25 GMT
ChenFolin created HDFS-10214:
--------------------------------

             Summary: Checkpoint Can not be done by StandbyNameNode.Because checkpoint may cause DataNode blockReport.blockReceivedAndDeleted.heartbeat rpc timeout when the object num > 100000000.
                 Key: HDFS-10214
                 URL: https://issues.apache.org/jira/browse/HDFS-10214
             Project: Hadoop HDFS
          Issue Type: New Feature
          Components: ha, namenode
    Affects Versions: 2.6.4, 2.5.0
         Environment: 500 DataNodes.
                      137407265 files and directories, 129614074 blocks = 267021339 total filesystem object(s)
            Reporter: ChenFolin


Current cluster status:
137407265 files and directories, 129614074 blocks = 267021339 total filesystem object(s).

Saving the namespace during a checkpoint takes more than 5 minutes.

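While the checkpoint runs, DataNode RPCs to the Standby NameNode (blockReport, blockReceivedAndDeleted, heartbeat) time out.

The Standby NameNode then skips those DataNode RPC requests, because the DataNode closes the socket channel once its RPC has timed out.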

After a failover, there are many corrupt files.

So the checkpoint should perhaps be done by another component, not by the Standby NameNode.
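
To make the timeout sequence above concrete, here is a rough back-of-the-envelope model in Python. It is only a sketch under stated assumptions, not how the NameNode is implemented: it assumes the Standby NameNode answers no DataNode RPC for the whole duration of the checkpoint, and the 3 s heartbeat interval and 60 s client timeout are assumed values, not figures from this report. The function name timed_out_requests is made up for the illustration.

```python
def timed_out_requests(arrivals, busy_start, busy_seconds, rpc_timeout):
    """Return the arrival times of requests that wait longer than rpc_timeout.

    Toy model: the server answers nothing during
    [busy_start, busy_start + busy_seconds), and every request that
    arrives in that window is served the moment the window ends.
    """
    busy_end = busy_start + busy_seconds
    dropped = []
    for t in arrivals:
        # Requests outside the busy window are served immediately.
        wait = busy_end - t if busy_start <= t < busy_end else 0
        if wait > rpc_timeout:
            dropped.append(t)
    return dropped


# From the report: saving the namespace takes more than 5 minutes (300 s used here).
CHECKPOINT_SECONDS = 300
# Assumptions, not from the report: one heartbeat every 3 s, 60 s client timeout.
HEARTBEAT_INTERVAL = 3
RPC_TIMEOUT = 60

heartbeats = list(range(0, CHECKPOINT_SECONDS, HEARTBEAT_INTERVAL))
dropped = timed_out_requests(heartbeats, 0, CHECKPOINT_SECONDS, RPC_TIMEOUT)
print(f"{len(dropped)} of {len(heartbeats)} heartbeats from one DataNode time out")
```

Under these assumed values, every request that arrives more than 60 s before the checkpoint ends is abandoned by the DataNode, which matches the skipped requests described above. The same reasoning applies to blockReport and blockReceivedAndDeleted calls.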



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
