Mailing-List: contact hdfs-issues-help@hadoop.apache.org; run by ezmlm
Precedence: bulk
Reply-To: hdfs-issues@hadoop.apache.org
Date: Wed, 27 Apr 2016 06:50:13 +0000 (UTC)
From: "Brahma Reddy Battula (JIRA)" <jira@apache.org>
To: hdfs-issues@hadoop.apache.org
Message-ID: <JIRA.12953497.1458882922000.43721.1461739813079@Atlassian.JIRA>
In-Reply-To: <JIRA.12953497.1458882922000@Atlassian.JIRA>
References: <JIRA.12953497.1458882922000@Atlassian.JIRA>
 <JIRA.12953497.1458882922210@arcas>
Subject: [jira] [Commented] (HDFS-10214) Checkpoint Can not be done by
 StandbyNameNode.Because checkpoint may cause DataNode
 blockReport.blockReceivedAndDeleted.heartbeat rpc timeout when the object
 num > 100000000.
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit


    [ https://issues.apache.org/jira/browse/HDFS-10214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15259655#comment-15259655 ] 

Brahma Reddy Battula commented on HDFS-10214:
---------------------------------------------

Linked to the duplicate JIRA (HDFS-7097).

> Checkpoint Can not be done by StandbyNameNode.Because checkpoint may cause DataNode blockReport.blockReceivedAndDeleted.heartbeat rpc timeout when the object num > 100000000.
> ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-10214
>                 URL: https://issues.apache.org/jira/browse/HDFS-10214
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: ha, namenode
>    Affects Versions: 2.5.0, 2.6.4
>         Environment: 500 DataNode.
> 137407265 files and directories, 129614074 blocks = 267021339 total filesystem object(s)
>            Reporter: ChenFolin
>             Fix For: 2.7.2
>
>   Original Estimate: 672h
>  Remaining Estimate: 672h
>
> Current cluster status:
> 137407265 files and directories, 129614074 blocks = 267021339 total filesystem object(s).
> Saving the namespace during a checkpoint takes more than 5 minutes.
> During that time, DataNode RPCs time out.
> The Standby NameNode then skips those DataNode RPC requests (because the RPC timed out, the DataNode has already closed the socket channel).
> As a result, there are many corrupt files after a failover.
> So the checkpoint could be done by another component instead of the Standby NameNode.

