Date: Thu, 11 Feb 2016 04:39:18 +0000 (UTC)
From: "Vinayakumar B (JIRA)"
To: hdfs-issues@hadoop.apache.org
Subject: [jira] [Commented] (HDFS-9787) SNNs stop uploading FSImage to ANN once isPrimaryCheckPointer changed to false.

    [ https://issues.apache.org/jira/browse/HDFS-9787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15142240#comment-15142240 ]

Vinayakumar B commented on HDFS-9787:
-------------------------------------

{quote}The original solution was an attempt to make sure we don't flood the NN with checkpoint requests. Instead, maybe the better solution would be to do a small RPC to see when the latest image was uploaded.
If it was uploaded more than quietMultiplier times the checkpoint period ago, then we attempt to upload the checkpoint. It's a bit more work, but I think this lays out the intentions in the code more clearly and obtains the same effect without the overhead of actually sending the checkpoint along each time we want to find out if it's behind.{quote}

Yes, that's required to optimize the current approach. But I feel it could be done in a follow-up Jira. First, let's fix the current bug. Agree?

I see that the patch fixes the issue mentioned in this Jira.

+1 for the patch.

> SNNs stop uploading FSImage to ANN once isPrimaryCheckPointer changed to false.
> -------------------------------------------------------------------------------
>
>                 Key: HDFS-9787
>                 URL: https://issues.apache.org/jira/browse/HDFS-9787
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: ha
>    Affects Versions: 3.0.0
>            Reporter: Guocui Mi
>            Assignee: Guocui Mi
>         Attachments: HDFS-9786-v000.patch
>
>
> SNNs stop uploading FSImage to ANN once isPrimaryCheckPointer becomes false.
> Here is the logic that decides whether to upload the FSImage or not.
> In StandbyCheckpointer.java:
> boolean sendRequest = isPrimaryCheckPointer || secsSinceLast >= checkpointConf.getQuietPeriod();
> doCheckpoint(sendRequest);
> sendRequest is always false if isPrimaryCheckPointer is false, given that secsSinceLast (~checkpointPeriod) >= checkpointConf.getQuietPeriod() (checkpointPeriod * this.quietMultiplier (default value 1.5)) always returns false.
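To make the arithmetic in the issue description concrete, here is a minimal standalone sketch of the quoted condition. It is not the code from StandbyCheckpointer.java or from the attached patch: the class name, the helper methods, and the 3600-second checkpoint period are assumptions for illustration. Only the boolean expression and the 1.5 multiplier come from the description above.

```java
public class CheckpointDecisionSketch {

    // The condition as quoted in the issue description.
    static boolean sendRequest(boolean isPrimaryCheckPointer,
                               long secsSinceLast,
                               long quietPeriodSecs) {
        return isPrimaryCheckPointer || secsSinceLast >= quietPeriodSecs;
    }

    // Quiet period as described in the issue: checkpointPeriod * quietMultiplier.
    static long quietPeriod(long checkpointPeriodSecs, double quietMultiplier) {
        return (long) (checkpointPeriodSecs * quietMultiplier);
    }

    // Counts how many checkpoint cycles would upload an image for a standby
    // whose isPrimaryCheckPointer flag is false.
    static int countUploads(int cycles, long checkpointPeriodSecs, double quietMultiplier) {
        long quiet = quietPeriod(checkpointPeriodSecs, quietMultiplier);
        int uploads = 0;
        for (int i = 0; i < cycles; i++) {
            // Assumption taken from the description: a checkpoint runs roughly
            // once per checkpoint period, so secsSinceLast ~ checkpointPeriod.
            long secsSinceLast = checkpointPeriodSecs;
            if (sendRequest(false, secsSinceLast, quiet)) {
                uploads++;
            }
        }
        return uploads;
    }

    public static void main(String[] args) {
        long period = 3600; // hypothetical checkpoint period in seconds
        System.out.println("quietPeriod = " + quietPeriod(period, 1.5));
        System.out.println("uploads in 10 cycles = " + countUploads(10, period, 1.5));
    }
}
```

With any multiplier greater than 1, secsSinceLast stays below the quiet period on every cycle, so a non-primary checkpointer never sends the image under these assumptions.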