Date: Thu, 31 Oct 2013 23:28:19 +0000 (UTC)
From: "Andrew Wang (JIRA)"
To: hdfs-issues@hadoop.apache.org
Subject: [jira] [Updated] (HDFS-5037) Active NN should trigger its own edit log rolls

     [ https://issues.apache.org/jira/browse/HDFS-5037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Wang updated HDFS-5037:
------------------------------

    Attachment: hdfs-5037-3.patch

Thanks for the review, ATM. New patch which addresses your comments about a 2x multiplier and InterruptedException. I also realized I was using the wrong default for the check interval, so I fixed that too.

> Active NN should trigger its own edit log rolls
> -----------------------------------------------
>
>                 Key: HDFS-5037
>                 URL: https://issues.apache.org/jira/browse/HDFS-5037
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: ha, namenode
>    Affects Versions: 3.0.0, 2.1.0-beta
>            Reporter: Todd Lipcon
>            Assignee: Andrew Wang
>            Priority: Critical
>         Attachments: hdfs-5037-1.patch, hdfs-5037-2.patch, hdfs-5037-3.patch
>
>
> We've seen cases where the SBN/2NN went down, and then users accumulated very large edit log segments. This causes a slow startup time because the last edit log segment must be read fully to recover it before the NN can start up again. Additionally, in the case of QJM, it can trigger timeouts on recovery or edit log syncing because the very large segment has to be processed within a certain time bound.
> We could easily improve this by having the NN trigger its own edit log rolls at a configurable size (e.g., every 256MB).
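
For illustration only, the idea in the description (a periodic check that rolls the open segment once it passes a configurable size) could be sketched roughly as below. This is not the code in hdfs-5037-3.patch: the `EditLog` interface, `FakeEditLog`, `Roller`, and the byte-based threshold are all hypothetical names and assumptions made for the sketch, and the real NameNode classes are not used.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical sketch of a self-triggered edit log roll; not the actual patch. */
class AutoRollSketch {

    /** Assumed minimal view of the active NN's edit log (hypothetical interface). */
    interface EditLog {
        long currentSegmentBytes();
        void rollSegment();
    }

    /** In-memory stand-in used only to exercise the roller. */
    static class FakeEditLog implements EditLog {
        private final AtomicLong bytes = new AtomicLong();
        private final AtomicLong rolls = new AtomicLong();

        void append(long n) { bytes.addAndGet(n); }
        long rollCount() { return rolls.get(); }

        @Override public long currentSegmentBytes() { return bytes.get(); }

        @Override public void rollSegment() {
            bytes.set(0);            // a new, empty segment starts
            rolls.incrementAndGet();
        }
    }

    /** Checks the open segment at a fixed interval and rolls it past the threshold. */
    static class Roller implements Runnable {
        private final EditLog log;
        private final long thresholdBytes;   // assumed size-based trigger, e.g. 256MB
        private final long checkIntervalMs;  // assumed polling interval

        Roller(EditLog log, long thresholdBytes, long checkIntervalMs) {
            this.log = log;
            this.thresholdBytes = thresholdBytes;
            this.checkIntervalMs = checkIntervalMs;
        }

        /** Performs one check; returns true if a roll was triggered. */
        boolean checkOnce() {
            if (log.currentSegmentBytes() >= thresholdBytes) {
                log.rollSegment();
                return true;
            }
            return false;
        }

        @Override public void run() {
            while (!Thread.currentThread().isInterrupted()) {
                checkOnce();
                try {
                    Thread.sleep(checkIntervalMs);
                } catch (InterruptedException e) {
                    // Restore the interrupt flag and stop instead of swallowing it.
                    Thread.currentThread().interrupt();
                    return;
                }
            }
        }
    }

    public static void main(String[] args) throws InterruptedException {
        FakeEditLog log = new FakeEditLog();
        Thread t = new Thread(new Roller(log, 256L * 1024 * 1024, 10), "roller-sketch");
        t.setDaemon(true);
        t.start();
        log.append(300L * 1024 * 1024);  // push the segment past the threshold
        Thread.sleep(100);               // give the roller time to notice
        t.interrupt();
        t.join();
        System.out.println("rolls=" + log.rollCount());
    }
}
```

Keeping the check in a separate `checkOnce()` method makes the threshold logic testable without starting the thread. A real implementation would also need to coordinate with the namesystem lock and with rolls triggered by the SBN/2NN, which this sketch ignores.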