Date: Thu, 22 Oct 2015 01:23:27 +0000 (UTC)
From: "Uma Maheswara Rao G (JIRA)"
To: hdfs-issues@hadoop.apache.org
Reply-To: hdfs-issues@hadoop.apache.org
Subject: [jira] [Comment Edited] (HDFS-9198) Coalesce IBR processing in the NN

    [ https://issues.apache.org/jira/browse/HDFS-9198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14968317#comment-14968317 ]

Uma Maheswara Rao G edited comment on HDFS-9198 at 10/22/15 1:22 AM:
---------------------------------------------------------------------

Thanks, Daryn, for the nice work here. This is interesting to me. I have just reviewed the patch. Following are my comments:
# runBlockOp: how about naming it runBlockReportOp?
# nit:
{code}
 while (namesystem.isRunning()) {
+  NameNodeMetrics metrics = NameNode.getNameNodeMetrics();
{code}
Maybe we can take metrics outside the loop and use it?
# I think we need to handle Throwable for this BR processing thread? In case of any unexpected errors, this thread should not die silently, as it is one of the important processing threads. We may have to terminate the system in such cases.
minor suggestion: method names in BM could be like runBlockReportOpSync and runBlockReportAsync?
# Code formatting is missing for these lines:
{code}
metrics.setBlockOpsQueued(queue.size()+1);
metrics.addBlockOpsBatched(processed-1);
{code}
# Currently the DN sets the flag to trigger sendImmediateIBR on failure of IBR processing. But now the NN handles the exceptions itself and cannot pass them to the DN, because the processing is async. So sendImmediateIBR now happens only for IPC-level exceptions. Have you thought about it? Missing such info would have to wait until the next BR, right?
# Tests look great to me. A minor suggestion: could you please add javadoc for the tests?


was (Author: umamaheswararao):
Thank Daryn for the Nice work here. This is interesting to me. I have just review the patch. Following are my comments:
# runBlockOp: how about naming it as runBlockReportOp ?
# nit:
{code}
 while (namesystem.isRunning()) {
+  NameNodeMetrics metrics = NameNode.getNameNodeMetrics();
{code}
May be we can take metrics outside loop and use it?
# I think we need to handle throwable for this BR processing thread? incase of any unexpected errors, this thread should not die silently as its one of the important processing thread… ? we may have to terminate the system in such cases.
minor suggestion: method names in BM could be like runBlockReportOpSync and runBlockReportAsync ?
# code format missed for this lines:
{code}
metrics.setBlockOpsQueued(queue.size()+1);
metrics.addBlockOpsBatched(processed-1);
{code}
# Currently DN sets the flag to trigger sendImmediateIBR on failure of IBR processing. But now we handle Exceptions as NN itself and can not pass to DN as due to async. So now we sendImmdeiateIBR happens only for IPC level exceptions. Have you thought about it. Missing such info would have to wait until next BR right?
# Tests looking great to me. minor suggestion is could you please ass javadoc for tests?

> Coalesce IBR processing in the NN
> ---------------------------------
>
>                 Key: HDFS-9198
>                 URL: https://issues.apache.org/jira/browse/HDFS-9198
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: namenode
>    Affects Versions: 2.0.0-alpha
>            Reporter: Daryn Sharp
>            Assignee: Daryn Sharp
>         Attachments: HDFS-9198-branch2.patch, HDFS-9198-trunk.patch, HDFS-9198-trunk.patch, HDFS-9198-trunk.patch
>
>
> IBRs from thousands of DNs under load will degrade NN performance due to excessive write-lock contention from multiple IPC handler threads.  The IBR processing is quick, so the lock contention may be reduced by coalescing multiple IBRs into a single write-lock transaction.  The handlers will also be freed up faster for other operations.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
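
For readers unfamiliar with the technique under review, the following is a minimal Java sketch of coalescing queued operations into a single write-lock acquisition, combined with the Throwable handling suggested in the comments. It is not the HDFS-9198 patch: the class CoalescingOpQueueSketch, its methods, the maxBatch limit, and the two counters are all hypothetical names chosen for illustration, and the counters are plain fields standing in for real metrics.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.function.BooleanSupplier;
import java.util.function.Consumer;

// Hypothetical illustration only; no name here is taken from the HDFS-9198 patch.
class CoalescingOpQueueSketch {
  private final BlockingQueue<Runnable> queue = new LinkedBlockingQueue<>();
  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
  private final int maxBatch;

  // Plain counters standing in for real metrics (assumption, not the NN metrics API).
  int writeLockAcquisitions = 0;
  int opsProcessed = 0;

  CoalescingOpQueueSketch(int maxBatch) {
    if (maxBatch < 1) {
      throw new IllegalArgumentException("maxBatch must be >= 1");
    }
    this.maxBatch = maxBatch;
  }

  // Called by producer threads; returns immediately so the caller is freed up.
  void enqueue(Runnable op) {
    queue.add(op);
  }

  // Blocks for one op, drains whatever else is already waiting (up to maxBatch
  // in total), and runs the whole batch under a single write-lock acquisition.
  // Returns the number of ops in the batch, or 0 if interrupted while waiting.
  int processBatch() {
    List<Runnable> batch = new ArrayList<>();
    try {
      batch.add(queue.take());
    } catch (InterruptedException ie) {
      Thread.currentThread().interrupt(); // preserve interrupt status for the caller
      return 0;
    }
    queue.drainTo(batch, maxBatch - 1);

    lock.writeLock().lock();
    try {
      writeLockAcquisitions++;
      for (Runnable op : batch) {
        op.run();
        opsProcessed++;
      }
    } finally {
      lock.writeLock().unlock();
    }
    return batch.size();
  }

  // Worker loop: any unexpected Throwable is handed to onFatal instead of
  // letting the processing thread die silently.
  void runLoop(BooleanSupplier running, Consumer<Throwable> onFatal) {
    while (running.getAsBoolean() && !Thread.currentThread().isInterrupted()) {
      try {
        processBatch();
      } catch (Throwable t) {
        onFatal.accept(t);
        return;
      }
    }
  }
}
```

In this sketch, five ops enqueued before a processBatch() call with maxBatch of 10 are executed under one lock acquisition rather than five. What onFatal should do (log, terminate the process, or something else) is left to the caller, since the thread above only raises the question.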