Date: Thu, 15 Nov 2012 11:42:13 +0000 (UTC)
From: "Uma Maheswara Rao G (JIRA)"
To: hdfs-issues@hadoop.apache.org
Message-ID: <599398009.118754.1352979733106.JavaMail.jiratomcat@arcas>
In-Reply-To: <1327223501.43856.1351587733066.JavaMail.jiratomcat@arcas>
Subject: [jira] [Updated] (HDFS-4130) BKJM: The reading for editlog at NN starting using bkjm is not efficient


     [ https://issues.apache.org/jira/browse/HDFS-4130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Uma Maheswara Rao G updated HDFS-4130:
--------------------------------------

    Summary: BKJM: The reading for editlog at NN starting using bkjm is not efficient  (was: The reading for editlog at NN starting using bkjm is not efficient)

> BKJM: The reading for editlog at NN starting using bkjm is not efficient
> -------------------------------------------------------------------------
>
>                 Key: HDFS-4130
>                 URL: https://issues.apache.org/jira/browse/HDFS-4130
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: ha, performance
>    Affects Versions: 3.0.0, 2.0.2-alpha
>            Reporter: Han Xiao
>         Attachments: HDFS-4130.patch, HDFS-4130-v2.patch
>
>
> Currently, BookKeeperJournalManager.selectInputStreams is written like this:
>
> while (true) {
>   EditLogInputStream elis;
>   try {
>     elis = getInputStream(fromTxId, inProgressOk);
>   } catch (IOException e) {
>     LOG.error(e);
>     return;
>   }
>   if (elis == null) {
>     return;
>   }
>   streams.add(elis);
>   if (elis.getLastTxId() == HdfsConstants.INVALID_TXID) {
>     return;
>   }
>   fromTxId = elis.getLastTxId() + 1;
> }
>
> Each EditLogInputStream is obtained from getInputStream(), which reads the ledger list from ZooKeeper on every call.
> This becomes very time-consuming when the number of ledgers is large.
> Reading the ledgers from ZooKeeper on every call to getInputStream() is unnecessary.
>
> The following log lines show the time lost here (about four and a half minutes between them):
>
> 2012-10-30 16:44:52,995 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: Caching file names occuring more than 10 times
> 2012-10-30 16:49:24,643 INFO hidden.bkjournal.org.apache.bookkeeper.proto.PerChannelBookieClient: Successfully connected to bookie: /167.52.1.121:318
>
> The stack of the main thread while it is blocked between those two log lines is:
>
> "main" prio=10 tid=0x000000004011f000 nid=0x39ba in Object.wait() [0x00007fca020fe000]
>    java.lang.Thread.State: WAITING (on object monitor)
>         at java.lang.Object.wait(Native Method)
>         at java.lang.Object.wait(Object.java:485)
>         at hidden.bkjournal.org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1253)
>         - locked <0x00000006fb8495a8> (a hidden.bkjournal.org.apache.zookeeper.ClientCnxn$Packet)
>         at hidden.bkjournal.org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1129)
>         at org.apache.hadoop.contrib.bkjournal.utils.RetryableZookeeper.getData(RetryableZookeeper.java:501)
>         at hidden.bkjournal.org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1160)
>         at org.apache.hadoop.contrib.bkjournal.EditLogLedgerMetadata.read(EditLogLedgerMetadata.java:113)
>         at org.apache.hadoop.contrib.bkjournal.BookKeeperJournalManager.getLedgerList(BookKeeperJournalManager.java:725)
>         at org.apache.hadoop.contrib.bkjournal.BookKeeperJournalManager.getInputStream(BookKeeperJournalManager.java:442)
>         at org.apache.hadoop.contrib.bkjournal.BookKeeperJournalManager.selectInputStreams(BookKeeperJournalManager.java:480)
>
> Comparing stack dumps taken at different times, the diff is:
>
> diff stack stack2
> 1c1
> < 2012-10-30 16:44:53
> ---
> > 2012-10-30 16:46:17
> 106c106
> < - locked <0x00000006fb8495a8> (a hidden.bkjournal.org.apache.zookeeper.ClientCnxn$Packet)
> ---
> > - locked <0x00000006fae58468> (a hidden.bkjournal.org.apache.zookeeper.ClientCnxn$Packet)
>
> In our environment, the wait can reach tens of minutes.
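
The cost described in the report (every getInputStream() call re-reading the whole ledger list from ZooKeeper) can be illustrated with a small stand-alone model. This is only a sketch under stated assumptions: LedgerListCachingSketch, Ledger, MetadataStore, readLedgerList, selectRereading, selectReadOnce and storeWith are all hypothetical names, the store merely counts simulated znode reads instead of talking to ZooKeeper, and in-progress ledgers are ignored. It is not the code in the attached patches; it only shows one possible alternative, reading the ledger list once per selectInputStreams call and reusing it.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model of the access pattern; not the real BookKeeperJournalManager. */
class LedgerListCachingSketch {

    /** A finalized ledger covering transactions firstTxId..lastTxId (inclusive). */
    static final class Ledger {
        final long firstTxId;
        final long lastTxId;
        Ledger(long firstTxId, long lastTxId) {
            this.firstTxId = firstTxId;
            this.lastTxId = lastTxId;
        }
    }

    /** Stand-in for the ZooKeeper-backed metadata; counts simulated znode reads. */
    static final class MetadataStore {
        private final List<Ledger> ledgers;
        int znodeReads = 0;
        MetadataStore(List<Ledger> ledgers) { this.ledgers = ledgers; }

        /** Models getLedgerList(): one simulated read per ledger znode. */
        List<Ledger> readLedgerList() {
            znodeReads += ledgers.size();
            return new ArrayList<>(ledgers);
        }
    }

    /** Returns the ledger whose range contains txId, or null if there is none. */
    static Ledger find(List<Ledger> list, long txId) {
        for (Ledger l : list) {
            if (l.firstTxId <= txId && txId <= l.lastTxId) return l;
        }
        return null;
    }

    /** Pattern from the report: the ledger list is re-read for every stream. */
    static List<Ledger> selectRereading(MetadataStore store, long fromTxId) {
        List<Ledger> streams = new ArrayList<>();
        while (true) {
            Ledger l = find(store.readLedgerList(), fromTxId);
            if (l == null) return streams;
            streams.add(l);
            fromTxId = l.lastTxId + 1;
        }
    }

    /** Assumed alternative: read the ledger list once and reuse it in the loop. */
    static List<Ledger> selectReadOnce(MetadataStore store, long fromTxId) {
        List<Ledger> streams = new ArrayList<>();
        List<Ledger> cached = store.readLedgerList();
        while (true) {
            Ledger l = find(cached, fromTxId);
            if (l == null) return streams;
            streams.add(l);
            fromTxId = l.lastTxId + 1;
        }
    }

    /** Builds a store with n contiguous ledgers of 10 transactions each, starting at txid 1. */
    static MetadataStore storeWith(int n) {
        List<Ledger> ledgers = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            ledgers.add(new Ledger(i * 10L + 1, i * 10L + 10));
        }
        return new MetadataStore(ledgers);
    }

    public static void main(String[] args) {
        MetadataStore a = storeWith(100);
        selectRereading(a, 1);
        MetadataStore b = storeWith(100);
        selectReadOnce(b, 1);
        System.out.println("re-reading: " + a.znodeReads + " simulated znode reads");
        System.out.println("read once:  " + b.znodeReads + " simulated znode reads");
    }
}
```

In this model, 100 ledgers cost 10,100 simulated reads with the re-reading loop (101 passes over 100 znodes, the last pass finding nothing) against 100 when the list is read once, so the number of reads grows quadratically with the ledger count in the first case and linearly in the second. A real change would also have to decide how to treat an in-progress ledger and whether a list read once can go stale during the call, which this sketch does not address.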