hadoop-hdfs-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Hadoop QA (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-4130) BKJM: The reading for editlog at NN starting using bkjm is not efficient
Date Thu, 15 Nov 2012 15:20:12 GMT

    [ https://issues.apache.org/jira/browse/HDFS-4130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13498074#comment-13498074
] 

Hadoop QA commented on HDFS-4130:
---------------------------------

{color:green}+1 overall{color}.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12553657/HDFS-4130-v3.patch
  against trunk revision .

    {color:green}+1 @author{color}.  The patch does not contain any @author tags.

    {color:green}+1 tests included{color}.  The patch appears to include 1 new or modified
test files.

    {color:green}+1 javac{color}.  The applied patch does not increase the total number of
javac compiler warnings.

    {color:green}+1 javadoc{color}.  The javadoc tool did not generate any warning messages.

    {color:green}+1 eclipse:eclipse{color}.  The patch built with eclipse:eclipse.

    {color:green}+1 findbugs{color}.  The patch does not introduce any new Findbugs (version
1.3.9) warnings.

    {color:green}+1 release audit{color}.  The applied patch does not increase the total number
of release audit warnings.

    {color:green}+1 core tests{color}.  The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs/src/contrib/bkjournal.

    {color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/3519//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/3519//console

This message is automatically generated.
                
> BKJM: The reading for editlog at NN starting using bkjm  is not efficient
> -------------------------------------------------------------------------
>
>                 Key: HDFS-4130
>                 URL: https://issues.apache.org/jira/browse/HDFS-4130
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: ha, performance
>    Affects Versions: 3.0.0, 2.0.2-alpha
>            Reporter: Han Xiao
>         Attachments: HDFS-4130.patch, HDFS-4130-v2.patch, HDFS-4130-v3.patch
>
>
> Now, the method of BookKeeperJournalManager.selectInputStreams is written like:
>     while (true) {
>       EditLogInputStream elis;
>       try {
>         elis = getInputStream(fromTxId, inProgressOk);
>       } catch (IOException e) {
>         LOG.error(e);
>         return;
>       }
>       if (elis == null) {
>         return;
>       }
>       streams.add(elis);
>       if (elis.getLastTxId() == HdfsConstants.INVALID_TXID) {
>         return;
>       }
>       fromTxId = elis.getLastTxId() + 1;
>     }
>  
> EditLogInputstream is got from getInputStream(), which will read the ledgers from zookeeper
in each calling.
> This will be a larger cost of times when the the number ledgers becomes large.
> The reading of ledgers from zk is not necessary for every calling of getInputStream().
> The log of time wasting here is as follows:
> 2012-10-30 16:44:52,995 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: Caching
file names occuring more than 10 times
> 2012-10-30 16:49:24,643 INFO hidden.bkjournal.org.apache.bookkeeper.proto.PerChannelBookieClient:
Successfully connected to bookie: /167.52.1.121:318
> The stack of the process when blocking between the two lines of log is like:
> "main" prio=10 tid=0x000000004011f000 nid=0x39ba in Object.wait() \[0x00007fca020fe000\]
>    java.lang.Thread.State: WAITING (on object monitor)
>         at java.lang.Object.wait(Native Method)
>         at java.lang.Object.wait(Object.java:485)
>         at hidden.bkjournal.org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1253)
>         \- locked <0x00000006fb8495a8> (a hidden.bkjournal.org.apache.zookeeper.ClientCnxn$Packet)
>         at hidden.bkjournal.org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1129)
>         at org.apache.hadoop.contrib.bkjournal.utils.RetryableZookeeper.getData(RetryableZookeeper.java:501)
>         at hidden.bkjournal.org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1160)
>         at org.apache.hadoop.contrib.bkjournal.EditLogLedgerMetadata.read(EditLogLedgerMetadata.java:113)
>         at org.apache.hadoop.contrib.bkjournal.BookKeeperJournalManager.getLedgerList(BookKeeperJournalManager.java:725)
>         at org.apache.hadoop.contrib.bkjournal.BookKeeperJournalManager.getInputStream(BookKeeperJournalManager.java:442)
>         at org.apache.hadoop.contrib.bkjournal.BookKeeperJournalManager.selectInputStreams(BookKeeperJournalManager.java:480)
>         
> betweent different time, the diff of stack is:
> diff stack stack2
> 1c1
> < 2012-10-30 16:44:53
> ---
> > 2012-10-30 16:46:17
> 106c106
> <       - locked <0x00000006fb8495a8> (a hidden.bkjournal.org.apache.zookeeper.ClientCnxn$Packet)
> ---
> >       - locked <0x00000006fae58468> (a hidden.bkjournal.org.apache.zookeeper.ClientCnxn$Packet)
> In our environment, the waiting time could even reach to tens of minutes.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

Mime
View raw message