hbase-issues mailing list archives

From "ramkrishna.s.vasudevan (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-4138) If zookeeper.znode.parent is not specifed explicitly in Client code then HTable object loops continuously waiting for the root region by using /hbase as the base node.
Date Wed, 24 Aug 2011 11:53:29 GMT

    [ https://issues.apache.org/jira/browse/HBASE-4138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13090147#comment-13090147
] 

ramkrishna.s.vasudevan commented on HBASE-4138:
-----------------------------------------------

@Ted,
I debugged the test failure and have some findings.  Please check and correct me if my
analysis is wrong.
-> In all the failure scenarios, a new connection was formed just before the exception
occurred.  The test cases invoke new HTable(), which flows to
{code}HConnectionManager.getConnection(conf);{code}
-> Now a new connection is retrieved. The new ZooKeeper connection tries to watch the master
and root region server nodes (MasterAddressTracker.start() and RootRegionTracker.start()).
-> In the ZKUtil.watchAndCheckExists() API:
{code}
      Stat s = zkw.getRecoverableZooKeeper().exists(znode, zkw);
      LOG.debug(zkw.prefix("Set watcher on existing znode " + znode));
      return s != null ? true : false;
{code}
We print the log message and then return.  In the failure logs this znode has
the proper value, e.g. /hbase/master. If this call had returned true, the next step
in the start() API would be to get the data:
{code}byte [] data = ZKUtil.getDataAndWatch(watcher, node);{code}
If there had been some data, then the log
{code}
LOG.debug(zkw.prefix("Retrieved " + ((data == null)? 0: data.length) +
{code} 
should be present, but it is not, and there are no exceptions either.
So what most likely happened is that
{code}ZKUtil.watchAndCheckExists(){code} returned false.  This API returns false when
the node does not exist.
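To make the reasoning above concrete, here is a minimal sketch of the exists-with-watch behaviour. It does not use the real ZooKeeper or HBase classes: FakeZk, its methods, and the /custom path are hypothetical stand-ins, and the only assumption carried over from the real client is that an exists check leaves a watch behind and reports false for a missing znode.

```java
import java.util.HashSet;
import java.util.Set;

public class WatchAndCheckExistsSketch {

    /** Hypothetical in-memory stand-in for a ZooKeeper ensemble (not the real client API). */
    static final class FakeZk {
        private final Set<String> znodes = new HashSet<>();
        private final Set<String> watched = new HashSet<>();

        void create(String path) {
            znodes.add(path);
        }

        /** Registers a watch on the path whether or not it exists, then reports existence. */
        boolean exists(String path) {
            watched.add(path);
            return znodes.contains(path);
        }

        boolean isWatched(String path) {
            return watched.contains(path);
        }
    }

    /** Same shape as the snippet quoted above: true only if the znode is present. */
    static boolean watchAndCheckExists(FakeZk zk, String znode) {
        return zk.exists(znode);
    }

    public static void main(String[] args) {
        FakeZk zk = new FakeZk();
        zk.create("/hbase");
        zk.create("/hbase/master");

        // The znode exists under the parent the cluster actually uses.
        System.out.println(watchAndCheckExists(zk, "/hbase/master"));  // prints true

        // A client looking under a different parent finds nothing and gets false,
        // with no exception raised.
        System.out.println(watchAndCheckExists(zk, "/custom/master")); // prints false
    }
}
```

The second call is the interesting one: the caller gets false silently, which matches the absence of both the "Retrieved" log line and any exception in the failure logs.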
What we now need to find out is in what scenario the node /hbase itself gets deleted, and also
what made new HTable() create a new connection.  (Maybe the connection got deleted.)
One more thing we need to add is in HConnectionManager.setupZookeeperTrackers()
{code}    masterAddressTracker.start(){code}
if it is not able to establish the watch, it should throw an error.  Correct me if I am wrong.
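
A rough sketch of the fail-fast idea proposed above, under stated assumptions: the Tracker interface, its boolean-returning start(), and the setupTracker helper are all hypothetical and are not the HBase API (the real tracker classes may report failure differently). The exception type is also only a placeholder.

```java
public class FailFastTrackerSketch {

    /** Hypothetical tracker: start() reports whether the watch was set on an existing znode. */
    interface Tracker {
        boolean start();
    }

    /**
     * Starts the tracker and fails immediately if the watch could not be
     * established, instead of letting the caller loop while waiting for the znode.
     */
    static void setupTracker(String znode, Tracker tracker) {
        if (!tracker.start()) {
            throw new IllegalStateException(
                "Unable to establish watch on " + znode
                + "; check that zookeeper.znode.parent matches the cluster");
        }
    }

    public static void main(String[] args) {
        // Watch established: no error.
        setupTracker("/hbase/master", () -> true);

        // Watch not established: the error surfaces right away.
        try {
            setupTracker("/hbase/master", () -> false);
        } catch (IllegalStateException e) {
            System.out.println("Failed fast: " + e.getMessage());
        }
    }
}
```

With a check like this the client would see an error at connection setup rather than blocking in blockUntilAvailable().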


> If zookeeper.znode.parent is not specifed explicitly in Client code then HTable object
loops continuously waiting for the root region by using /hbase as the base node.
> -----------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-4138
>                 URL: https://issues.apache.org/jira/browse/HBASE-4138
>             Project: HBase
>          Issue Type: Bug
>          Components: client
>    Affects Versions: 0.90.3
>            Reporter: ramkrishna.s.vasudevan
>            Assignee: ramkrishna.s.vasudevan
>             Fix For: 0.92.0
>
>         Attachments: HBASE-4138_trunk_1.patch, HBASE-4138_trunk_2.patch, HBASE-4138_trunk_3.patch
>
>
> Change the zookeeper.znode.parent property (default is /hbase).
> Now do not specify this change in the client code.
> Use the HTable Object.
> The HTable is not able to find the root region and keeps continuously looping.
> Find the stack trace:
> ====================
> Object.wait(long) line: not available [native method]		 
> RootRegionTracker(ZooKeeperNodeTracker).blockUntilAvailable(long) line: 122
> RootRegionTracker.waitRootRegionLocation(long) line: 73		 
> HConnectionManager$HConnectionImplementation.locateRegion(byte[],
> byte[], boolean) line: 578
> HConnectionManager$HConnectionImplementation.locateRegion(byte[],
> byte[]) line: 558
> HConnectionManager$HConnectionImplementation.locateRegionInMeta(byte[],
> byte[], byte[], boolean, Object) line: 687
> HConnectionManager$HConnectionImplementation.locateRegion(byte[],
> byte[], boolean) line: 589
> HConnectionManager$HConnectionImplementation.locateRegion(byte[],
> byte[]) line: 558
> HConnectionManager$HConnectionImplementation.locateRegionInMeta(byte[],
> byte[], byte[], boolean, Object) line: 687
> HConnectionManager$HConnectionImplementation.locateRegion(byte[],
> byte[], boolean) line: 593
> HConnectionManager$HConnectionImplementation.locateRegion(byte[],
> byte[]) line: 558
> HTable.<init>(Configuration, byte[]) line: 171		 
> HTable.<init>(Configuration, String) line: 145		 
> HBaseTest.test() line: 45

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
