hbase-user mailing list archives

From Ted Yu <yuzhih...@gmail.com>
Subject Re: init HTable got stuck in HConnectionManager$HConnectionImplementation.locateRegion.
Date Wed, 19 Dec 2012 03:39:32 GMT
Could it be due to OPERATIONTIMEOUT?
What version of HBase are you using?
Do you let HBase manage the ZooKeeper ensemble?

Cheers

On Tue, Dec 18, 2012 at 7:19 PM, 唐 颖 <ivytang0812@gmail.com> wrote:

> We have a multi-threaded program that puts data into HBase. Each thread creates
> its own HTable instance, because the threads write to different tables.
>
> But today we found that this program was stuck. After taking a stack dump of
> the Java process, we found that one thread is stuck in:
>
> "pool-1-thread-9" prio=10 tid=0x00007fbb14036800 nid=0x4f7a waiting on condition [0x00007fbb5d411000]
>    java.lang.Thread.State: TIMED_WAITING (sleeping)
>         at java.lang.Thread.sleep(Native Method)
>         at java.lang.Thread.sleep(Thread.java:302)
>         at java.util.concurrent.TimeUnit.sleep(TimeUnit.java:328)
>         at org.apache.hadoop.hbase.util.RetryCounter.sleepUntilNextRetry(RetryCounter.java:54)
>         at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.getData(RecoverableZooKeeper.java:277)
>         at org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataInternal(ZKUtil.java:522)
>         at org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataAndWatch(ZKUtil.java:498)
>         at org.apache.hadoop.hbase.zookeeper.ZooKeeperNodeTracker.getData(ZooKeeperNodeTracker.java:156)
>         - locked <0x000000067bc07738> (a org.apache.hadoop.hbase.zookeeper.RootRegionTracker)
>         at org.apache.hadoop.hbase.zookeeper.RootRegionTracker.getRootRegionLocation(RootRegionTracker.java:62)
>         at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:821)
>         at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:801)
>         at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegionInMeta(HConnectionManager.java:933)
>         at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:832)
>         at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:801)
>         at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegionInMeta(HConnectionManager.java:933)
>         at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:836)
>         at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:801)
>         at org.apache.hadoop.hbase.client.HTable.finishSetup(HTable.java:238)
>         at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:178)
>         at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:137)
>         at com.xingcloud.server.task.EventTask.run(EventTask.java:65)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>         at java.lang.Thread.run(Thread.java:662)
>
>
> The other threads are waiting for this lock:
>
> "pool-1-thread-7" prio=10 tid=0x00007fbb14032800 nid=0x4f76 waiting for monitor entry [0x00007fbb5d493000]
>    java.lang.Thread.State: BLOCKED (on object monitor)
>         at org.apache.hadoop.hbase.zookeeper.ZooKeeperNodeTracker.getData(ZooKeeperNodeTracker.java:154)
>         - waiting to lock <0x000000067bc07738> (a org.apache.hadoop.hbase.zookeeper.RootRegionTracker)
>         at org.apache.hadoop.hbase.zookeeper.RootRegionTracker.getRootRegionLocation(RootRegionTracker.java:62)
>         at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:821)
>         at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:801)
>         at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegionInMeta(HConnectionManager.java:933)
>         at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:832)
>         at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:801)
>         at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegionInMeta(HConnectionManager.java:933)
>         at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:836)
>         at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:801)
>         at org.apache.hadoop.hbase.client.HTable.finishSetup(HTable.java:238)
>         at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:178)
>         at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:137)
>         at com.xingcloud.server.task.EventTask.run(EventTask.java:65)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>         at java.lang.Thread.run(Thread.java:662)
>
>
>
> I checked the HBase code at
> org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.getData(RecoverableZooKeeper.java:277)
>
>   public byte[] getData(String path, Watcher watcher, Stat stat)
>   throws KeeperException, InterruptedException {
>     RetryCounter retryCounter = retryCounterFactory.create();
>     while (true) {
>       try {
>         byte[] revData = zk.getData(path, watcher, stat);
>         return this.removeMetaData(revData);
>       } catch (KeeperException e) {
>         switch (e.code()) {
>           case CONNECTIONLOSS:
>           case OPERATIONTIMEOUT:
>             retryOrThrow(retryCounter, e, "getData");
>             break;
>
>           default:
>             throw e;
>         }
>       }
>       retryCounter.sleepUntilNextRetry();
>       retryCounter.useRetry();
>     }
>   }
>
> I guess the KeeperException code is CONNECTIONLOSS, and that this error code
> is what keeps the thread stuck in the retry loop.
>
> Why would the error code be CONNECTIONLOSS?
>
> I restarted the client program, but the situation still happens. Do I have
> to restart HBase to solve this?
>
>
> Thanks!
>
>
>
>
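
The retry loop quoted above can be reproduced in isolation. The following is a minimal sketch, not HBase code: `RetrySketch`, `TransientException`, and `RetryBudget` are hypothetical names standing in for RecoverableZooKeeper, KeeperException (CONNECTIONLOSS / OPERATIONTIMEOUT), and RetryCounter, and the 1 ms sleep is an assumption chosen only to keep the example fast. It illustrates that a loop of this shape keeps sleeping and retrying until the retry budget is spent, and only then rethrows the error.

```java
import java.util.concurrent.Callable;

public class RetrySketch {
    /** Hypothetical stand-in for a transient error such as CONNECTIONLOSS. */
    static class TransientException extends Exception {
        TransientException(String msg) { super(msg); }
    }

    /** Simplified retry budget: a fixed number of retries, fixed sleep between them. */
    static class RetryBudget {
        private int retriesRemaining;
        private final long sleepMillis;

        RetryBudget(int maxRetries, long sleepMillis) {
            this.retriesRemaining = maxRetries;
            this.sleepMillis = sleepMillis;
        }

        boolean shouldRetry() { return retriesRemaining > 0; }

        void sleepThenUseRetry() throws InterruptedException {
            Thread.sleep(sleepMillis);   // the calling thread is TIMED_WAITING here
            retriesRemaining--;
        }
    }

    /** Runs op, retrying on TransientException until the budget is exhausted. */
    static <T> T callWithRetries(Callable<T> op, int maxRetries, long sleepMillis)
            throws Exception {
        RetryBudget budget = new RetryBudget(maxRetries, sleepMillis);
        while (true) {
            try {
                return op.call();
            } catch (TransientException e) {
                if (!budget.shouldRetry()) {
                    throw e;             // budget spent: surface the error
                }
            }
            budget.sleepThenUseRetry();
        }
    }

    /** Counts how many attempts are made when every attempt fails. */
    static int attemptsWhenAlwaysFailing(int maxRetries) throws Exception {
        int[] attempts = {0};
        try {
            callWithRetries(() -> {
                attempts[0]++;
                throw new TransientException("connection loss");
            }, maxRetries, 1L);
        } catch (TransientException expected) {
            // every attempt failed, so the last error is rethrown
        }
        return attempts[0];
    }

    public static void main(String[] args) throws Exception {
        System.out.println(attemptsWhenAlwaysFailing(3)); // prints 4
    }
}
```

With a budget of 3 retries, the always-failing operation is attempted 4 times before the exception surfaces. In the thread dump above, the sleeping thread holds the RootRegionTracker monitor the whole time, which is why the other threads show up as BLOCKED.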
