hbase-dev mailing list archives

From "Nitay Joffe (JIRA)" <j...@apache.org>
Subject [jira] Assigned: (HBASE-1232) zookeeper client wont reconnect if there is a problem
Date Tue, 03 Mar 2009 00:53:56 GMT

     [ https://issues.apache.org/jira/browse/HBASE-1232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nitay Joffe reassigned HBASE-1232:
----------------------------------

    Assignee: Nitay Joffe  (was: Jean-Daniel Cryans)

> zookeeper client wont reconnect if there is a problem
> -----------------------------------------------------
>
>                 Key: HBASE-1232
>                 URL: https://issues.apache.org/jira/browse/HBASE-1232
>             Project: Hadoop HBase
>          Issue Type: Bug
>         Environment: java 1.7, zookeeper 3.0.1
>            Reporter: ryan rawson
>            Assignee: Nitay Joffe
>             Fix For: 0.20.0
>
>
> my regionserver got wedged:
> 2009-03-02 15:43:30,938 WARN org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper: Failed to create /hbase:
> org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired for /hbase
>         at org.apache.zookeeper.KeeperException.create(KeeperException.java:87)
>         at org.apache.zookeeper.KeeperException.create(KeeperException.java:35)
>         at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:482)
>         at org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper.ensureExists(ZooKeeperWrapper.java:219)
>         at org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper.ensureParentExists(ZooKeeperWrapper.java:240)
>         at org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper.checkOutOfSafeMode(ZooKeeperWrapper.java:328)
>         at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRootRegion(HConnectionManager.java:783)
>         at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegion(HConnectionManager.java:468)
>         at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegion(HConnectionManager.java:443)
>         at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegionInMeta(HConnectionManager.java:518)
>         at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegion(HConnectionManager.java:477)
>         at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.relocateRegion(HConnectionManager.java:450)
>         at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.getRegionLocation(HConnectionManager.java:295)
>         at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.getRegionLocationForRowWithRetries(HConnectionManager.java:919)
>         at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.processBatchOfRows(HConnectionManager.java:950)
>         at org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:1370)
>         at org.apache.hadoop.hbase.client.HTable.commit(HTable.java:1314)
>         at org.apache.hadoop.hbase.client.HTable.commit(HTable.java:1294)
>         at org.apache.hadoop.hbase.RegionHistorian.add(RegionHistorian.java:237)
>         at org.apache.hadoop.hbase.RegionHistorian.add(RegionHistorian.java:216)
>         at org.apache.hadoop.hbase.RegionHistorian.addRegionSplit(RegionHistorian.java:174)
>         at org.apache.hadoop.hbase.regionserver.HRegion.splitRegion(HRegion.java:607)
>         at org.apache.hadoop.hbase.regionserver.CompactSplitThread.split(CompactSplitThread.java:174)
>         at org.apache.hadoop.hbase.regionserver.CompactSplitThread.run(CompactSplitThread.java:107)
> This message repeats over and over.
> Looking at the code in question:
>   private boolean ensureExists(final String znode) {
>     try {
>       zooKeeper.create(znode, new byte[0],
>                        Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT);
>       LOG.debug("Created ZNode " + znode);
>       return true;
>     } catch (KeeperException.NodeExistsException e) {
>       return true;      // ok, move on.
>     } catch (KeeperException.NoNodeException e) {
>       return ensureParentExists(znode) && ensureExists(znode);
>     } catch (KeeperException e) {
>       LOG.warn("Failed to create " + znode + ":", e);
>     } catch (InterruptedException e) {
>       LOG.warn("Failed to create " + znode + ":", e);
>     }
>     return false;
>   }
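> We need to catch SessionExpiredException specifically and reopen the ZK connection, instead of letting it fall into the generic KeeperException handler that only logs a warning.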
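
A minimal sketch of the kind of handling the description asks for, not the actual HBase fix. To keep it self-contained it does not use the real ZooKeeper classes: ZkClient, SessionExpiredException, NodeExistsException, the connector supplier and the maxReconnects parameter are all hypothetical stand-ins introduced for illustration. The idea is that a handle whose session has expired cannot be reused, so the wrapper opens a fresh one and retries a bounded number of times.

```java
import java.util.function.Supplier;

// Hypothetical sketch: none of these types are the real ZooKeeper or HBase classes.
class ReconnectSketch {
    /** Stand-in for KeeperException.SessionExpiredException. */
    static class SessionExpiredException extends Exception {}

    /** Stand-in for KeeperException.NodeExistsException. */
    static class NodeExistsException extends Exception {}

    /** Stand-in for the small part of the ZooKeeper handle used here. */
    interface ZkClient {
        void create(String znode) throws SessionExpiredException, NodeExistsException;
    }

    private final Supplier<ZkClient> connector; // assumed factory that opens a new session
    private ZkClient zk;

    ReconnectSketch(Supplier<ZkClient> connector) {
        this.connector = connector;
        this.zk = connector.get();
    }

    /** Tries to create the znode, reopening the connection when the session has expired. */
    boolean ensureExists(String znode, int maxReconnects) {
        for (int attempt = 0; attempt <= maxReconnects; attempt++) {
            try {
                zk.create(znode);
                return true;
            } catch (NodeExistsException e) {
                return true; // already there, nothing to do
            } catch (SessionExpiredException e) {
                // The old handle is unusable once its session expires:
                // replace it with a new one and let the loop retry.
                zk = connector.get();
            }
        }
        return false; // gave up after maxReconnects fresh sessions
    }

    public static void main(String[] args) {
        // A client that always succeeds, just to show the call shape.
        ReconnectSketch wrapper = new ReconnectSketch(() -> znode -> {});
        System.out.println(wrapper.ensureExists("/hbase", 1));
    }
}
```

Bounding the number of reconnects avoids swapping one endless loop (the repeated warning above) for another if the ensemble stays unreachable.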

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

