hbase-issues mailing list archives

From "Jieshan Bean (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-5153) HConnection re-creation in HTable after HConnection abort
Date Fri, 13 Jan 2012 14:21:40 GMT

    [ https://issues.apache.org/jira/browse/HBASE-5153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13185601#comment-13185601 ]

Jieshan Bean commented on HBASE-5153:
-------------------------------------

Thanks, Ted.

Without HBASE-3065, 0.90 doesn't handle ConnectionLossException correctly. Consider the
following case:
1. Something triggers HConnection#abort.
2. Suppose the check "if (t instanceof KeeperException.SessionExpiredException)" is true,
so resetZooKeeperTrackers() is called.
3. A ConnectionLossException occurs during ZooKeeperNodeTracker#start, which triggers a new
HConnection#abort. In this scenario, the previous abort may still print the log
"Reconnected successfully. This disconnect could have been caused by a network partition or
a long-running GC pause......"
4. The new abort carries a Throwable whose type is not KeeperException.SessionExpiredException,
so this time the connection is aborted directly.

So there appears to be a recursion here: the first abort ends up triggering a second one.
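
The sequence above can be modeled with a small, self-contained sketch. It does not use the
real HBase or ZooKeeper classes: SessionExpired, ConnectionLoss, and ToyConnection are
hypothetical stand-ins, and the real HConnection code in 0.90 may differ in detail. It only
illustrates how a failure while restarting the trackers re-enters abort() with a different
Throwable type.

```java
import java.util.ArrayList;
import java.util.List;

public class AbortRecursionSketch {
    // Hypothetical stand-ins for KeeperException.SessionExpiredException
    // and KeeperException.ConnectionLossException.
    static class SessionExpired extends Exception {}
    static class ConnectionLoss extends Exception {}

    // Toy model of a connection; NOT the real HConnection implementation.
    static class ToyConnection {
        final List<String> log = new ArrayList<>();
        final boolean trackerStartFails; // simulates ConnectionLoss during tracker start
        boolean closed = false;

        ToyConnection(boolean trackerStartFails) {
            this.trackerStartFails = trackerStartFails;
        }

        void abort(String why, Throwable t) {
            if (t instanceof SessionExpired) {
                log.add("abort (" + why + "): resetting trackers");
                resetTrackers();                     // may re-enter abort()
                log.add("Reconnected successfully"); // logged even if the nested abort closed us
                return;
            }
            // Any other Throwable type: abort directly.
            log.add("abort (" + why + "): closing connection");
            closed = true;
        }

        void resetTrackers() {
            try {
                startTracker();
            } catch (ConnectionLoss e) {
                abort("tracker start failed", e);    // the nested, second abort
            }
        }

        void startTracker() throws ConnectionLoss {
            if (trackerStartFails) {
                throw new ConnectionLoss();
            }
        }
    }

    public static void main(String[] args) {
        ToyConnection c = new ToyConnection(true);
        c.abort("session expired", new SessionExpired());
        c.log.forEach(System.out::println);
        System.out.println("closed = " + c.closed);
    }
}
```

In this toy model the outer abort still records "Reconnected successfully" after the nested
abort has already closed the connection, which mirrors the misleading log in step 3.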
 
Whether we re-use the old connection via resetZooKeeperTrackers or re-create the connection,
the ZooKeeperWatcher will be a new one. So I still think the patch for 0.90 is reasonable.

The trunk patch will need big changes.

Are there any other good suggestions? Thanks.

                
> HConnection re-creation in HTable after HConnection abort
> ---------------------------------------------------------
>
>                 Key: HBASE-5153
>                 URL: https://issues.apache.org/jira/browse/HBASE-5153
>             Project: HBase
>          Issue Type: Bug
>          Components: client
>    Affects Versions: 0.90.4
>            Reporter: Jieshan Bean
>            Assignee: Jieshan Bean
>             Fix For: 0.90.6
>
>         Attachments: 5153-trunk.txt, HBASE-5153-V2.patch, HBASE-5153-V3.patch, HBASE-5153-V4-90.patch, HBASE-5153-trunk-v2.patch, HBASE-5153-trunk.patch, HBASE-5153.patch
>
>
> HBASE-4893 is related to this issue. From that issue we know that if multiple threads
> share the same connection, once this connection is aborted in one thread, the other
> threads will get a "HConnectionManager$HConnectionImplementation@18fb1f7 closed" exception.
> It solves the problem of "stale connection can't be removed". But the original HTable
> instance can no longer be used. The connection in HTable should be re-created.
> Actually, there are two approaches to solve this:
> 1. In user code, once an IOE is caught, close the connection and re-create the HTable
> instance. We can use this as a workaround.
> 2. On the HBase client side, catch this exception and re-create the connection.
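
As a rough illustration of approach 1 (the user-code workaround), here is a minimal sketch.
It deliberately avoids the real HBase client API: TableLike, TableFactory, and ResilientTable
are hypothetical names, with TableLike standing in for an HTable and TableFactory for
whatever code builds a fresh table (and with it a fresh connection). It only shows the
"catch IOException, close, re-create, retry once" pattern.

```java
import java.io.Closeable;
import java.io.IOException;

public class RecreateOnFailureSketch {
    /** Hypothetical minimal table handle; stands in for an HTable. */
    interface TableLike extends Closeable {
        String get(String row) throws IOException;
    }

    /** Hypothetical factory that builds a fresh handle on a fresh connection. */
    interface TableFactory {
        TableLike create() throws IOException;
    }

    static class ResilientTable {
        private final TableFactory factory;
        private TableLike table;
        int recreations = 0;

        ResilientTable(TableFactory factory) throws IOException {
            this.factory = factory;
            this.table = factory.create();
        }

        String get(String row) throws IOException {
            try {
                return table.get(row);
            } catch (IOException first) {
                // Drop the broken handle, build a new one, and retry exactly once.
                try {
                    table.close();
                } catch (IOException ignored) {
                    // The old handle is being discarded anyway.
                }
                table = factory.create();
                recreations++;
                return table.get(row);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        final int[] created = {0};
        TableFactory factory = () -> {
            // The first handle simulates a table whose connection was aborted.
            final boolean broken = created[0]++ == 0;
            return new TableLike() {
                @Override
                public String get(String row) throws IOException {
                    if (broken) {
                        throw new IOException("connection closed");
                    }
                    return "value-of-" + row;
                }

                @Override
                public void close() {
                }
            };
        };
        ResilientTable t = new ResilientTable(factory);
        System.out.println(t.get("row1"));
        System.out.println("recreations = " + t.recreations);
    }
}
```

Retrying only once keeps a persistent failure visible to the caller instead of looping
forever; approach 2 would move the same logic inside the client library.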


        
