hbase-issues mailing list archives

From "Himanshu Vashishtha (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-6550) Refactoring ReplicationSink to make it more responsive of cluster health
Date Thu, 09 Aug 2012 21:39:19 GMT

    [ https://issues.apache.org/jira/browse/HBASE-6550?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13432185#comment-13432185 ]

Himanshu Vashishtha commented on HBASE-6550:

I see :)

I will be glad to make it simpler. But, it's not that difficult...  :P
It basically adds two things: a bailout mechanism and, to achieve it, the use of a Callable
submitted to a RepSink#threadpool.
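
To illustrate the idea (this is not the code in the patch, only a minimal sketch of the general pattern): the handler submits the batch as a Callable to a pool and waits on the Future for a bounded time, cancelling the task if the wait expires or the batch fails. The class name BailoutSketch, the method applyWithBailout and the timeout values are made up for this example.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class BailoutSketch {

    /**
     * Hypothetical helper: runs the batch on the pool and waits at most
     * timeoutMs for it. Returns true if the batch completed in time,
     * false if we bailed out (timeout, failure, or interruption).
     */
    static boolean applyWithBailout(ExecutorService pool, Callable<Void> batch, long timeoutMs) {
        Future<Void> future = pool.submit(batch);
        try {
            future.get(timeoutMs, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException e) {
            // The caller has probably given up: interrupt the worker and free the handler.
            future.cancel(true);
            return false;
        } catch (ExecutionException e) {
            // The batch itself threw: fail fast instead of retrying here.
            return false;
        } catch (InterruptedException e) {
            // The handler thread was interrupted: cancel the work and keep the flag set.
            future.cancel(true);
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            Callable<Void> fast = () -> null;
            Callable<Void> slow = () -> { Thread.sleep(5_000); return null; };
            System.out.println("fast batch applied: " + applyWithBailout(pool, fast, 1_000));
            System.out.println("slow batch applied: " + applyWithBailout(pool, slow, 100));
        } finally {
            pool.shutdownNow();
        }
    }
}
```

Note that cancel(true) only interrupts the worker thread; the submitted task has to respond to interruption for the cleanup to actually happen.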

I wanted the bailout functionality for the regionserver handler to be part of the patch.
It gives the handler the opportunity to do cleanup etc. in case the client goes away.
Decorating the config solves only half the problem.
Another way is to make similar changes on the master cluster regionserver side (decorating
its config with a lower rpc timeout etc.), but that's not desirable, as it's not intra-cluster
and we want to give a full try before resending the shipment.

bq. Create an unmanaged HConnectionImplementation and an Executor
You mean at class level? If another master cluster regionserver calls the method via
another handler, will it wait then?
Or at method level?

bq. For each batch create new HTable(connection, executor)
apply batch
close created HTable.

Yes, that also happens in the current patch. It closes the connection and the HTable's pool
after the batch op.
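
For reference, the per-batch lifecycle under discussion can be sketched as below. This assumes the class-level reading (one shared executor, a short-lived table handle per batch). FakeTable is a hypothetical stand-in so the example runs on its own; it is not the HBase HTable class and does not reflect its real constructor or methods.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class PerBatchTableSketch {

    /** Hypothetical stand-in for a table handle; records what happens to it. */
    static final class FakeTable implements AutoCloseable {
        private final List<String> log;

        FakeTable(ExecutorService sharedPool, List<String> log) {
            // A real handle would run its work on sharedPool; this stand-in only logs.
            this.log = log;
            log.add("open");
        }

        void batch(List<String> mutations) {
            log.add("batch:" + mutations.size());
        }

        @Override
        public void close() {
            log.add("close");
        }
    }

    /** Opens a handle per batch, applies the batch, and always closes the handle. */
    static List<String> replicate(ExecutorService sharedPool, List<List<String>> batches) {
        List<String> log = new ArrayList<>();
        for (List<String> mutations : batches) {
            try (FakeTable table = new FakeTable(sharedPool, log)) {
                table.batch(mutations);
            }
        }
        return log;
    }

    public static void main(String[] args) {
        ExecutorService sharedPool = Executors.newFixedThreadPool(2);
        try {
            List<List<String>> batches =
                Arrays.asList(Arrays.asList("put-1", "put-2"), Arrays.asList("delete-1"));
            System.out.println(replicate(sharedPool, batches));
        } finally {
            sharedPool.shutdown();
        }
    }
}
```

The try-with-resources block closes the handle even when the batch throws, while the shared executor outlives every batch.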

> Refactoring ReplicationSink to make it more responsive of cluster health
> ------------------------------------------------------------------------
>                 Key: HBASE-6550
>                 URL: https://issues.apache.org/jira/browse/HBASE-6550
>             Project: HBase
>          Issue Type: New Feature
>          Components: replication
>            Reporter: Himanshu Vashishtha
>            Assignee: Himanshu Vashishtha
>         Attachments: HBase-6550-v1.patch
> ReplicationSink replicates the WALEdits in the local cluster. It uses the native HBase client
> to insert the mutations. Sometimes it takes a while to process them (maybe due to region
> splitting, a GC pause, etc.), and it goes through the retry phase.
> This has two repercussions:
> a) The regionserver handler that is serving the request (till now, a priority handler)
> is blocked for this period.
> b) The caller may time out and will retry anyway, but the handler serving
> the ReplicationSink request is still working.
> Refactoring ReplicationSink to have the following features:
> a) Making it more configurable (have its own number of retrial limit, connection timeout,
> b) Adding fail-fast behavior so that it bails out if the caller has timed out or if any
> exception occurs while processing the mutation batch.
