hbase-issues mailing list archives

From "Gabriel Reid (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-10504) Define Replication Interface
Date Wed, 14 May 2014 19:33:16 GMT

    [ https://issues.apache.org/jira/browse/HBASE-10504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13997936#comment-13997936
] 

Gabriel Reid commented on HBASE-10504:
--------------------------------------

[~enis] [~jdcryans] I think that this would make a great addition to HBase in general -- if
I'm reading it correctly, it basically works out as a kind of RegionObserver.postPut that
runs in process, but off of the write path. However, I'm less sure of how well this general
approach would work for hbase-indexer.

Although this approach would simplify a lot of things (e.g. removing the need to act as
a slave cluster), it also introduces a number of important changes in the execution model
of something that runs via the SepConsumer (i.e. reacting to HBase updates):
* the number of replication endpoints becomes directly linked to the number of regionservers
-- as far as I see, it would no longer be possible to have more or fewer replication endpoints
than regionservers
* the replication (ReplicationConsumer) runs within the same JVM as the regionserver, which
brings some important changes: all of the dependencies then need to be present in the classpath
of the regionserver (and compatible with the dependencies of the regionserver), it has an
effect on the heap, and it is no longer possible to tune the JVM settings for the replication
process on its own
* from what I see of the current patch (although I'm sure this could be changed), it
looks like there's only a single ReplicationConsumer, meaning that it wouldn't be possible
to have "real" replication running next to a custom ReplicationSource

The tighter-coupling issues that I outlined above could be worked around by having a lightweight
ReplicationConsumer that just passes data to an external process via RPC, but that's basically
exactly what the replication code in HBase is doing right now, so it would be a shame to re-implement
something like that.
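
To make the workaround above concrete, here is a minimal sketch of such a lightweight forwarding consumer. The ReplicationConsumer interface shown here is a hypothetical stand-in (the actual interface in hbase-10504_wip1.patch may look quite different), and EditTransport, ForwardingConsumer and RecordingTransport are illustrative names, not HBase APIs. Edits are modelled as plain strings purely for demonstration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class ForwardingConsumerDemo {

    // Hypothetical stand-in for the in-process hook; not the real HBase interface.
    interface ReplicationConsumer {
        void consume(List<String> edits);
    }

    // Hypothetical transport to an external process (in practice, an RPC client).
    interface EditTransport {
        void send(List<String> batch);
    }

    // Thin consumer: keeps indexing logic and its dependencies out of the
    // regionserver JVM by only handing non-empty batches to the transport.
    static final class ForwardingConsumer implements ReplicationConsumer {
        private final EditTransport transport;

        ForwardingConsumer(EditTransport transport) {
            this.transport = transport;
        }

        @Override
        public void consume(List<String> edits) {
            if (edits.isEmpty()) {
                return; // nothing to ship
            }
            // Defensive copy so the caller can reuse its list.
            transport.send(new ArrayList<>(edits));
        }
    }

    // In-memory transport used only for this demonstration.
    static final class RecordingTransport implements EditTransport {
        final List<List<String>> sent = new ArrayList<>();

        @Override
        public void send(List<String> batch) {
            sent.add(batch);
        }
    }

    public static void main(String[] args) {
        RecordingTransport transport = new RecordingTransport();
        ReplicationConsumer consumer = new ForwardingConsumer(transport);

        consumer.consume(Arrays.asList("row1/cf:q=v1", "row2/cf:q=v2"));
        consumer.consume(Collections.<String>emptyList());

        System.out.println("batches sent: " + transport.sent.size());
    }
}
```

Even in this toy form, the consumer ends up being little more than a batching RPC client, which is the role the existing replication code already plays.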

> Define Replication Interface
> ----------------------------
>
>                 Key: HBASE-10504
>                 URL: https://issues.apache.org/jira/browse/HBASE-10504
>             Project: HBase
>          Issue Type: Task
>            Reporter: stack
>            Assignee: stack
>            Priority: Blocker
>             Fix For: 0.99.0
>
>         Attachments: hbase-10504_wip1.patch
>
>
> HBase has replication.  Fellas have been hijacking the replication apis to do all kinds
> of perverse stuff like indexing hbase content (hbase-indexer https://github.com/NGDATA/hbase-indexer)
> and our [~toffer] just showed up w/ overrides that replicate via an alternate channel (over
> a secure thrift channel between dcs over on HBASE-9360).  This issue is about surfacing these
> APIs as public with guarantees to downstreamers similar to those we have on our public client-facing
> APIs (and so we don't break them for downstreamers).
> Any input [~phunt] or [~gabriel.reid] or [~toffer]?
> Thanks.
>  



--
This message was sent by Atlassian JIRA
(v6.2#6252)
