hbase-issues mailing list archives

From "Jean-Daniel Cryans (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-10504) Define Replication Interface
Date Mon, 07 Apr 2014 22:48:18 GMT

    [ https://issues.apache.org/jira/browse/HBASE-10504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13962351#comment-13962351
] 

Jean-Daniel Cryans commented on HBASE-10504:
--------------------------------------------

It does seem that people want to be able to listen to row operations in HBase. I'm not sure
how we can fully support this use case if bulk loads and edits that aren't hitting the WALs
are mixed in. The current contract is that we replicate everything that's sent to the WAL
and that got sync'd.

Regarding the actual interfaces, here's how I see it (I'm sure I'm missing a few things):

h5. Replication source

h6. Filtering WALEdits
We need to formalize what {{ReplicationSource#removeNonReplicableEdits}} is currently doing;
maybe it could be done as a chain of delegates à la {{FileCleanerDelegate}}.
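
To make the chain idea concrete, here is a rough sketch of what it could look like. Everything in it is hypothetical: {{Edit}}, {{EditFilter}} and {{EditFilterChain}} are stand-ins invented for illustration, not existing HBase classes, and the two example filters are only guesses at the kind of checks that might be plugged in.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for one WAL entry; NOT an HBase class.
final class Edit {
    final String table;
    final String family;
    final boolean replicationScoped;

    Edit(String table, String family, boolean replicationScoped) {
        this.table = table;
        this.family = family;
        this.replicationScoped = replicationScoped;
    }
}

// Hypothetical delegate interface: each delegate can veto an edit.
interface EditFilter {
    boolean isReplicable(Edit edit);
}

// Runs every delegate in order; an edit is kept only if all of them accept it.
final class EditFilterChain {
    private final List<EditFilter> delegates;

    EditFilterChain(List<EditFilter> delegates) {
        this.delegates = new ArrayList<>(delegates);
    }

    List<Edit> filter(List<Edit> edits) {
        List<Edit> kept = new ArrayList<>();
        for (Edit edit : edits) {
            boolean keep = true;
            for (EditFilter delegate : delegates) {
                if (!delegate.isReplicable(edit)) {
                    keep = false; // first veto wins, skip remaining delegates
                    break;
                }
            }
            if (keep) {
                kept.add(edit);
            }
        }
        return kept;
    }
}
```

With something like this, the checks that are hard-coded today would each become one delegate, and downstream users could add their own to the chain via configuration, the same way cleaner delegates are plugged in.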

h6. Replication management
Currently, enabling/disabling/adding/removing peers is all done via ZK, which we're trying
not to use as a permanent data store. [HBASE-10295|https://issues.apache.org/jira/browse/HBASE-10295] goes into more detail.
If the master is going to be in charge of it, then it means we need to define a new protobuf
service that the RS will implement. It should be separate from AdminProtos. 

h5. Replication sink

h6. Re-creating a region server
Replication is currently done via our RPC mechanism, so you need to start {{RpcServer}} in
order to receive requests. The next part of the contract that replication relies on is
that the sinks are discoverable via ZooKeeper, basically piggybacking on the RS discovery
process. This means setting up a {{ZooKeeperWatcher}}, crafting a server name and then creating
the znode. A good example of this can be found in SEP: https://github.com/NGDATA/hbase-indexer/blob/master/hbase-sep/hbase-sep-impl-0.95/src/main/java/com/ngdata/sep/impl/SepConsumer.java

It may not seem like a whole lot of code, but it's code that can easily be broken with a few signature
changes since those interfaces aren't clearly marked.
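
For illustration only, the "crafting a server name and then creating the znode" step boils down to something like the sketch below. It uses plain strings rather than the real {{ServerName}} and {{ZooKeeperWatcher}} classes, so {{SinkRegistration}} and its methods are made up. It assumes the usual {{host,port,startcode}} server name layout and the default {{rs}} child znode; the parent znode is whatever the peer's cluster key points at.

```java
// Hypothetical helper; NOT part of HBase. It only shows the naming the sink
// has to reproduce so that a replication source can discover it.
final class SinkRegistration {

    // Assumed layout of a region server name: host,port,startcode.
    static String serverName(String host, int port, long startCode) {
        return host + "," + port + "," + startCode;
    }

    // Assumed znode layout: <znodeParent>/rs/<serverName>, where "rs" is the
    // default child znode under which region servers register themselves.
    static String rsZNode(String znodeParent, String serverName) {
        String parent = znodeParent.endsWith("/")
                ? znodeParent.substring(0, znodeParent.length() - 1)
                : znodeParent;
        return parent + "/rs/" + serverName;
    }
}
```

A real sink would then create that path as an ephemeral znode so it disappears when the process dies. The point is that both the string layout and the znode layout are implicit contracts today.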

h6. ReplicateWALEntry
{{ReplicateWALEntry}} is a service offered as part of AdminProtos so it needs to move out.
It should be a separate service from the previous one I described in "Replication management".
The unfortunate thing here is that {{Replay}} relies on the same messages: https://github.com/apache/hbase/blob/trunk/hbase-protocol/src/main/protobuf/Admin.proto#L260.
To extract {{ReplicateWALEntry}} in a compatible way we'll have to deprecate it and maybe
also deprecate {{Replay}}'s current signature to give it its own appropriately-named messages
(or not, not a big deal).

> Define Replication Interface
> ----------------------------
>
>                 Key: HBASE-10504
>                 URL: https://issues.apache.org/jira/browse/HBASE-10504
>             Project: HBase
>          Issue Type: Task
>            Reporter: stack
>            Assignee: stack
>            Priority: Blocker
>             Fix For: 0.99.0
>
>
> HBase has replication.  Fellas have been hijacking the replication apis to do all kinds
of perverse stuff like indexing hbase content (hbase-indexer https://github.com/NGDATA/hbase-indexer)
and our [~toffer] just showed up w/ overrides that replicate via an alternate channel (over
a secure thrift channel between dcs over on HBASE-9360).  This issue is about surfacing these
APIs as public with guarantees to downstreamers similar to those we have on our public client-facing
APIs (and so we don't break them for downstreamers).
> Any input [~phunt] or [~gabriel.reid] or [~toffer]?
> Thanks.
>  



--
This message was sent by Atlassian JIRA
(v6.2#6252)
