Date: Tue, 19 Sep 2017 06:12:00 +0000 (UTC)
From: "stack (JIRA)"
To: issues@hbase.apache.org
Subject: [jira] [Updated] (HBASE-18846) Accommodate the hbase-indexer/lily/SEP consumer deploy-type

     [ https://issues.apache.org/jira/browse/HBASE-18846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-18846:
--------------------------
    Description:

This is a follow-on from HBASE-10504, Define a Replication Interface. There we defined a new, flexible replication endpoint for others to implement, but it did little to help the case of the lily hbase-indexer. This issue takes up the case of the hbase-indexer.

The hbase-indexer poses to hbase as a 'fake' peer cluster. (For why the hbase-indexer is implemented this way -- the advantage of having the indexing done in a separate process set that can be independently scaled, can participate in the same security realm, etc. -- see the discussion in HBASE-10504.)

The hbase-indexer starts up cut-down "RegionServer" processes that are just instances of the hbase RpcServer hosting an AdminProtos Service. They make themselves 'appear' to the Replication Source by hoisting up an ephemeral znode, 'registering' as a RegionServer.
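The 'fake peer' arrangement just described (a process registers itself where the source cluster looks for RegionServers, then accepts the streamed WAL edits) can be modeled in-process with a short sketch. This is a hedged illustration only: {{WALEntrySink}}, {{SinkRegistry}}, {{FakeIndexerRegionServer}}, and the string-valued 'edits' are hypothetical stand-ins; the real path goes through a ZooKeeper ephemeral znode and the protobuf-generated AdminProtos service, not an in-process map.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the AdminProtos replicateWALEntry RPC surface.
interface WALEntrySink {
    void replicateWALEntry(List<String> walEdits);
}

// Stand-in for the ephemeral-znode registry a replication source scans for
// sinks; in a real cluster this lives in ZooKeeper, not an in-process map.
class SinkRegistry {
    private final Map<String, WALEntrySink> sinks = new LinkedHashMap<>();
    void register(String serverName, WALEntrySink sink) { sinks.put(serverName, sink); }
    List<WALEntrySink> sinks() { return new ArrayList<>(sinks.values()); }
}

// A cut-down "RegionServer" as the hbase-indexer runs one: it implements only
// the replication-sink method and indexes edits instead of persisting them.
class FakeIndexerRegionServer implements WALEntrySink {
    final List<String> indexed = new ArrayList<>();
    @Override
    public void replicateWALEntry(List<String> walEdits) {
        indexed.addAll(walEdits); // a real indexer would feed these to Solr/Lucene
    }
}

public class FakePeerDemo {
    // The source cluster's shipping loop: send each batch of WAL edits to every
    // registered sink, exactly as if they were peer-cluster RegionServers.
    static void shipEdits(SinkRegistry registry, List<String> batch) {
        for (WALEntrySink sink : registry.sinks()) {
            sink.replicateWALEntry(batch);
        }
    }

    public static void main(String[] args) {
        SinkRegistry registry = new SinkRegistry();
        FakeIndexerRegionServer indexer = new FakeIndexerRegionServer();
        // 'Hoisting up the znode': register under a RegionServer-style name.
        registry.register("indexer-host,16020,1505801520000", indexer);
        shipEdits(registry, List.of("put:row1", "delete:row2"));
        if (indexer.indexed.size() != 2) throw new AssertionError(indexer.indexed);
        System.out.println("indexer received " + indexer.indexed);
    }
}
```

In the sketch, the register step plays the role of creating the ephemeral znode that lets replication sources discover the sink, and {{replicateWALEntry}} plays the role of the AdminProtos RPC quoted in the description.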
The source cluster then streams WALEdits to the AdminProtos method:

{code}
public ReplicateWALEntryResponse replicateWALEntry(final RpcController controller,
    final ReplicateWALEntryRequest request) throws ServiceException {
{code}

The hbase-indexer relies on other hbase internals, such as Server, so that it can get a ZooKeeperWatcher instance and know the 'name' to use for this cut-down server.

Thoughts on how to proceed include:

* Better formalize its current digestion of hbase internals; make it so rpcserver is allowed to be used by others, etc. This would be hard to do given the hbase-indexer uses basics like Server, the Protobuf serdes for WAL types, and the AdminProtos Service. Any change in this wide API breaks (again) the hbase-indexer. We have made a 'channel' for Coprocessor Endpoints so they continue to work even though they use 'internal' types; they can use the protos in hbase-protocol. The hbase-protocol protos are currently in a limbo where they are sort-of 'public'; a TODO. Perhaps the hbase-indexer could do similar, relying on the hbase-protocol (pb2.5) content, and we could do something to expose rpcserver and zk for safe hbase-indexer use.
* Start an actual RegionServer but have it register the AdminProtos Service only -- not ClientProtos, the Service that does Master interaction, etc. [I checked; this is not as easy to do as I at first thought -- St.Ack] Then have the hbase-indexer implement an AdminCoprocessor to override the replicateWALEntry method (the Admin CP implementation may need work). This would narrow the hbase-indexer's exposure to that of the Admin Coprocessor Interface.
* Over in HBASE-10504, [~enis] suggested: "... if we want to provide isolation for the replication services in hbase, we can have a simple host as another daemon which hosts the ReplicationEndpoint implementation. RS's will use a built-in RE to send the edits to this layer, and the host will delegate it to the RE implementation. The flow would be something like: RS --> RE inside RS --> Host daemon for RE --> Actual RE implementation --> third party system..."

Other crazy notions occur, including the setup of an Admin Interface Coprocessor Endpoint: a new ReplicationEndpoint would feed the replication stream to the remote cluster via the CPEP-registered channel.

But time is short. Hopefully we can figure out something that will work in the 2.0 timeframe without too much code movement.


> Accommodate the hbase-indexer/lily/SEP consumer deploy-type
> -----------------------------------------------------------
>
>                 Key: HBASE-18846
>                 URL: https://issues.apache.org/jira/browse/HBASE-18846
>             Project: HBase
>          Issue Type: Bug
>            Reporter: stack
>

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)