Date: Tue, 14 Jun 2016 22:30:30 +0000 (UTC)
From: "Joseph (JIRA)"
To: issues@hbase.apache.org
Subject: [jira] [Commented] (HBASE-15974) Create a ReplicationQueuesClientHBaseImpl

[ https://issues.apache.org/jira/browse/HBASE-15974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15330774#comment-15330774 ]

Joseph commented on HBASE-15974:
--------------------------------

Hello Vincent,

Yes, I think that this will add an extra dependency on the RSs hosting the Replication Table.
During ReplicationSourceManager.init(), when we try to claim orphaned queues, ReplicationQueuesHBaseImpl will run a Scanner over the entire Replication Table to locate them. If an RS hosting a region of the Replication Table happens to be down during this time, the Scanner operation will fail.

We try to combat this situation by setting an extremely long retry period for Replication Table operations (HBASE-15937). As of now, operations on the Replication Table have a retry time of 2 hours. If the region is still unavailable after this time, the RS making the request will abort. We are hoping that cluster startup does not take longer than 2 hours. The patch at HBASE-14190 should also help make sure that the Replication Table's initialization is prioritized during cluster startup.

> Create a ReplicationQueuesClientHBaseImpl
> -----------------------------------------
>
> Key: HBASE-15974
> URL: https://issues.apache.org/jira/browse/HBASE-15974
> Project: HBase
> Issue Type: Sub-task
> Components: Replication
> Reporter: Joseph
> Assignee: Joseph
> Attachments: HBASE-15974.patch
>
>
> Currently ReplicationQueuesClient utilizes a ZooKeeper implementation, ReplicationQueuesClientZkImpl, that attempts to read from the ZNode where ReplicationQueuesZkImpl tracked WALs. So we need to create an HBase implementation for ReplicationQueuesClient.
> The review is posted at https://reviews.apache.org/r/48521/
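
For readers following the retry discussion in the comment, here is a rough sketch of the "keep retrying until a deadline, then give up" behaviour it describes. It is not the code from the patch or from HBASE-15937: the class DeadlineRetry, the method callWithDeadline, and the exception RetriesExhaustedException are hypothetical names made up for illustration, and the operation being retried is an arbitrary Callable standing in for the Replication Table scan.

```java
import java.time.Duration;
import java.util.concurrent.Callable;

// Hypothetical helper for illustration only; this is not an HBase class.
final class DeadlineRetry {

    /** Signals that the operation kept failing until the deadline passed. */
    static final class RetriesExhaustedException extends Exception {
        RetriesExhaustedException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    /**
     * Calls the operation repeatedly until it succeeds or maxRetryTime has
     * elapsed, sleeping for (at most) pause between attempts.
     */
    static <T> T callWithDeadline(Callable<T> operation, Duration maxRetryTime, Duration pause)
            throws RetriesExhaustedException, InterruptedException {
        long deadlineNanos = System.nanoTime() + maxRetryTime.toNanos();
        int attempts = 0;
        while (true) {
            attempts++;
            try {
                return operation.call();
            } catch (InterruptedException e) {
                // Do not treat interruption as a retryable failure.
                throw e;
            } catch (Exception e) {
                long remainingNanos = deadlineNanos - System.nanoTime();
                if (remainingNanos <= 0) {
                    // Deadline passed: the caller decides what to do (e.g. abort).
                    throw new RetriesExhaustedException(
                            "gave up after " + attempts + " attempts", e);
                }
                // Never sleep past the deadline, and sleep at least 1 ms.
                long remainingMillis = Math.max(1L, remainingNanos / 1_000_000L);
                Thread.sleep(Math.min(pause.toMillis(), remainingMillis));
            }
        }
    }
}
```

Under these assumptions, a caller would pass Duration.ofHours(2) as maxRetryTime to mirror the 2 hour retry time mentioned above, and would treat RetriesExhaustedException as the point at which the requesting RS aborts.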