Mailing-List: contact issues-help@hbase.apache.org; run by ezmlm
Precedence: bulk
Date: Mon, 15 May 2017 19:22:04 +0000 (UTC)
From: "huaxiang sun (JIRA)" <jira@apache.org>
To: issues@hbase.apache.org
Message-ID: <JIRA.13069601.1494023258000.214170.1494876124359@Atlassian.JIRA>
In-Reply-To: <JIRA.13069601.1494023258000@Atlassian.JIRA>
References: <JIRA.13069601.1494023258000@Atlassian.JIRA> <JIRA.13069601.1494023258661@jira-lw-us.apache.org>
Subject: [jira] [Updated] (HBASE-18005) read replica: handle the case that
 region server hosting both primary replica and meta region is down
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
archived-at: Mon, 15 May 2017 19:22:09 -0000


     [ https://issues.apache.org/jira/browse/HBASE-18005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

huaxiang sun updated HBASE-18005:
---------------------------------
    Attachment: HBASE-18005-master-002.patch

The v2 patch fixes an error in the unit test case. The new unit test tried to use meta replicas and ran into HBASE-18035, so meta replica usage is temporarily disabled.

> read replica: handle the case that region server hosting both primary replica and meta region is down
> -----------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-18005
>                 URL: https://issues.apache.org/jira/browse/HBASE-18005
>             Project: HBase
>          Issue Type: Bug
>            Reporter: huaxiang sun
>            Assignee: huaxiang sun
>         Attachments: HBASE-18005-master-001.patch, HBASE-18005-master-002.patch
>
>
> Testing identified a corner case: when the region server hosting both the primary replica and the meta region goes down, the client tries to reload the primary replica's location from the meta table. It is supposed to clear only the cached location for that specific replicaId, but it clears the cached locations for all replicas. Please see
> https://github.com/apache/hbase/blob/master/hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java#L813
> Since it takes some time for regions (including the meta region) to be reassigned, the following may throw an exception:
> https://github.com/apache/hbase/blob/master/hbase-client/src/main/java/org/apache/hadoop/hbase/client/RpcRetryingCallerWithReadReplicas.java#L173
> This exception needs to be caught, and the client then needs to fall back to the cached locations (in this case, the primary replica's location is not available). If there are cached locations for other replicas, the client can still go ahead and read stale values from the secondary replicas.
> With meta replicas, it still helps not to clear the caches for all replicas, as the info from the primary meta replica is up to date.
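
To illustrate the behavior described in the issue above, here is a minimal, self-contained sketch. It is not the HBase client code or the attached patch: the class ReplicaLocationCacheSketch, its methods, and the use of plain strings as server locations are all hypothetical names chosen for illustration. It assumes replicaId 0 is the primary and models the meta lookup as a supplier that may throw while the meta region is being reassigned.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical stand-in for a per-region location cache; not an HBase class.
final class ReplicaLocationCacheSketch {
    // Index = replicaId (0 assumed to be the primary); null = nothing cached.
    private final String[] locations;

    ReplicaLocationCacheSketch(String... initialLocations) {
        this.locations = initialLocations.clone();
    }

    // The behavior reported as the bug: every replica's location is dropped.
    void clearAll() {
        Arrays.fill(locations, null);
    }

    // The behavior the issue asks for: drop only the given replica's location.
    void clear(int replicaId) {
        locations[replicaId] = null;
    }

    // Replica ids that still have a cached location, in ascending order.
    List<Integer> usableReplicas() {
        List<Integer> ids = new ArrayList<>();
        for (int i = 0; i < locations.length; i++) {
            if (locations[i] != null) {
                ids.add(i);
            }
        }
        return ids;
    }

    // Reload one replica's location. If the lookup throws (for example because
    // the meta region has not been reassigned yet), catch the exception and
    // fall back to whatever is still cached for the other replicas.
    List<Integer> reloadOrFallBack(int replicaId, Supplier<String> metaLookup) {
        clear(replicaId);
        try {
            locations[replicaId] = metaLookup.get();
        } catch (RuntimeException e) {
            // Lookup failed: the entry for replicaId stays empty.
        }
        return usableReplicas();
    }
}
```

With three cached replicas and a failing lookup for replica 0, reloadOrFallBack still reports replicas 1 and 2 as usable, so a stale read from a secondary remains possible. After clearAll, nothing is usable until the lookup succeeds again.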

