hbase-issues mailing list archives

From "Lei Chen (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-18005) read replica: handle the case that region server hosting both primary replica and meta region is down
Date Thu, 11 May 2017 02:24:04 GMT

    [ https://issues.apache.org/jira/browse/HBASE-18005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16005805#comment-16005805 ]

Lei Chen commented on HBASE-18005:
----------------------------------

Hi Huaxiang,
I'm running into this issue as well. Good work!

May I ask which version of HBase you are using?
I ask because I'm running HBase 1.1.2, which doesn't include the fix
for a known meta table replication issue (https://issues.apache.org/jira/browse/HBASE-17238).
With HBASE-17238 in place, setting hbase.meta.replica.count to a number greater than 1 should
handle the case where the primary regions of a normal table and the meta table
are both down.

I'm curious whether you are in the same situation as me, unable to set hbase.meta.replica.count
to, say, 3? (https://hbase.apache.org/book.html#_server_side_properties)
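
For reference, the property in question is a server-side setting in hbase-site.xml. A minimal fragment, assuming the value 3 mentioned above (the value is only illustrative, and per this comment a release without the HBASE-17238 fix may not handle it correctly):

```xml
<!-- hbase-site.xml (server side): number of meta region replicas.
     The value 3 is an illustrative assumption, not a recommendation. -->
<property>
  <name>hbase.meta.replica.count</name>
  <value>3</value>
</property>
```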


> read replica: handle the case that region server hosting both primary replica and meta region is down
> -----------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-18005
>                 URL: https://issues.apache.org/jira/browse/HBASE-18005
>             Project: HBase
>          Issue Type: Bug
>            Reporter: huaxiang sun
>            Assignee: huaxiang sun
>         Attachments: HBASE-18005-master-001.patch
>
>
> Identified a corner case in testing: when the region server hosting both the primary
replica and the meta region is down, the client tries to reload the primary replica location
from the meta table. It is supposed to clean up only the cached location for the specific replicaId,
but it clears the caches for all replicas. Please see
> https://github.com/apache/hbase/blob/master/hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java#L813
> Since it takes some time for regions (including the meta region) to be reassigned, the following
may throw an exception:
> https://github.com/apache/hbase/blob/master/hbase-client/src/main/java/org/apache/hadoop/hbase/client/RpcRetryingCallerWithReadReplicas.java#L173
> This exception needs to be caught, and the client needs to fall back to the cached locations (in this case,
the primary replica's location is not available). If there are cached locations for other
replicas, it can still go ahead and get stale values from secondary replicas.
> With meta replicas, it still helps not to clean up the caches for all replicas, as the
info from the primary meta replica is up-to-date.
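
The behavior described in the issue can be illustrated with a small, self-contained model. This is only a sketch under stated assumptions: `ReplicaCacheSketch`, `MetaLookup`, and all method names below are hypothetical and are not the HBase client classes linked above. It contrasts clearing the cache for every replica with clearing it for one replicaId, and shows a lookup that falls back to the cached location when the meta lookup throws.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical model of a client-side region location cache; not HBase code. */
class ReplicaCacheSketch {
    // region name -> (replicaId -> server name)
    private final Map<String, Map<Integer, String>> locations = new HashMap<>();

    /** Stand-in for a lookup against the meta table; may fail while meta is being reassigned. */
    interface MetaLookup {
        String locate(String region, int replicaId) throws Exception;
    }

    void put(String region, int replicaId, String server) {
        locations.computeIfAbsent(region, r -> new HashMap<>()).put(replicaId, server);
    }

    Optional<String> get(String region, int replicaId) {
        Map<Integer, String> replicas = locations.get(region);
        return replicas == null ? Optional.empty() : Optional.ofNullable(replicas.get(replicaId));
    }

    /** The reported behavior: the cached locations of all replicas are dropped. */
    void clearAll(String region) {
        locations.remove(region);
    }

    /** The behavior the issue asks for: only the given replicaId is dropped. */
    void clearReplica(String region, int replicaId) {
        Map<Integer, String> replicas = locations.get(region);
        if (replicas != null) {
            replicas.remove(replicaId);
        }
    }

    /** Try the meta lookup first; if it throws, fall back to whatever is cached. */
    Optional<String> locate(String region, int replicaId, MetaLookup meta) {
        try {
            String server = meta.locate(region, replicaId);
            if (server != null) {
                put(region, replicaId, server);
                return Optional.of(server);
            }
        } catch (Exception e) {
            // Meta is unavailable; the cached location (if any) is the best we have.
        }
        return get(region, replicaId);
    }

    public static void main(String[] args) {
        ReplicaCacheSketch cache = new ReplicaCacheSketch();
        cache.put("t1,region", 0, "rs1"); // primary replica, on the server that went down
        cache.put("t1,region", 1, "rs2"); // secondary replicas
        cache.put("t1,region", 2, "rs3");

        // Meta was hosted on the same failed server, so lookups fail for now.
        MetaLookup metaDown = (region, id) -> { throw new Exception("meta unavailable"); };

        cache.clearReplica("t1,region", 0);
        System.out.println("primary:   " + cache.locate("t1,region", 0, metaDown));
        System.out.println("secondary: " + cache.locate("t1,region", 1, metaDown));
    }
}
```

In this model, after `clearReplica` the primary lookup comes back empty while the secondary replicas remain reachable for stale reads; calling `clearAll` instead would leave nothing to fall back to until the meta region is reassigned.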



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
