hbase-issues mailing list archives

From "Enis Soztutar (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HBASE-10701) Cache invalidation improvements from client side
Date Tue, 11 Mar 2014 19:03:49 GMT

     [ https://issues.apache.org/jira/browse/HBASE-10701?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Enis Soztutar updated HBASE-10701:

    Attachment: hbase-10701_v2.patch

Attaching a second patch, which fixes three interrelated issues. Fortunately, with this
patch, the test in HBASE-10572 is able to run on an 8-node cluster for 100 minutes with CM (Chaos Monkey).

The changes include: 
 # Individual RPCs for replicas can receive exceptions (RegionMovedException, etc.) as well as
connection exceptions. Cache invalidation now clears only the cache entry for that replica's
location instead of the whole cached meta row.
 # When a server is killed, its locations are removed from the cache. After some time,
only the primary region info is left in the cache, and unless we go and look at meta
again, we won't know about the region replicas, so no secondary RPCs will be made unless
the primary RPC times out. I fixed it so that individual locations in RegionLocations are not
set to null; instead, the individual HRL.serverName fields are set to null. This lets the RPC layer
know about the replicas, but the locations might still be null, which will trigger a meta
lookup. There are still some failures in the AP code path that I am investigating.
 # RpcRetryingCallerWithReadReplicas used to schedule the RPCs to the primary and secondaries
and wait for the first result, regardless of whether it was an exception or a success. With
a closed connection, one of the RPCs immediately returns with a DoNotRetryEx and
fails the whole get() operation, although we should be able to read from the other replicas
perfectly fine. I changed the code path so that it waits for the first successful operation,
a cancellation or interrupt, or for all operations to fail with DoNotRetryEx or RetriesExhaustedEx.
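
The waiting rule in the last item can be sketched with plain java.util.concurrent. This is only an illustration under stated assumptions, not the code in the patch: FirstSuccess and firstSuccessful are hypothetical names, each replica RPC is modeled as a Callable, and any exception counts as a terminal failure (the real code distinguishes DoNotRetryEx and RetriesExhaustedEx).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical helper, not an HBase class.
final class FirstSuccess {
    /** Returns the first successful result; throws only after every call has failed. */
    static <T> T firstSuccessful(List<Callable<T>> replicaCalls, ExecutorService pool)
            throws InterruptedException, ExecutionException {
        CompletionService<T> completed = new ExecutorCompletionService<>(pool);
        List<Future<T>> futures = new ArrayList<>();
        for (Callable<T> call : replicaCalls) {
            futures.add(completed.submit(call));
        }
        ExecutionException lastFailure = null;
        try {
            for (int i = 0; i < replicaCalls.size(); i++) {
                // Blocks until some call finishes; an interrupt propagates to the caller.
                Future<T> done = completed.take();
                try {
                    return done.get(); // first success wins
                } catch (ExecutionException e) {
                    lastFailure = e;   // remember the failure, keep waiting for the others
                }
            }
        } finally {
            // Cancel whatever is still running; a no-op for finished calls.
            for (Future<T> f : futures) {
                f.cancel(true);
            }
        }
        if (lastFailure == null) {
            throw new IllegalArgumentException("no replica calls given");
        }
        throw lastFailure;
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        List<Callable<String>> calls = new ArrayList<>();
        // The "primary" fails immediately, as with a closed connection.
        calls.add(() -> { throw new java.io.IOException("connection closed"); });
        // The "secondary" answers a little later.
        calls.add(() -> { Thread.sleep(50); return "row-from-secondary"; });
        System.out.println(firstSuccessful(calls, pool));
        pool.shutdown();
    }
}
```

In this sketch the early failure of the first call no longer fails the whole read; the caller only sees an exception once every replica call has failed.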

[~nkeywal] could you please take a close look? 

> Cache invalidation improvements from client side
> ------------------------------------------------
>                 Key: HBASE-10701
>                 URL: https://issues.apache.org/jira/browse/HBASE-10701
>             Project: HBase
>          Issue Type: Sub-task
>            Reporter: Enis Soztutar
>            Assignee: Enis Soztutar
>             Fix For: hbase-10070
>         Attachments: hbase-10701_v1.patch, hbase-10701_v2.patch
> Running the integration test in HBASE-10572 and HBASE-10355, it seems that we need some
> changes to the client-side cache invalidation of meta entries in backup RPCs.
> Mainly, the RPCs made for replicas should not invalidate the cache for all the replicas
> (for example, on RegionMovedException, connection errors, etc.).
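
As a rough illustration of the per-replica invalidation described above, the sketch below clears a single replica's server while keeping the slot, so the number of replicas stays known. It is a simplified model under stated assumptions, not the HBase client cache: ReplicaLocationCache and its methods are hypothetical names, and a region's locations are reduced to an array of server names indexed by replica id.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical model, not an HBase class.
final class ReplicaLocationCache {
    // region name -> server name per replica id (index 0 is the primary); null = unknown
    private final Map<String, String[]> cache = new ConcurrentHashMap<>();

    void put(String region, String... serversByReplicaId) {
        cache.put(region, serversByReplicaId.clone());
    }

    /** Clears only the given replica's server; the other replicas stay cached. */
    void invalidateReplica(String region, int replicaId) {
        cache.computeIfPresent(region, (r, servers) -> {
            if (replicaId < 0 || replicaId >= servers.length) {
                return servers;
            }
            String[] copy = servers.clone();
            copy[replicaId] = null; // keep the slot, forget the server
            return copy;
        });
    }

    /** Clears every replica hosted on a dead server, keeping the replica slots. */
    void invalidateServer(String deadServer) {
        cache.replaceAll((r, servers) -> {
            String[] copy = servers.clone();
            for (int i = 0; i < copy.length; i++) {
                if (deadServer.equals(copy[i])) {
                    copy[i] = null;
                }
            }
            return copy;
        });
    }

    /** Returns a copy of the cached servers, or null if the region is not cached. */
    String[] locations(String region) {
        String[] servers = cache.get(region);
        return servers == null ? null : servers.clone();
    }

    /** True when the replica's server is unknown and meta has to be consulted. */
    boolean needsMetaLookup(String region, int replicaId) {
        String[] servers = cache.get(region);
        return servers == null
                || replicaId < 0
                || replicaId >= servers.length
                || servers[replicaId] == null;
    }
}
```

For example, after put("region-a", "rs1", "rs2", "rs3") and invalidateReplica("region-a", 1), the entry still has three slots, and only replica 1 reports that it needs a meta lookup.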

This message was sent by Atlassian JIRA
