hbase-user mailing list archives

From Ted Yu <yuzhih...@gmail.com>
Subject Re: Recovery failure during single Get()
Date Sat, 06 Apr 2013 21:05:21 GMT
Thanks for the analysis.

I left some comments on HBASE-8285.

On Sat, Apr 6, 2013 at 1:36 PM, Varun Sharma <varun@pinterest.com> wrote:

> Hi,
> We have been observing this bug for a while when we use the HTable.get()
> operation to do a single Get call through the "Result get(Get get)" API,
> and I thought it best to bring it up.
> Steps to reproduce this bug:
> 1) Gracefully restart a region server, causing its regions to be
> redistributed.
> 2) Client calls to a moved region keep failing, since the META cache is
> never purged on the client for the region that moved.
> Reason behind the bug:
> 1) Client continues to hit the old region server.
> 2) The old region server throws NotServingRegionException, which is not
> handled correctly: the META cache entries for that server are never purged,
> so the client keeps hitting the old server.
> The reason lies in the ServerCallable code, which purges META cache entries
> only on a RetriesExhaustedException, SocketTimeoutException or
> ConnectException. There is no corresponding check for
> NotServingRegionException.
> Why is this not a problem for Scan(s) and Put(s)?
> a) If a region server is not hosting a region/scanner, then an
> UnknownScannerException is thrown which causes a relocateRegion() call
> causing a refresh of the META cache for that particular region.
> b) For put(s), the processBatchCallback() interface in HConnectionManager
> is used which clears out META cache entries for all kinds of exceptions
> except DoNotRetryException.
> I created HBASE-8285 for this.
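
The failure mode described above can be reproduced in a small standalone simulation. This is only a sketch and does not use any HBase classes: MetaCacheSketch, the two *SketchException types, purgesAsReported() and purgesWithFix() are hypothetical names made up for illustration, and the retry loop is a simplification of what the report attributes to ServerCallable. It assumes the fix is simply to treat NotServingRegionException as one more reason to purge the cached location.

```java
import java.net.ConnectException;
import java.net.SocketTimeoutException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-ins for the HBase exceptions named in the report.
class NotServingRegionSketchException extends Exception {
    NotServingRegionSketchException(String msg) { super(msg); }
}

class RetriesExhaustedSketchException extends Exception {
    RetriesExhaustedSketchException(String msg) { super(msg); }
}

// Hypothetical client-side location cache: region name -> server name.
class MetaCacheSketch {
    private final Map<String, String> cache = new HashMap<>();
    private final Map<String, String> meta; // authoritative region assignments

    MetaCacheSketch(Map<String, String> meta) { this.meta = meta; }

    // Return the cached location, reading the assignments only on a cache miss.
    String locate(String region) {
        return cache.computeIfAbsent(region, meta::get);
    }

    void purge(String region) { cache.remove(region); }

    // Purge conditions as described in the report.
    static boolean purgesAsReported(Exception e) {
        return e instanceof RetriesExhaustedSketchException
            || e instanceof SocketTimeoutException
            || e instanceof ConnectException;
    }

    // Assumed fix: also purge when the server says it no longer hosts the region.
    static boolean purgesWithFix(Exception e) {
        return purgesAsReported(e) || e instanceof NotServingRegionSketchException;
    }

    // Simulated single Get. Returns the attempt that succeeded, or -1 if all failed.
    int get(String region, int maxAttempts, boolean withFix) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            String server = locate(region);
            try {
                // The contacted server rejects the call if the region has moved away.
                if (!server.equals(meta.get(region))) {
                    throw new NotServingRegionSketchException(region + " is not on " + server);
                }
                return attempt;
            } catch (Exception e) {
                boolean purge = withFix ? purgesWithFix(e) : purgesAsReported(e);
                if (purge) {
                    purge(region); // the next attempt re-reads the assignment
                }
            }
        }
        return -1;
    }
}
```

In this model, once a region moves, get(region, n, false) fails on every attempt because the stale entry is never removed, while get(region, n, true) fails once, purges the entry and succeeds on the second attempt.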
