hbase-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Sean Sechrist (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-3686) Scanner timeout on RegionServer but Client won't know what happened
Date Wed, 23 Mar 2011 15:16:05 GMT

    [ https://issues.apache.org/jira/browse/HBASE-3686?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13010169#comment-13010169

Sean Sechrist commented on HBASE-3686:

It was an old customization that had been reverted on the server but not in the client's config.

It would be hard to return a better message in the UnknownScannerException, because the region
server doesn't know whether it was a scanner whose lease expired or if it is a genuine unknown

So, I think the scan telling the server the lease period does seem like the best bet.

> Scanner timeout on RegionServer but Client won't know what happened
> -------------------------------------------------------------------
>                 Key: HBASE-3686
>                 URL: https://issues.apache.org/jira/browse/HBASE-3686
>             Project: HBase
>          Issue Type: Bug
>          Components: client
>    Affects Versions: 0.89.20100924
>            Reporter: Sean Sechrist
>            Priority: Minor
> This can cause rows to be lost from a scan.
> See this thread where the issue was brought up: http://search-hadoop.com/m/xITBQ136xGJ1
> If hbase.regionserver.lease.period is higher on the client than the server we can get
this series of events: 
> 1. Client is scanning along happily, and does something slow.
> 2. Scanner times out on region server
> 3. Client calls HTable.ClientScanner.next()
> 4. The region server throws an UnknownScannerException
> 5. Client catches exception and sees that it's not longer then it's hbase.regionserver.lease.period
config, so it doesn't throw a ScannerTimeoutException. Instead, it treats it like a NSRE.
> Right now the workaround is to make sure the configs are consistent. 
> A possible fix would be to use whatever the region server's scanner timeout is, rather
than the local one.

This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

View raw message