hbase-issues mailing list archives

From "Jean-Daniel Cryans (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HBASE-2481) Client is not getting UnknownScannerExceptions; they are being eaten
Date Fri, 23 Apr 2010 23:28:51 GMT

    [ https://issues.apache.org/jira/browse/HBASE-2481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12860417#action_12860417 ]

Jean-Daniel Cryans commented on HBASE-2481:
-------------------------------------------

This was caused by HBASE-1671, which changed ScannerCallable as follows:

{code} 
   public Result [] call() throws IOException {
     if (scannerId != -1L && closed) {
-      server.close(scannerId);
-      scannerId = -1L;
+      close();
     } else if (scannerId == -1L && !closed) {
-      // open the scanner
-      scannerId = openScanner();
+      this.scannerId = openScanner();
     } else {
-      Result [] rrs = server.next(scannerId, caching);
+      Result [] rrs = null;
+      try {
+        rrs = server.next(scannerId, caching);
+      } catch (IOException e) {
+        IOException ioe = null;
+        if (e instanceof RemoteException) {
+          ioe = RemoteExceptionHandler.decodeRemoteException((RemoteException)e);
+        }
+        if (ioe != null && ioe instanceof NotServingRegionException) {
+          // Throw a DNRE so that we break out of cycle of calling NSRE
+          // when what we need is to open scanner against new location.
+          // Attach NSRE to signal client that it needs to resetup scanner.
+          throw new DoNotRetryIOException("Reset scanner", ioe);
+        }
+      }
       return rrs == null || rrs.length == 0? null: rrs;
     }
      
{code} 

We now eat the exception if it's not an NSRE. Throwing it when the exception is a DoNotRetryIOException
is the right thing to do, but the client code is still broken. In HTable.ClientScanner.next:


{code} 
          try {
            // Server returns a null values if scanning is to stop. Else, 
            // returns an empty array if scanning is to go on and we've just 
            // exhausted current region. 
            values = getConnection().getRegionServerWithRetries(callable); 
            if (skipFirst) { 
              skipFirst = false; 
              // Reget. 
              values = getConnection().getRegionServerWithRetries(callable); 
            } 
          } catch (DoNotRetryIOException e) { 
            Throwable cause = e.getCause(); 
            if (cause == null || !(cause instanceof NotServingRegionException)) { 
              throw e; 
            } 
            // Else, its signal from depths of ScannerCallable that we got an 
            // NSRE on a next and that we need to reset the scanner. 
            if (this.lastResult != null) { 
              this.scan.setStartRow(this.lastResult.getRow()); 
              // Skip first row returned. We already let it out on previous 
              // invocation. 
              skipFirst = true; 
            } 
            // Clear region 
            this.currentRegion = null; 
            continue; 
          } catch (IOException e) { 
            if (e instanceof UnknownScannerException && 
                lastNext + scannerTimeout < System.currentTimeMillis()) { 
              ScannerTimeoutException ex = new ScannerTimeoutException(); 
              ex.initCause(e); 
              throw ex; 
            } 
            throw e; 
          } 
{code} 

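We catch DoNotRetryIOException first, and only in the second catch clause do we check for
UnknownScannerException. But UnknownScannerException extends DoNotRetryIOException, so the first
clause always handles it and rethrows it as-is (its cause is not an NSRE). The ScannerTimeoutException
branch is therefore never reached! Easy fix.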
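
To make the catch-ordering problem concrete, here is a minimal, self-contained sketch. The exception classes are hypothetical stand-ins that mirror only the inheritance described above (they are not the real HBase classes), and the methods return an outcome instead of throwing so the two orderings are easy to compare. The second method shows one possible fix, assuming we simply test for UnknownScannerException inside the DoNotRetryIOException clause; it is not necessarily what the final patch will look like.

```java
import java.io.IOException;

// Hypothetical stand-ins: only the class hierarchy matters for this sketch.
class DoNotRetryIOException extends IOException {
  DoNotRetryIOException(String msg) { super(msg); }
  DoNotRetryIOException(String msg, Throwable cause) { super(msg, cause); }
}

class UnknownScannerException extends DoNotRetryIOException {
  UnknownScannerException(String msg) { super(msg); }
}

class NotServingRegionException extends IOException {
  NotServingRegionException(String msg) { super(msg); }
}

class CatchOrderSketch {
  /** What the client would do with one failed next() call. */
  enum Outcome { RESET_SCANNER, SCANNER_TIMEOUT, RETHROW }

  // Same clause ordering as the ClientScanner.next excerpt above.
  static Outcome currentOrdering(IOException thrown, boolean timedOut) {
    try {
      throw thrown;
    } catch (DoNotRetryIOException e) {
      if (e.getCause() instanceof NotServingRegionException) {
        return Outcome.RESET_SCANNER;
      }
      // UnknownScannerException lands here and is rethrown unchanged.
      return Outcome.RETHROW;
    } catch (IOException e) {
      // Never true in practice: an UnknownScannerException is always
      // caught by the clause above, so it cannot reach this one.
      if (e instanceof UnknownScannerException && timedOut) {
        return Outcome.SCANNER_TIMEOUT;
      }
      return Outcome.RETHROW;
    }
  }

  // One possible fix (an assumption): check for the timeout case first,
  // inside the clause that actually receives UnknownScannerException.
  static Outcome fixedOrdering(IOException thrown, boolean timedOut) {
    try {
      throw thrown;
    } catch (DoNotRetryIOException e) {
      if (e instanceof UnknownScannerException && timedOut) {
        return Outcome.SCANNER_TIMEOUT;
      }
      if (e.getCause() instanceof NotServingRegionException) {
        return Outcome.RESET_SCANNER;
      }
      return Outcome.RETHROW;
    } catch (IOException e) {
      return Outcome.RETHROW;
    }
  }

  public static void main(String[] args) {
    IOException use = new UnknownScannerException("lease expired");
    System.out.println(currentOrdering(use, true)); // prints RETHROW
    System.out.println(fixedOrdering(use, true));   // prints SCANNER_TIMEOUT
  }
}
```

In both versions a DoNotRetryIOException wrapping an NSRE still resets the scanner; only the handling of a timed-out UnknownScannerException differs.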

> Client is not getting UnknownScannerExceptions; they are being eaten
> --------------------------------------------------------------------
>
>                 Key: HBASE-2481
>                 URL: https://issues.apache.org/jira/browse/HBASE-2481
>             Project: Hadoop HBase
>          Issue Type: Bug
>    Affects Versions: 0.20.4
>            Reporter: stack
>            Priority: Blocker
>
> This was reported by mudphone on IRC and confirmed by myself in a quick test.  If the client
> takes too long going back to the RS, the RS will throw an UnknownScannerException but it doesn't
> get back to the client.  Instead, the client scan silently ends.  Marking this a blocker.  It's
> actually in 0.20.4.  That's what I was testing.  Mayhaps an RC sinker?
