hbase-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Jean-Daniel Cryans (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HBASE-3205) TableRecordReaderImpl.restart NPEs when first next is restarted
Date Mon, 08 Nov 2010 21:36:08 GMT

    [ https://issues.apache.org/jira/browse/HBASE-3205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12929746#action_12929746
] 

Jean-Daniel Cryans commented on HBASE-3205:
-------------------------------------------

It's a very simple fix, will commit in a few minutes if no objections:

{code}
+      if (lastRow == null) {
+        LOG.warn("We are restarting the first next() invocation," +
+            " if your mapper's restarted a few other times like this" +
+            " then you should consider killing this job and investigate" +
+            " why it's taking so long.");
+        lastRow = scan.getStartRow();
+      }
{code}

> TableRecordReaderImpl.restart NPEs when first next is restarted
> ---------------------------------------------------------------
>
>                 Key: HBASE-3205
>                 URL: https://issues.apache.org/jira/browse/HBASE-3205
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.89.20100924
>            Reporter: Jean-Daniel Cryans
>            Assignee: Jean-Daniel Cryans
>            Priority: Critical
>             Fix For: 0.90.0
>
>
> We got this pretty interesting NPE out of TableRecordReaderImpl.restart on a job that
was filtering more than 99% of the data from a very huge table with caching set to 10k:
> {noformat}
> 2010-11-08 13:08:22,344 DEBUG org.apache.hadoop.hbase.mapreduce.TableRecordReader:
>  recovered from org.apache.hadoop.hbase.client.ScannerTimeoutException:
>  61521ms passed since the last invocation, timeout is currently set to 60000
> 	at org.apache.hadoop.hbase.client.HTable$ClientScanner.next(HTable.java:956)
> 	at org.apache.hadoop.hbase.mapreduce.TableRecordReaderImpl.nextKeyValue(TableRecordReaderImpl.java:132)
> 	at org.apache.hadoop.hbase.mapreduce.TableRecordReader.nextKeyValue(TableRecordReader.java:142)
> ...
> 	at org.apache.hadoop.mapred.Child.main(Child.java:170)
> Caused by: org.apache.hadoop.hbase.UnknownScannerException: org.apache.hadoop.hbase.UnknownScannerException:
>  Scanner was closed (timed out?) after we renewed it. Could be caused by a very slow
scanner or a lengthy garbage collection
> 	at org.apache.hadoop.hbase.regionserver.HRegion$RegionScanner.next(HRegion.java:2233)
> 	at org.apache.hadoop.hbase.regionserver.HRegion$RegionScanner.next(HRegion.java:2260)
> 	at org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:1933)
> ...
> 	at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:81)
> 	at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:37)
> 	at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.getRegionServerWithRetries(HConnectionManager.java:1138)
> 	at org.apache.hadoop.hbase.client.HTable$ClientScanner.next(HTable.java:942)
> 	... 8 more
> 2010-11-08 13:08:22,347 WARN org.apache.hadoop.mapred.TaskTracker: Error running child
> java.lang.NullPointerException
> 	at org.apache.hadoop.hbase.util.Bytes.toStringBinary(Bytes.java:301)
> 	at org.apache.hadoop.hbase.client.HTable$ClientScanner.<init>(HTable.java:803)
> 	at org.apache.hadoop.hbase.client.HTable.getScanner(HTable.java:484)
> 	at org.apache.hadoop.hbase.mapreduce.TableRecordReaderImpl.restart(TableRecordReaderImpl.java:58)
> 	at org.apache.hadoop.hbase.mapreduce.TableRecordReaderImpl.nextKeyValue(TableRecordReaderImpl.java:135)
> ...
> 	at org.apache.hadoop.mapred.Child.main(Child.java:170)
> 2010-11-08 13:08:22,349 INFO org.apache.hadoop.mapred.TaskRunner
> {noformat}
> This is because the last row key we saw is set to null, since we haven't seen any yet
:)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


Mime
View raw message