hbase-issues mailing list archives

From "James Kennedy (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HBASE-3524) NPE from CompactionChecker
Date Fri, 11 Feb 2011 04:45:57 GMT

    [ https://issues.apache.org/jira/browse/HBASE-3524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12993341#comment-12993341 ]

James Kennedy commented on HBASE-3524:

This patch obviously stops the NPE and allows compaction checking to run to completion.

Furthermore, I added a log line that reports which stores have .timeRangeTracker == null
when they are encountered. It seems that 7 or 8 tables (out of 50) had this problem, and when
I forced a major compaction on them from the hbase shell they stopped reporting the error.

So it looks like the major compactions created new stores with timeRangeTracker properly set.
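
A minimal sketch of the kind of null check and log line described above. It is not the actual patch: StoreSketch, TimeRangeTrackerSketch and NullTrackerGuard are hypothetical stand-ins for illustration, not HBase classes, and the log wording is assumed.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a time range tracker; not the HBase class. */
class TimeRangeTrackerSketch {
    final long maximumTimestamp;

    TimeRangeTrackerSketch(long maximumTimestamp) {
        this.maximumTimestamp = maximumTimestamp;
    }
}

/** Hypothetical stand-in for a store whose tracker may be missing. */
class StoreSketch {
    final String name;
    final TimeRangeTrackerSketch timeRangeTracker; // null reproduces the reported problem

    StoreSketch(String name, TimeRangeTrackerSketch timeRangeTracker) {
        this.name = name;
        this.timeRangeTracker = timeRangeTracker;
    }
}

class NullTrackerGuard {
    /**
     * Returns the names of stores whose tracker is null, logging each one
     * instead of dereferencing the tracker and throwing an NPE.
     */
    static List<String> findStoresMissingTracker(List<StoreSketch> stores) {
        List<String> missing = new ArrayList<>();
        for (StoreSketch store : stores) {
            if (store.timeRangeTracker == null) {
                System.err.println("timeRangeTracker == null for store " + store.name);
                missing.add(store.name);
            }
        }
        return missing;
    }
}
```

The names returned by such a check would be the candidates for a forced major compaction.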

I'm still concerned though about how this happened in the first place and I need to do some
thorough testing of the data to ensure nothing was lost.

Ryan, in your opinion, is this data likely to have survived without corruption?

And thanks for your speedy help.

> NPE from CompactionChecker
> --------------------------
>                 Key: HBASE-3524
>                 URL: https://issues.apache.org/jira/browse/HBASE-3524
>             Project: HBase
>          Issue Type: Bug
>            Reporter: James Kennedy
>             Fix For: 0.90.2
> I recently updated production data to use HBase 0.90.0.
> Now I'm periodically seeing:
> [10/02/11 17:23:27] 30076066 [mpactionChecker] ERROR nServer$MajorCompactionChecker - Caught exception
> java.lang.NullPointerException
> 	at org.apache.hadoop.hbase.regionserver.Store.isMajorCompaction(Store.java:832)
> 	at org.apache.hadoop.hbase.regionserver.Store.isMajorCompaction(Store.java:810)
> 	at org.apache.hadoop.hbase.regionserver.HRegion.isMajorCompaction(HRegion.java:2800)
> 	at org.apache.hadoop.hbase.regionserver.HRegionServer$MajorCompactionChecker.chore(HRegionServer.java:1047)
> 	at org.apache.hadoop.hbase.Chore.run(Chore.java:66)
> The only negative effect is that this is preventing compactions from happening. But
> that is pretty serious, and this might be a sign of data corruption?
> Maybe it's just my data, but this task should at least involve improving the handling
> to catch the NPE and still iterate through the other onlineRegions that might compact without
> error. The MajorCompactionChecker.chore() method only catches IOExceptions, so this NPE
> breaks out of that loop.
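
The handling suggested in the description could look roughly like the sketch below: catch the failure per region so that one bad region does not end the whole pass. This is an illustration under stated assumptions, not the HBase code; RegionSketch and CompactionCheckerSketch are hypothetical names, and only the idea of iterating over the online regions and calling isMajorCompaction() comes from the report.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a region; not HBase's HRegion. */
interface RegionSketch {
    String name();

    boolean isMajorCompaction() throws IOException;
}

class CompactionCheckerSketch {
    /**
     * Checks every online region. A failure in one region, including an
     * unchecked exception such as an NPE, is logged and the loop moves on.
     */
    static List<String> regionsNeedingMajorCompaction(List<RegionSketch> onlineRegions) {
        List<String> toCompact = new ArrayList<>();
        for (RegionSketch region : onlineRegions) {
            try {
                if (region.isMajorCompaction()) {
                    toCompact.add(region.name());
                }
            } catch (IOException | RuntimeException e) {
                // Catching only IOException here would let an NPE abort the loop.
                System.err.println("Skipping region " + region.name() + ": " + e);
            }
        }
        return toCompact;
    }
}
```

With the try/catch inside the loop, regions after the failing one are still checked. This is the behavior the description asks for.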


