hbase-issues mailing list archives

From "James Kennedy (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HBASE-3524) NPE from CompactionChecker
Date Fri, 11 Feb 2011 19:34:58 GMT

    [ https://issues.apache.org/jira/browse/HBASE-3524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12993653#comment-12993653 ]

James Kennedy commented on HBASE-3524:

So that .meta file with the DATA LOSS warning is definitely old (2010-05-20).
Looking back over old logs, I realized that the DATA LOSS WARN has been there for a while,
so it is probably a separate issue from this CompactionChecker problem.
Guess I'll just delete the file in HDFS.

So, it looks like my data is stable now after the forced compactions. I didn't have to apply
the patch in production code to stop the NPEs.

I'm still concerned about why this happened to some regions and not others. All were
left up long enough to reach the NPE point, yet the NPE only prevented the first full
compactions after the 0.90.0 upgrade for 8 out of 50 tables. Maybe the other 42 were
updated as part of the initial startup process...

> NPE from CompactionChecker
> --------------------------
>                 Key: HBASE-3524
>                 URL: https://issues.apache.org/jira/browse/HBASE-3524
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>    Affects Versions: 0.90.0
>            Reporter: James Kennedy
>            Assignee: James Kennedy
>            Priority: Blocker
>             Fix For: 0.90.1, 0.90.2
> I recently updated production data to use HBase 0.90.0.
> Now I'm periodically seeing:
> [10/02/11 17:23:27] 30076066 [mpactionChecker] ERROR nServer$MajorCompactionChecker - Caught exception
> java.lang.NullPointerException
> 	at org.apache.hadoop.hbase.regionserver.Store.isMajorCompaction(Store.java:832)
> 	at org.apache.hadoop.hbase.regionserver.Store.isMajorCompaction(Store.java:810)
> 	at org.apache.hadoop.hbase.regionserver.HRegion.isMajorCompaction(HRegion.java:2800)
> 	at org.apache.hadoop.hbase.regionserver.HRegionServer$MajorCompactionChecker.chore(HRegionServer.java:1047)
> 	at org.apache.hadoop.hbase.Chore.run(Chore.java:66)
> The only negative effect is that this prevents compactions from happening. But
> that is pretty serious, and it might be a sign of data corruption?
> Maybe it's just my data, but this task should at least involve improving the handling
> to catch the NPE and still iterate through the other onlineRegions that might compact without
> error.  The MajorCompactionChecker.chore() method only catches IOExceptions and so this NPE
> breaks out of that loop.
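
A minimal sketch of the handling suggested in the description, assuming a simplified model rather than the actual HBase classes: the `Region` interface, the `fixedRegion`/`brokenRegion` helpers, and `regionsNeedingMajorCompaction` are hypothetical names introduced only for illustration. It shows the try/catch placed inside the loop so that an unchecked exception from one region does not stop the check for the others.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class CompactionCheckSketch {

    /** Hypothetical stand-in for an online region; not the real HRegion API. */
    interface Region {
        String name();
        boolean isMajorCompaction() throws IOException;
    }

    /**
     * Checks every region and returns the names of those due for major compaction.
     * A failure in one region, checked or unchecked, is logged and that region is
     * skipped, so the remaining regions are still examined.
     */
    static List<String> regionsNeedingMajorCompaction(List<Region> onlineRegions) {
        List<String> due = new ArrayList<>();
        for (Region r : onlineRegions) {
            if (r == null) {
                continue;
            }
            try {
                if (r.isMajorCompaction()) {
                    due.add(r.name());
                }
            } catch (IOException | RuntimeException e) {
                // Catching per region keeps an NPE from breaking out of the loop.
                System.err.println("Compaction check failed for " + r.name() + ": " + e);
            }
        }
        return due;
    }

    /** Hypothetical helper: a region whose check returns a fixed answer. */
    static Region fixedRegion(final String name, final boolean due) {
        return new Region() {
            public String name() { return name; }
            public boolean isMajorCompaction() { return due; }
        };
    }

    /** Hypothetical helper: a region whose check throws a simulated NPE. */
    static Region brokenRegion(final String name) {
        return new Region() {
            public String name() { return name; }
            public boolean isMajorCompaction() {
                throw new NullPointerException("simulated failure");
            }
        };
    }

    public static void main(String[] args) {
        List<Region> regions = new ArrayList<>();
        regions.add(fixedRegion("a", true));
        regions.add(brokenRegion("b"));
        regions.add(fixedRegion("c", true));
        System.out.println(regionsNeedingMajorCompaction(regions)); // prints [a, c]
    }
}
```

Catching RuntimeException here only keeps the checker running; it would not address whatever left the store in a state that triggers the NPE in the first place.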
