hbase-dev mailing list archives

From "Jim Kellerman (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HBASE-236) [hbase] NPE on failed open of region
Date Mon, 04 Feb 2008 19:31:08 GMT

    [ https://issues.apache.org/jira/browse/HBASE-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12565487#action_12565487 ]

Jim Kellerman commented on HBASE-236:
-------------------------------------

I think this could still happen if a region server crashes in the middle of a cache flush
(which could leave a zero-length MapFile or info file).

We could put some defensive measures in HStore to prevent the problem, and if HLog does not
guard against zero-length files, it should as well.
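
As a rough illustration of the kind of defensive measure meant here, the sketch below checks a file's length before reading its first byte, so an empty file is reported as "nothing to read" instead of surfacing as an EOFException. It is only a sketch under assumptions: it uses the local filesystem through java.nio rather than the Hadoop FileSystem API, and InfoFileGuard and readFirstByte are hypothetical names, not HBase code.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.OptionalInt;

// Hypothetical helper, not part of HBase: guards a read against empty files.
class InfoFileGuard {
    /**
     * Returns the first byte of the file as an unsigned value (0..255),
     * or an empty OptionalInt if the file is missing or zero length.
     */
    static OptionalInt readFirstByte(Path infoFile) throws IOException {
        // Check the length up front instead of letting the read throw.
        if (!Files.isRegularFile(infoFile) || Files.size(infoFile) == 0) {
            return OptionalInt.empty();
        }
        try (DataInputStream in = new DataInputStream(Files.newInputStream(infoFile))) {
            return OptionalInt.of(in.readUnsignedByte());
        }
    }
}
```

A caller could then decide whether to skip the store file, attempt recovery, or fail the open with a clear message when the result is empty.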

Finally, hbase-fsck (or whatever we call it) should check for zero length files.
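
A zero-length check of this sort could look roughly like the following. This is a minimal sketch, assuming a local directory tree walked with java.nio; a real tool would go through the Hadoop FileSystem API instead, and ZeroLengthFileCheck and findZeroLengthFiles are hypothetical names.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

// Hypothetical fsck-style check, not part of HBase.
class ZeroLengthFileCheck {
    /** Returns every regular file under root whose length is zero. */
    static List<Path> findZeroLengthFiles(Path root) throws IOException {
        List<Path> empty = new ArrayList<>();
        // Files.walk holds directory handles open, so close the stream when done.
        try (Stream<Path> paths = Files.walk(root)) {
            for (Path p : (Iterable<Path>) paths::iterator) {
                if (Files.isRegularFile(p) && Files.size(p) == 0) {
                    empty.add(p);
                }
            }
        }
        return empty;
    }
}
```

The returned list only reports suspects; deciding whether to delete or rebuild each file is left to the caller.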

> [hbase] NPE on failed open of region
> ------------------------------------
>
>                 Key: HBASE-236
>                 URL: https://issues.apache.org/jira/browse/HBASE-236
>             Project: Hadoop HBase
>          Issue Type: Bug
>            Reporter: stack
>
> From Bryan Duxbury supplied log:
> {code}
>    1044 2007-12-15 04:37:56,052 INFO org.apache.hadoop.hbase.HRegionServer: MSG_REGION_OPEN : spider_pages,7_202623541,1197662034823
>    1045 2007-12-15 04:37:56,060 ERROR org.apache.hadoop.hbase.HRegionServer: error opening region spider_pages,7_202623541,1197662034823
>    1046 java.io.EOFException
>    1047     at java.io.DataInputStream.readByte(DataInputStream.java:250)
>    1048     at org.apache.hadoop.hbase.HStoreFile.loadInfo(HStoreFile.java:594)
>    1049     at org.apache.hadoop.hbase.HStore.<init>(HStore.java:613)
>    1050     at org.apache.hadoop.hbase.HRegion.<init>(HRegion.java:287)
>    1051     at org.apache.hadoop.hbase.HRegionServer.openRegion(HRegionServer.java:1182)
>    1052     at org.apache.hadoop.hbase.HRegionServer$Worker.run(HRegionServer.java:1133)
>    1053     at java.lang.Thread.run(Thread.java:619)
>    1054 2007-12-15 04:37:56,061 FATAL org.apache.hadoop.hbase.HRegionServer: Unhandled exception
>    1055 java.lang.NullPointerException
>    1056     at org.apache.hadoop.hbase.HRegionServer.reportClose(HRegionServer.java:1066)
>    1057     at org.apache.hadoop.hbase.HRegionServer.openRegion(HRegionServer.java:1188)
>    1058     at org.apache.hadoop.hbase.HRegionServer$Worker.run(HRegionServer.java:1133)
>    1059     at java.lang.Thread.run(Thread.java:619)
>    1060 2007-12-15 04:37:56,061 INFO org.apache.hadoop.hbase.HRegionServer: worker thread exiting
> {code}
> I see the same exception when we try to deploy the same region on another server; the info file
> must be horked. (Seems like something we could recover from by reading through the store files
> looking for the highest sequence number; it would be expensive, but the alternative is a lost region.)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

