hbase-issues mailing list archives

From "nkeywal (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-6461) Killing the HRegionServer and DataNode hosting ROOT can result in a malformed root table.
Date Fri, 27 Jul 2012 13:58:34 GMT

    [ https://issues.apache.org/jira/browse/HBASE-6461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13423883#comment-13423883 ]

nkeywal commented on HBASE-6461:

Thanks a lot!

bq. I thought HBase neither uses nor needs "append" (the reason for initially using the append branch was that it also had the sync patches).
Yes, the log line is actually confusing: you just need someone writing a file while someone else is reading it.

bq. Is this only a special case for the WAL when we start to replay the edits, when the file is not closed yet because the writer died?
I think so, although I wonder whether we could hit something similar with large store HFiles after a crash, even if on paper it should be ok.
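
To make the "writer died before closing the file" case concrete, here is a minimal sketch of a reader that replays only the complete records it can see and treats a truncated tail as the end of the log. Everything in it is an assumption for illustration: the class name WalTailSketch, the length-prefixed record format and the in-memory byte array standing in for the file are hypothetical, and this is not the HLog format or the HDFS client API.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class WalTailSketch {

    // Hypothetical record format: a 4-byte length followed by the payload bytes.
    public static byte[] writeRecords(List<String> edits) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            for (String edit : edits) {
                byte[] payload = edit.getBytes(StandardCharsets.UTF_8);
                out.writeInt(payload.length);
                out.write(payload);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Returns every record that is fully present in the visible bytes.
    // A partial length header or a partial payload at the tail is treated
    // as "the writer died here" and ends the replay without an error.
    public static List<String> readCompleteRecords(byte[] visible) {
        List<String> edits = new ArrayList<>();
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(visible));
        while (true) {
            try {
                int length = in.readInt();
                byte[] payload = new byte[length];
                in.readFully(payload);
                edits.add(new String(payload, StandardCharsets.UTF_8));
            } catch (EOFException e) {
                return edits; // truncated tail: stop at the last complete record
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
    }

    public static void main(String[] args) {
        byte[] full = writeRecords(Arrays.asList("put row1", "put row2", "put row3"));
        // Simulate a crash: the last 3 bytes never became visible to the reader.
        byte[] visible = Arrays.copyOf(full, full.length - 3);
        System.out.println(readCompleteRecords(visible));
    }
}
```

The sketch only covers a cleanly truncated tail. A real reader would also have to cope with a garbage length field, and with a reader that sees a stale file length for a file that is still open.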

> Killing the HRegionServer and DataNode hosting ROOT can result in a malformed root table.
> -----------------------------------------------------------------------------------------
>                 Key: HBASE-6461
>                 URL: https://issues.apache.org/jira/browse/HBASE-6461
>             Project: HBase
>          Issue Type: Bug
>         Environment: hadoop-0.20.2-cdh3u3
> HBase 0.94.1 RC1
>            Reporter: Elliott Clark
>            Priority: Critical
>             Fix For: 0.94.2
> Spun up a new dfs on hadoop-0.20.2-cdh3u3
> Started hbase
> started running loadtest tool.
> killed rs and dn holding root with killall -9 java on server sv4r27s44 at about 2012-07-25
> After things stabilize Root is in a bad state. Ran hbck and got:
> Exception in thread "main" org.apache.hadoop.hbase.client.NoServerForRegionException: No server address listed in -ROOT- for region .META.,,1.1028785192 containing row 
> at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegionInMeta(HConnectionManager.java:1016)
> at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:841)
> at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:810)
> at org.apache.hadoop.hbase.client.HTable.finishSetup(HTable.java:232)
> at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:172)
> at org.apache.hadoop.hbase.util.HBaseFsck.connect(HBaseFsck.java:241)
> at org.apache.hadoop.hbase.util.HBaseFsck.main(HBaseFsck.java:3236)
> hbase(main):001:0> scan '-ROOT-'
> ROW                                           COLUMN+CELL                           
> 12/07/25 22:43:18 INFO security.UserGroupInformation: JAAS Configuration already set up for Hadoop, not re-installing.
>  .META.,,1                                    column=info:regioninfo, timestamp=1343255838525, value={NAME => '.META.,,1', STARTKEY => '', ENDKEY => '', ENCODED => 1028785192,}
>  .META.,,1                                    column=info:v, timestamp=1343255838525,

> 1 row(s) in 0.5930 seconds
> Here's the master log: https://gist.github.com/3179194
> I tried the same thing with 0.92.1 and I was able to get into a similar situation, so I don't think this is anything new.
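
The scan output quoted above shows a -ROOT- row for .META.,,1 with info:regioninfo and info:v but no info:server cell, which matches the "No server address listed" error from hbck. As a rough sketch of that check, assuming the catalog rows have already been loaded into plain maps (the class CatalogRowCheck and the map layout are hypothetical; this does not use the HBase client API):

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class CatalogRowCheck {

    // Catalog column that carries the hosting region server's address.
    static final String SERVER_COLUMN = "info:server";

    // Returns the row keys whose server cell is missing or empty.
    // Input layout (an assumption of this sketch): row key -> (column -> value).
    public static List<String> rowsWithoutServer(Map<String, Map<String, String>> rows) {
        List<String> missing = new ArrayList<>();
        for (Map.Entry<String, Map<String, String>> row : rows.entrySet()) {
            String server = row.getValue().get(SERVER_COLUMN);
            if (server == null || server.isEmpty()) {
                missing.add(row.getKey());
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Mirrors the two cells visible in the scan; the info:v value is not
        // shown in the report, so a placeholder empty string is used here.
        Map<String, String> cells = new LinkedHashMap<>();
        cells.put("info:regioninfo", "{NAME => '.META.,,1', ENCODED => 1028785192}");
        cells.put("info:v", "");

        Map<String, Map<String, String>> root = new LinkedHashMap<>();
        root.put(".META.,,1", cells);

        System.out.println(rowsWithoutServer(root));
    }
}
```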

This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

