hbase-issues mailing list archives

From "stack (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-4058) Extend TestHBaseFsck with a complete .META. recovery scenario
Date Fri, 08 Jul 2011 21:59:16 GMT

    [ https://issues.apache.org/jira/browse/HBASE-4058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13062215#comment-13062215 ]

stack commented on HBASE-4058:

Dan Harvey, who is still on 0.20.x, had a similar issue this month.  He added four new servers
to his cluster.  These new servers were not resolving properly.  What we were seeing, I believe,
is that on startup these new servers would be assigned their portion of the regions on
checkin.  Then the basescanner would run -- it's 0.20.x hbase -- and it would not recognize
the address the new servers were writing to .META., so it would think the regions unassigned
and would assign them elsewhere.  So we had double-assignment, and at the same time there were
splits and compactions running.  His .META. had holes and overlaps.
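
To make "holes and overlaps" concrete, here is a minimal sketch in Python of how a chain of region key ranges could be checked. It is not hbck code: the function name and the (start_key, end_key) tuple layout are assumptions made for illustration, with the empty key standing for the start or end of the table.

```python
from typing import List, Tuple

# A region is (start_key, end_key). As in .META., an empty start key means
# "beginning of table" and an empty end key means "end of table".
Region = Tuple[bytes, bytes]


def find_holes_and_overlaps(regions: List[Region]) -> Tuple[List[Region], List[Region]]:
    """Return (holes, overlaps) as lists of (from_key, to_key) ranges.

    Hypothetical helper for illustration only; not part of HBase.
    """
    holes: List[Region] = []
    overlaps: List[Region] = []
    ordered = sorted(regions, key=lambda r: r[0])
    if not ordered:
        return holes, overlaps

    # The first region should start at the empty key.
    if ordered[0][0] != b"":
        holes.append((b"", ordered[0][0]))

    covered_to = ordered[0][1]  # furthest end key covered so far
    for start, end in ordered[1:]:
        if covered_to == b"":
            # Already covered to end of table, so this whole region overlaps.
            overlaps.append((start, end))
            continue
        if start > covered_to:
            # Nothing covers the keys between covered_to and start.
            holes.append((covered_to, start))
        elif start < covered_to:
            # This region begins inside a range that is already covered.
            overlap_end = covered_to if end == b"" else min(covered_to, end)
            overlaps.append((start, overlap_end))
        if end == b"" or end > covered_to:
            covered_to = end

    # The last region should run to the end of the table.
    if covered_to != b"":
        holes.append((covered_to, b""))
    return holes, overlaps


if __name__ == "__main__":
    rows = [(b"", b"b"), (b"b", b"d"), (b"c", b"f"), (b"h", b"")]
    holes, overlaps = find_holes_and_overlaps(rows)
    print("holes:", holes)
    print("overlaps:", overlaps)
```

A double-assigned region shows up here as two rows with the same start key, which the sketch reports as an overlap.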

In his case, not all tables were honked.  Just the big ones.  I wonder if an improved add_table.rb
would work in this case; i.e. do the same rewrite of the .META. content for a single table
based off the content in the filesystem rather than trying to fix up the .META. table.

Let me try adding add_table.rb to hbck.  Let me add an option for running per table and then a
global restore-all-tables option.

Dan sent me the .META. dir content.  It looks like this:

-rw-r--r--@ 1 Stack  staff         0 Jul  7 08:26 281906331022358506
-rw-r--r--@ 1 Stack  staff  94283152 Jul  7 08:26 5233066973300534672
-rw-r--r--@ 1 Stack  staff         0 Jul  7 08:26 6803125877105432645
-rw-r--r--@ 1 Stack  staff         0 Jul  7 08:26 8650632001596730954

i.e. three zero-length files.  I wonder how these were written (I asked him for a dir listing
from the actual cluster).
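
A check for such files is easy to script. The following is a minimal sketch in Python, assuming the directory has been copied to a local filesystem as in the listing above; the function name is hypothetical, and on a live cluster the same information would come from an HDFS listing instead.

```python
from pathlib import Path
from typing import List


def zero_length_files(directory: str) -> List[str]:
    """Return the sorted names of regular files in `directory` with size 0.

    Hypothetical helper for illustration; it only inspects a local directory.
    """
    return sorted(
        entry.name
        for entry in Path(directory).iterdir()
        if entry.is_file() and entry.stat().st_size == 0
    )
```

Run against a local copy of the directory listed above, this would be expected to report the three empty files and skip the 94283152-byte one.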

> Extend TestHBaseFsck with a complete .META. recovery scenario
> -------------------------------------------------------------
>                 Key: HBASE-4058
>                 URL: https://issues.apache.org/jira/browse/HBASE-4058
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Andrew Purtell
>             Fix For: 0.92.0
> We should have a unit test that launches a minicluster and constructs a few tables, then
> deletes META files on disk, then bounces the master, then recovers the result with HBCK. Perhaps
> it is possible to extend TestHBaseFsck to do this.
