hbase-issues mailing list archives

From "stack (JIRA)" <j...@apache.org>
Subject [jira] Updated: (HBASE-2482) regions in transition do not get reassigned by master when RS crashes
Date Mon, 03 May 2010 02:19:56 GMT

     [ https://issues.apache.org/jira/browse/HBASE-2482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-2482:
-------------------------

    Attachment: 2482-unittest.txt

First cut at a unit test.  It needs an edit but looks to be working.  It adds a protected 'kill'
method to the regionserver that simulates an RS kill (it does no cleanup other than shutting down
the socket).  It also adds a new HMsg called TEST_MSG_BLOCK_RS.  When an RS receives this from the
master, it just waits until it is closed, aborted or killed, which blocks the worker queue.

The test adds a new RS to a small cluster and waits for the load balancer to move some regions
to the new server.  As soon as some are open, we send a close for all of them, followed by a
TEST_MSG_BLOCK_RS.  The closes go through and the balancer assigns the new server some of the
closed regions, only now the TEST_MSG_BLOCK_RS is in place, so those assignments sit behind it
in the blocked worker queue.
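
To make the ordering concrete, here is a rough, thread-free sketch of a worker queue that
stalls on a blocking message.  It is a toy model, not the code in 2482-unittest.txt: the
class BlockableWorker, its methods, and the "CLOSE"/"OPEN" message strings are all made up
for illustration; only the name TEST_MSG_BLOCK_RS comes from the patch description.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Toy model of a regionserver worker queue (hypothetical, not HBase code).
final class BlockableWorker {
    static final String BLOCK = "TEST_MSG_BLOCK_RS";

    private final Deque<String> queue = new ArrayDeque<>();
    private final List<String> processed = new ArrayList<>();
    private boolean blocked = false;

    // Master side: hand a message to the regionserver.
    void enqueue(String msg) {
        queue.add(msg);
    }

    // Worker side: process messages in arrival order until the queue is
    // empty or the blocking message is reached.
    void drain() {
        while (!blocked && !queue.isEmpty()) {
            String msg = queue.poll();
            if (BLOCK.equals(msg)) {
                blocked = true;      // everything queued after this waits
            } else {
                processed.add(msg);
            }
        }
    }

    // Simulated kill: no cleanup, pending messages are simply dropped.
    void kill() {
        queue.clear();
    }

    List<String> processed() {
        return processed;
    }

    int pending() {
        return queue.size();
    }

    public static void main(String[] args) {
        BlockableWorker rs = new BlockableWorker();
        rs.enqueue("CLOSE r1");
        rs.enqueue("CLOSE r2");
        rs.enqueue(BLOCK);
        rs.enqueue("OPEN r1");   // assignment arriving after the block
        rs.drain();
        System.out.println("processed=" + rs.processed() + " pending=" + rs.pending());
    }
}
```

In this model the closes complete, the later open never runs, and a kill at that point leaves
the region neither closed-and-reassigned nor opened, which is the state the test is after.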

Let me make a patch next that includes Todd's patch and a cleaned-up test.

> regions in transition do not get reassigned by master when RS crashes
> ---------------------------------------------------------------------
>
>                 Key: HBASE-2482
>                 URL: https://issues.apache.org/jira/browse/HBASE-2482
>             Project: Hadoop HBase
>          Issue Type: Bug
>          Components: master
>    Affects Versions: 0.20.5
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>            Priority: Blocker
>             Fix For: 0.20.5, 0.21.0
>
>         Attachments: 2482-unittest.txt, hbase-2482.txt
>
>
> Very similar to HBASE-1928, but for the general case (not just ROOT/META):
> If a region is in transition on a RS when the RS crashes, the master does not remove
> it from regionsInTransition when processing the RS shutdown. This is fairly easy to trigger
> by bringing up a RS and kill -9ing it just as it starts to get regions assigned. Those regions
> will get permanently lost since they're stuck in regionsInTransition and thus don't get assigned
> by the metascanner.
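
As an illustration of the behaviour the description asks for, the following is a minimal
sketch of shutdown processing that also clears in-transition entries for the dead server.
It is not the master's real code nor the attached hbase-2482.txt: the class
TransitionTracker, its methods, and the region-to-server map layout are assumptions made
for the example; only the name regionsInTransition is taken from the issue.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy model of the master's in-transition bookkeeping (hypothetical).
final class TransitionTracker {
    // region name -> server the region is in transition on (assumed layout)
    private final Map<String, String> regionsInTransition = new TreeMap<>();

    void startTransition(String region, String server) {
        regionsInTransition.put(region, server);
    }

    boolean isInTransition(String region) {
        return regionsInTransition.containsKey(region);
    }

    // On server shutdown, remove every entry that points at the dead server
    // and return those regions so they can be assigned again.
    List<String> processServerShutdown(String deadServer) {
        List<String> toReassign = new ArrayList<>();
        Iterator<Map.Entry<String, String>> it =
            regionsInTransition.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, String> entry = it.next();
            if (entry.getValue().equals(deadServer)) {
                toReassign.add(entry.getKey());
                it.remove();
            }
        }
        return toReassign;
    }

    public static void main(String[] args) {
        TransitionTracker master = new TransitionTracker();
        master.startTransition("r1", "rs1");
        master.startTransition("r2", "rs2");
        master.startTransition("r3", "rs1");
        System.out.println("to reassign: " + master.processServerShutdown("rs1"));
    }
}
```

Without the removal step, r1 and r3 in this model would stay marked as in transition
after rs1 dies, which mirrors the stuck state reported above.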

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

