hbase-issues mailing list archives

From "ramkrishna.s.vasudevan (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-4739) Master dying while going to close a region can leave it in transition forever
Date Tue, 15 Nov 2011 06:53:51 GMT

    [ https://issues.apache.org/jira/browse/HBASE-4739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13150265#comment-13150265 ]

ramkrishna.s.vasudevan commented on HBASE-4739:
-----------------------------------------------

@Gao
I have one suggestion.
-> Why don't we issue the close call to the RS from the timeout monitor when the region is
in CLOSING, rather than doing it directly in processRegionInTransition? That would give us
some time to see whether the RS never got the call from the master or was just slow in
processing the close call.
What do you say, Gao?
@Ted
Please suggest.
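
To make the suggestion concrete, here is a rough sketch in plain Java. It is not the actual AssignmentManager code: the class ClosingTimeoutSketch, the RegionServerClient interface, the method signatures and the timeout value are all made up for illustration. Only the idea comes from the comment above, namely that processRegionInTransition just records the CLOSING region, and the timeout monitor is the one that re-issues the close call once the region has stayed in CLOSING for too long.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch only: none of these types are HBase classes.
class ClosingTimeoutSketch {
    enum State { CLOSING, OPENING }

    /** Minimal stand-in for one region-in-transition entry. */
    static final class RegionState {
        final String regionName;
        final String server;
        final State state;
        long stampMs; // time of the last state change or close attempt

        RegionState(String regionName, String server, State state, long stampMs) {
            this.regionName = regionName;
            this.server = server;
            this.state = state;
            this.stampMs = stampMs;
        }
    }

    /** Stand-in for the master-to-RS close call (assumed interface). */
    interface RegionServerClient {
        void sendCloseRegion(String server, String regionName);
    }

    private final Map<String, RegionState> regionsInTransition = new HashMap<>();
    private final RegionServerClient client;
    private final long timeoutMs;

    ClosingTimeoutSketch(RegionServerClient client, long timeoutMs) {
        this.client = client;
        this.timeoutMs = timeoutMs;
    }

    /** On master restart: record the CLOSING znode, but do not call the RS yet. */
    void processRegionInTransition(String regionName, String server, long nowMs) {
        regionsInTransition.put(regionName,
                new RegionState(regionName, server, State.CLOSING, nowMs));
    }

    /** The RS reported the region closed, so it leaves transition. */
    void onRegionClosed(String regionName) {
        regionsInTransition.remove(regionName);
    }

    /** Periodic check: re-issue the close for regions stuck in CLOSING. */
    List<String> timeoutMonitorChore(long nowMs) {
        List<String> resent = new ArrayList<>();
        for (RegionState rs : regionsInTransition.values()) {
            if (rs.state == State.CLOSING && nowMs - rs.stampMs >= timeoutMs) {
                client.sendCloseRegion(rs.server, rs.regionName);
                rs.stampMs = nowMs; // restart the clock for the next check
                resent.add(rs.regionName);
            }
        }
        return resent;
    }
}
```

With this shape, an RS that was merely slow gets the full timeout to finish the close and report back before the master calls it again, while an RS that never received the call gets it on the first timeout instead of the region sitting in transition forever.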


                
> Master dying while going to close a region can leave it in transition forever
> -----------------------------------------------------------------------------
>
>                 Key: HBASE-4739
>                 URL: https://issues.apache.org/jira/browse/HBASE-4739
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.90.4
>            Reporter: Jean-Daniel Cryans
>            Assignee: gaojinchao
>            Priority: Minor
>             Fix For: 0.92.0, 0.94.0, 0.90.5
>
>         Attachments: HBASE-4739_Trunk.patch, HBASE-4739_Trunk_V2.patch
>
>
> I saw this in the aftermath of HBASE-4729 on a 0.92 refreshed yesterday, when the master
> died it had just created the RIT znode for a region but didn't tell the RS to close it yet.
> When the master restarted it saw the znode and started printing this:
> {quote}
> 2011-11-03 00:02:49,130 INFO org.apache.hadoop.hbase.master.AssignmentManager: Regions in transition timed out:  TestTable,0007560564,1320253568406.f76899564cabe7e9857c3aeb526ec9dc. state=CLOSING, ts=1320253605285, server=sv4r11s38,62003,1320195046948
> 2011-11-03 00:02:49,130 INFO org.apache.hadoop.hbase.master.AssignmentManager: Region has been CLOSING for too long, this should eventually complete or the server will expire, doing nothing
> {quote}
> It's never going to happen, and it's blocking balancing.
> I'm marking this as minor since I believe this situation is pretty rare unless you hit
> other bugs while trying out stuff to root bugs out.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
