hbase-issues mailing list archives

From "chunhui shen (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-7504) -ROOT- may be offline forever after FullGC of RS
Date Wed, 09 Jan 2013 05:38:13 GMT

    [ https://issues.apache.org/jira/browse/HBASE-7504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13547660#comment-13547660 ]

chunhui shen commented on HBASE-7504:

bq. if root is assigned in other live RS
That is not the normal case. In the common case, we assign ROOT in ServerShutdownHandler#verifyAndAssignRoot,
which means we execute the first if block.
Besides, server.getCatalogTracker().getRootLocation() only reads data from ZK, so I
think it's acceptable.
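
To make the branch order concrete, here is a minimal, self-contained sketch of the decision being discussed. It is not the attached patch and does not use HBase classes: `RootAssignDecision`, `decide`, and the `Action` values are hypothetical names, `rootVerified` stands in for the result of verifyRootRegionLocation, and `rootLocationInZk` stands in for what getRootLocation() reads from ZK. How the "still pointing at the dead server" case should be handled (retry, force assign, throw) is an assumption here.

```java
import java.util.Objects;

public class RootAssignDecision {
    /** Hypothetical outcomes, for illustration only. */
    public enum Action { ASSIGN_ROOT, RETRY_LATER, SKIP }

    /**
     * @param rootVerified     stand-in for the result of verifyRootRegionLocation
     * @param rootLocationInZk stand-in for the ROOT location read from ZK (may be null)
     * @param deadServer       name of the server being processed as dead
     */
    public static Action decide(boolean rootVerified, String rootLocationInZk, String deadServer) {
        if (!rootVerified) {
            // First if block: the common case, ROOT is not reachable, so assign it.
            return Action.ASSIGN_ROOT;
        }
        if (Objects.equals(rootLocationInZk, deadServer)) {
            // Verification passed, but ZK still points at the dead server:
            // it may have answered just after a long GC and will abort soon.
            return Action.RETRY_LATER;
        }
        // ROOT is already online on some other live server (the unusual case).
        return Action.SKIP;
    }

    public static void main(String[] args) {
        String dead = "dw88.kgb.sqa.cm4,60020,1351671478752";
        System.out.println(decide(false, dead, dead));
        System.out.println(decide(true, dead, dead));
        System.out.println(decide(true, "other-rs,60020,1", dead));
    }
}
```

The second check only compares two strings after a ZK read, which is why its cost is small compared with leaving ROOT unassigned.
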
> -ROOT- may be offline forever after FullGC of  RS
> -------------------------------------------------
>                 Key: HBASE-7504
>                 URL: https://issues.apache.org/jira/browse/HBASE-7504
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.94.3
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>             Fix For: 0.96.0
>         Attachments: 7504-trunk v1.patch, 7504-trunk v2.patch
> 1. A full GC happens on the regionserver hosting ROOT.
> 2. The ZK session times out; the master expires the regionserver and submits it to ServerShutdownHandler.
> 3. The regionserver completes the full GC.
> 4. While ServerShutdownHandler is processing, verifyRootRegionLocation returns true.
> 5. ServerShutdownHandler skips assigning the ROOT region.
> 6. The regionserver aborts itself because it receives YouAreDeadException after a regionserver
> 7. ROOT is now offline and won't be assigned again unless we restart the master.
> Master Log:
> {code}
> 2012-10-31 19:51:39,043 DEBUG org.apache.hadoop.hbase.master.ServerManager: Added=dw88.kgb.sqa.cm4,60020,1351671478752 to dead servers, submitted shutdown handler to be executed, root=true, meta=false
> 2012-10-31 19:51:39,045 INFO org.apache.hadoop.hbase.master.handler.ServerShutdownHandler: Splitting logs for dw88.kgb.sqa.cm4,60020,1351671478752
> 2012-10-31 19:51:50,113 INFO org.apache.hadoop.hbase.master.handler.ServerShutdownHandler: Server dw88.kgb.sqa.cm4,60020,1351671478752 was carrying ROOT. Trying to assign.
> 2012-10-31 19:52:15,939 DEBUG org.apache.hadoop.hbase.master.ServerManager: Server REPORT rejected; currently processing dw88.kgb.sqa.cm4,60020,1351671478752 as dead server
> 2012-10-31 19:52:15,945 INFO org.apache.hadoop.hbase.master.handler.ServerShutdownHandler: Skipping log splitting for dw88.kgb.sqa.cm4,60020,1351671478752
> {code}
> No log of assigning ROOT
> Regionserver log:
> {code}
> 2012-10-31 19:52:15,923 WARN org.apache.hadoop.hbase.util.Sleeper: We slept 229128ms instead of 100000ms, this is likely due to a long garbage collecting pause and it's usually bad, see http://hbase.apache.org/book.html#trouble.rs.runtime.zkexpired
> {code}
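
The Sleeper warning in the regionserver log compares how long the thread actually slept with the period it asked for. A minimal sketch of that kind of check follows, using the two numbers from the log line. `PauseDetector`, `overshootMs`, `looksLikeLongPause`, and the 10-second threshold are hypothetical and chosen for illustration; this is not HBase's Sleeper code.

```java
public class PauseDetector {
    /** How much longer than requested the thread was asleep, in milliseconds. */
    public static long overshootMs(long sleptMs, long periodMs) {
        return sleptMs - periodMs;
    }

    /** True when the overshoot exceeds the given threshold (an assumed tuning value). */
    public static boolean looksLikeLongPause(long sleptMs, long periodMs, long thresholdMs) {
        return overshootMs(sleptMs, periodMs) > thresholdMs;
    }

    public static void main(String[] args) {
        long slept = 229128L;     // from the log line: "We slept 229128ms"
        long period = 100000L;    // from the log line: "instead of 100000ms"
        long threshold = 10000L;  // assumed threshold, not taken from the log

        System.out.println("overshoot ms: " + overshootMs(slept, period));
        System.out.println("long pause: " + looksLikeLongPause(slept, period, threshold));
    }
}
```

With the logged values the overshoot is 129128 ms, that is, more than two minutes during which the regionserver could not heartbeat to ZK.
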
