hbase-issues mailing list archives

From "chunhui shen (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-4988) MetaServer crash cause all splitting regionserver abort
Date Fri, 09 Dec 2011 06:59:42 GMT

    [ https://issues.apache.org/jira/browse/HBASE-4988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13165898#comment-13165898 ]

chunhui shen commented on HBASE-4988:
-------------------------------------

@stack

This case happened in our test environment, which runs version 0.90.

If the server hosting .META. is killed and not restarted immediately,
the MetaEditor.offlineParentInMeta step fails and throws an exception.
Because JournalEntry.PONR is already in the journal, the rollback aborts the server.

In trunk, MetaEditor.offlineParentInMeta retries, but the parent region stays out of
service for a long time while it does, which I think is unacceptable. The retries can also
fail, and then the server aborts anyway.

{code}metaServer.put(CatalogTracker.META_REGION_NAME, put);{code}
If the .META. server dies between the verification and the put above, the region server has
to abort, because we can't tell whether .META. was updated successfully. However, if we can
detect beforehand that the .META. server is unavailable, we needn't abort the server that is
doing the split.
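
To illustrate the idea, here is a rough, self-contained sketch in plain Java. It is not HBase code and not the attached patch: SplitSketch, MetaServer, FakeMeta, Outcome, runSplit and the journal entries other than PONR are hypothetical names made up for this example. It only models the ordering: if the availability check fails before PONR is journaled, the split can roll back normally; once PONR is journaled, any failure still ends in an abort.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the split journal; none of these types are HBase classes.
class SplitSketch {
    // Only PONR comes from the discussion above; the other entries are placeholders.
    enum JournalEntry { PREPARED, PARENT_CLOSED, PONR }

    enum Outcome { COMPLETED, ROLLED_BACK, ABORTED }

    /** Hypothetical view of the server hosting .META. */
    interface MetaServer {
        boolean isAvailable();
        void offlineParent() throws IOException;
    }

    /** Test double: controls whether .META. is up at check time and at put time. */
    static class FakeMeta implements MetaServer {
        private final boolean aliveAtCheck;
        private final boolean aliveAtPut;

        FakeMeta(boolean aliveAtCheck, boolean aliveAtPut) {
            this.aliveAtCheck = aliveAtCheck;
            this.aliveAtPut = aliveAtPut;
        }

        @Override
        public boolean isAvailable() {
            return aliveAtCheck;
        }

        @Override
        public void offlineParent() throws IOException {
            if (!aliveAtPut) {
                throw new IOException(".META. server unreachable during put");
            }
        }
    }

    /**
     * Runs the modeled split. With checkFirst set, the availability check
     * happens before PONR is journaled, so a dead .META. server leads to a
     * normal rollback instead of an abort.
     */
    static Outcome runSplit(MetaServer meta, boolean checkFirst) {
        List<JournalEntry> journal = new ArrayList<>();
        try {
            journal.add(JournalEntry.PREPARED);
            journal.add(JournalEntry.PARENT_CLOSED);
            if (checkFirst && !meta.isAvailable()) {
                throw new IOException(".META. server unavailable; failing before PONR");
            }
            journal.add(JournalEntry.PONR);
            meta.offlineParent();
            return Outcome.COMPLETED;
        } catch (IOException e) {
            // Past the point of no return we cannot know whether .META. was updated.
            return journal.contains(JournalEntry.PONR) ? Outcome.ABORTED : Outcome.ROLLED_BACK;
        }
    }

    public static void main(String[] args) {
        System.out.println("no check, meta down:      " + runSplit(new FakeMeta(false, false), false));
        System.out.println("check first, meta down:   " + runSplit(new FakeMeta(false, false), true));
        System.out.println("check ok, dies before put: " + runSplit(new FakeMeta(true, false), true));
        System.out.println("meta healthy:             " + runSplit(new FakeMeta(true, true), true));
    }
}
```

The third case in main is the race described above: the check passes, the .META. server dies before the put, and the abort is still unavoidable. The check only removes the aborts for the common case where the server is already known to be down.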
                
> MetaServer crash cause all splitting regionserver abort
> -------------------------------------------------------
>
>                 Key: HBASE-4988
>                 URL: https://issues.apache.org/jira/browse/HBASE-4988
>             Project: HBase
>          Issue Type: Bug
>            Reporter: chunhui shen
>         Attachments: hbase-4988v1.patch
>
>
> If the metaserver crashes now,
> all the splitting regionservers will abort themselves,
> because of this code:
> {code}
> this.journal.add(JournalEntry.PONR);
> MetaEditor.offlineParentInMeta(server.getCatalogTracker(),
>             this.parent.getRegionInfo(), a.getRegionInfo(), b.getRegionInfo());
> {code}
> If the journal contains JournalEntry.PONR, the split's rollback aborts the server.
> This is terrible in a heavy-put environment when the metaserver crashes.


        
