hbase-issues mailing list archives

From "Jeffrey Zhong (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work
Date Mon, 08 Apr 2013 16:23:15 GMT

    [ https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13625509#comment-13625509 ]

Jeffrey Zhong commented on HBASE-7824:
--------------------------------------

[~zjushch] Thanks for the detailed review!

For your first two comments, I'll make corresponding modifications.

{quote}
Should use the flag 'shouldSplitMetaSeparately' like other log-split?
{quote}
A good question. Since splitLog is a sync call, the following two calls 
{code}
      fileSystemManager.splitMetaLog(sn);
      fileSystemManager.splitLog(sn);
{code}
are logically equivalent to one splitAllLogs call, though splitAllLogs has a slight performance
advantage because it submits all the log splitting work in one go. 'shouldSplitMetaSeparately'
is significant in MetaSSH and SSH, while in other places it makes no logical difference.
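
To illustrate the equivalence, here is a minimal sketch. LogSplitSketch, submitAndWait and the log file names are hypothetical stand-ins, not the actual MasterFileSystem code; the only assumption carried over from above is that each split call is synchronous.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the master's file system manager; not HBase code.
class LogSplitSketch {
    // Records each batch of log files submitted for splitting, in order.
    final List<List<String>> submittedBatches = new ArrayList<>();
    private final List<String> metaLogs;
    private final List<String> userLogs;

    LogSplitSketch(List<String> metaLogs, List<String> userLogs) {
        this.metaLogs = metaLogs;
        this.userLogs = userLogs;
    }

    // Synchronous in this sketch: returns only once the batch counts as split.
    private void submitAndWait(List<String> logs) {
        submittedBatches.add(new ArrayList<>(logs));
    }

    void splitMetaLog() { submitAndWait(metaLogs); }

    void splitLog() { submitAndWait(userLogs); }

    // One submission covering both the meta logs and the remaining logs.
    void splitAllLogs() {
        List<String> all = new ArrayList<>(metaLogs);
        all.addAll(userLogs);
        submitAndWait(all);
    }

    // Flattens the batches into the overall list of files that were split.
    List<String> allSplitFiles() {
        List<String> out = new ArrayList<>();
        for (List<String> batch : submittedBatches) {
            out.addAll(batch);
        }
        return out;
    }
}
```

Calling splitMetaLog() and then splitLog() leaves the same files split as a single splitAllLogs() call; the only difference in the sketch is two submissions instead of one.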

That being said, in some places I could separate them to speed up master start up a little
more. As you know, both features are new, so I chose the conservative way to begin with
and kept them less dependent on each other.

{quote}
in AssignmentManager#processDeadServersAndRegionsInTransition, how about if we mark it as
a clean cluster startup?
if we mark it as a failover, is there any conflict between SSH and AssignmentManager#processDeadServersAndRecoverLostRegions
{quote}
If there is log splitting work left over, the new master start up isn't a clean one.
The reason to make it a failover is to let SSH (a single place) handle the dead servers, including
the log splitting we skipped at the very beginning. If we treated the start up as a clean one,
we could lose data because log splitting would not be done for some regions.
During AssignmentManager#processDeadServersAndRecoverLostRegions, the existing implementation
intentionally skips all known dead servers and leaves them to SSH, so there is no conflict.
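
The decision can be sketched roughly as follows. StartupSketch and both of its methods are hypothetical names for illustration and do not reflect the real AssignmentManager code; the sketch assumes servers are identified by plain strings.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical illustration of the start up decision; not HBase code.
class StartupSketch {
    // Start up counts as a failover if any dead server still has unsplit logs.
    static boolean isFailover(Set<String> deadServersWithPendingLogs) {
        return !deadServersWithPendingLogs.isEmpty();
    }

    // The recovery pass handles only servers that are not already known dead;
    // the known dead ones are left to SSH, so the two never overlap.
    static List<String> serversForRecoveryPass(List<String> candidates,
                                               Set<String> knownDead) {
        List<String> out = new ArrayList<>();
        for (String server : candidates) {
            if (!knownDead.contains(server)) {
                out.add(server);
            }
        }
        return out;
    }
}
```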

{quote}
From DeadServer#cleanPreviousInstance, a deadserver will be removed if the same HostnamePort
servername is online. 
{quote}
Good concern. The key point is that DeadServer#cleanPreviousInstance is only called after
master initialization. By then, we don't rely on DeadServer much as far as master start up
is concerned. After the master is initialized, DeadServer is basically used to show previously
dead servers in the UI, to handle "YouAreDeadException", and to prevent duplicate expireServer
calls. As you already know, once an SSH for a dead server is submitted, it continues until it's
done regardless of whether the server is in DeadServer or not. This can already happen today
when an RS crashes several times in a row while its previous instances are still in the SSH
pipeline, whether or not DeadServer tracks them. In short, DeadServer#cleanPreviousInstance
doesn't have much impact.

                
> Improve master start up time when there is log splitting work
> -------------------------------------------------------------
>
>                 Key: HBASE-7824
>                 URL: https://issues.apache.org/jira/browse/HBASE-7824
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>            Reporter: Jeffrey Zhong
>            Assignee: Jeffrey Zhong
>             Fix For: 0.94.8
>
>         Attachments: hbase-7824.patch, hbase-7824_v2.patch, hbase-7824_v3.patch, hbase-7824-v7.patch, hbase-7824-v8.patch, hbase-7824-v9.patch
>
>
> When there is log split work going on, master start up waits till all log split work completes even though the log split has nothing to do with meta region servers.
> It's a bad behavior considering a master node can run when log split is happening while its start up is blocking by log split work.
> Since master is kind of single point of failure, we should start it ASAP.

