hbase-issues mailing list archives

From "Phabricator (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-4742) Split dead server's log in parallel
Date Sat, 05 Nov 2011 22:58:53 GMT

    [ https://issues.apache.org/jira/browse/HBASE-4742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13144863#comment-13144863

Phabricator commented on HBASE-4742:

Liyin has commented on the revision "[jira] [HBASE-4742] Split dead server's log in parallel".

  Thanks, Mikhail, for your quick response.
  We have agreed on most of the points discussed here.

  The remaining discussion focuses on the number of threads the master launches to split dead
servers' logs, which has made me reconsider our motivation for parallel distributed log
splitting.

  Our basic motivation is that log splitting should not block the region server process queue.
Also, distributed log splitting itself is designed to split logs for a large number of region
servers. So we could batch all the dead region servers together into a queue and launch a single
thread to do the distributed log splitting, instead of running distributed log splitting for each
dead server in a separate thread.
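
  The batching idea above could be sketched roughly as follows. This is only an illustration,
not code from the patch: the class BatchedLogSplitSketch, its methods, and the server names are
hypothetical, and the actual call into distributed log splitting is replaced by a print statement.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: none of these names come from the HBase code base.
class BatchedLogSplitSketch {
    // Dead servers wait here until the single splitter thread picks them up.
    private final BlockingQueue<String> deadServers = new LinkedBlockingQueue<>();

    // Called when a server is detected as dead; returns immediately.
    void enqueue(String serverName) {
        deadServers.add(serverName);
    }

    // Non-blocking: removes and returns every server queued so far (possibly none).
    List<String> drainBatch() {
        List<String> batch = new ArrayList<>();
        deadServers.drainTo(batch);
        return batch;
    }

    public static void main(String[] args) throws InterruptedException {
        BatchedLogSplitSketch sketch = new BatchedLogSplitSketch();
        sketch.enqueue("rs1");
        sketch.enqueue("rs2");
        sketch.enqueue("rs3");

        // A single thread handles the whole batch instead of one thread per dead server.
        ExecutorService splitter = Executors.newSingleThreadExecutor();
        splitter.execute(() ->
            // Placeholder for submitting the batch to distributed log splitting.
            System.out.println("Splitting logs for batch: " + sketch.drainBatch()));
        splitter.shutdown();
        splitter.awaitTermination(10, TimeUnit.SECONDS);
    }
}
```

  With this shape, the thread that detects a dead server only enqueues it, so it is not blocked
while the logs are being split.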

  src/main/java/org/apache/hadoop/hbase/master/ProcessServerShutdown.java:333 Thank you for
clarifying your concern :)

  For each dead server, the master receives the znode expiration event for that server
only once.
  So the master would never have 2 threads splitting the same dead region server's log at the same time.
  src/main/java/org/apache/hadoop/hbase/master/ProcessServerShutdown.java:337 Fine. I would
change it to "Succeeded in splitting".
  src/main/java/org/apache/hadoop/hbase/master/ProcessServerShutdown.java:347-348 Thanks for
  src/test/java/org/apache/hadoop/hbase/master/TestMultiRegionServerShutDown.java:135 Actually,
I don't have to catch the exceptions here explicitly.
  It won't affect the unit test results.
  Thanks for the discussion.
  src/main/java/org/apache/hadoop/hbase/master/ProcessServerShutdown.java:312 1) We use distributed
log splitting for the dead region server as well.

  2) Even if we use a thread pool for execution, I would bound the maximum number of threads by the
number of region servers.
  What do you think the maximum number of threads for the executor thread pool should be?

  Also, in your example, 500 region servers go down. The master would launch
500 threads to run distributed log splitting in parallel. That shouldn't choke the master too much,
since the split work itself is done on the region server side.
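
  A bounded executor along these lines could look roughly like the sketch below. It is an
illustration under stated assumptions, not the patch itself: the class BoundedSplitPoolSketch,
the hardLimit parameter, and the server names are hypothetical, and each task only counts itself
instead of calling into distributed log splitting.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: none of these names come from the HBase code base.
class BoundedSplitPoolSketch {
    // Bounds the pool by the number of region servers and by an assumed hard limit,
    // and never returns less than 1 so the pool can always be created.
    static int poolSize(int numRegionServers, int hardLimit) {
        return Math.max(1, Math.min(numRegionServers, hardLimit));
    }

    // Runs one split task per dead server on at most maxThreads threads and
    // returns how many tasks completed.
    static int splitAll(List<String> deadServers, int maxThreads) {
        ExecutorService pool = Executors.newFixedThreadPool(maxThreads);
        AtomicInteger completed = new AtomicInteger();
        for (String server : deadServers) {
            // Placeholder for starting distributed log splitting for this server.
            pool.execute(() -> completed.incrementAndGet());
        }
        pool.shutdown();
        try {
            pool.awaitTermination(1, TimeUnit.MINUTES);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return completed.get();
    }

    public static void main(String[] args) {
        int threads = poolSize(500, 32); // 500 servers, assumed limit of 32 threads
        int done = splitAll(Arrays.asList("rs1", "rs2", "rs3"), threads);
        System.out.println("threads=" + threads + ", tasks completed=" + done);
    }
}
```

  Tasks beyond the thread bound wait in the executor's queue, so a mass failure does not create
one master thread per dead server.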

  3) But this discussion also leads us to another good point. Suppose a large number of
region servers die for some reason. Should we batch these dead region servers and split them
together, instead of splitting their logs in parallel?

  Any ideas, Mikhail and Prakash?


> Split dead server's log in parallel
> -----------------------------------
>                 Key: HBASE-4742
>                 URL: https://issues.apache.org/jira/browse/HBASE-4742
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Liyin Tang
>            Assignee: Liyin Tang
>         Attachments: D237.1.patch, D237.2.patch, D237.3.patch, D237.4.patch
> When one region server goes down, the master will shut down the region server and split
> its log.
> However, splitting the log is a blocking call and it takes some time.
> If more than one region server goes down, the master will split their logs one by one, which
> is not efficient.
> Since we have distributed log splitting, we could split the logs from the dead servers
> in parallel.


