hbase-issues mailing list archives

From "stack (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-9736) Allow more than one log splitter per RS
Date Sat, 23 Nov 2013 01:20:36 GMT

    [ https://issues.apache.org/jira/browse/HBASE-9736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13830485#comment-13830485 ]

stack commented on HBASE-9736:
------------------------------

Hmm.. My comments got lost... Let me redo.  First, I'm trying it.  Will report back.

+        Random r = new Random();
+        int sleepTime = r.nextInt(500) + 500;
+        Thread.sleep(sleepTime);
+      } catch (InterruptedException e) {
+        LOG.warn("Interrupted while yielding for other region servers", e);
+        Thread.currentThread().interrupt();

Random is expensive to create.  Keep an instance around?  Seed it too, or else all the Randoms
march in lock step?

FYI, there is a sleep in Threads that does the above if you want to use that instead.
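
A minimal sketch of the shared-instance suggestion above, not the code from the patch. The class, method, and constant names are hypothetical, and it uses plain Thread.sleep rather than the HBase Threads helper so that it stays self-contained. On seeding: the Javadoc for the no-argument Random constructor says it picks a seed very likely to be distinct from any other invocation, so an explicit seed may not be needed.

```java
import java.util.Random;

// Hypothetical helper; names are illustrative and do not come from the patch.
final class SplitterBackoff {
    // One shared instance instead of a new Random on every call.
    // java.util.Random is thread-safe, so a static field can be shared.
    private static final Random RANDOM = new Random();

    private static final int BASE_SLEEP_MS = 500;
    private static final int JITTER_MS = 500;

    // Returns a sleep time in the range [500, 1000) milliseconds,
    // matching r.nextInt(500) + 500 in the quoted lines.
    static int nextSleepMillis() {
        return BASE_SLEEP_MS + RANDOM.nextInt(JITTER_MS);
    }

    // Sleeps for a jittered interval. Returns false if interrupted,
    // after restoring the thread's interrupt flag.
    static boolean yieldToOtherServers() {
        try {
            Thread.sleep(nextSleepMillis());
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```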

Define a constant for this:

+    return (-1);

... since you repeat it in a few places?
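
Something like the sketch below is what I mean. The constant name, the class, and the meaning given to -1 here are assumptions for illustration only; the patch may use -1 for something else.

```java
// Hypothetical example; what -1 signals in the real patch is an assumption.
final class SplitTaskCalc {
    // Named once, reused at every return site instead of a bare (-1).
    static final int NOT_AVAILABLE = -1;

    // Returns the number of free splitter slots, or NOT_AVAILABLE
    // when splitting is disabled or all slots are busy.
    static int freeSlots(int maxSplitters, int running) {
        if (maxSplitters <= 0) {
            return NOT_AVAILABLE;
        }
        if (running >= maxSplitters) {
            return NOT_AVAILABLE;
        }
        return maxSplitters - running;
    }
}
```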

Patch looks great.

> Allow more than one log splitter per RS
> ---------------------------------------
>
>                 Key: HBASE-9736
>                 URL: https://issues.apache.org/jira/browse/HBASE-9736
>             Project: HBase
>          Issue Type: Improvement
>          Components: MTTR
>            Reporter: stack
>            Assignee: Jeffrey Zhong
>            Priority: Critical
>         Attachments: hbase-9736.patch
>
>
> IIRC, this is an idea that came from the lads at Xiaomi.
> I have a small cluster of 6 RSs and one went down.  It had a few WALs.  I see this in logs:
> 2013-10-09 05:47:27,890 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 25 unassigned = 21
> WAL splitting is held up for want of slots out on the cluster to split WALs.
> We need to be careful we don't overwhelm the foreground regionservers but more splitters should help get all back online faster.



--
This message was sent by Atlassian JIRA
(v6.1#6144)
