hbase-issues mailing list archives

From "Appy (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-19290) Reduce zk request when doing split log
Date Tue, 21 Nov 2017 01:09:00 GMT

    [ https://issues.apache.org/jira/browse/HBASE-19290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16260121#comment-16260121 ]

Appy commented on HBASE-19290:

bq. int sleepTime = RandomUtils.nextInt(0, 100) + 500;
Why randomize? Could this be a constant?

bq. if (taskGrabbed == 0 && !shouldStop) {
So if there are 2 available splitters and only one task was grabbed, we don't stop here and keep hammering
Probably change taskGrabbed 

        int idx = (i + offset) % paths.size();
        // don't call ZKSplitLog.getNodeName() because that will lead to
        // double encoding of the path name
        taskGrabbed += grabTask(ZNodePaths.joinZNode(watcher.znodePaths.splitLogZNode,
            paths.get(idx))) ? 1 : 0;
Can this be done in the "if" condition itself?
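
For illustration, a minimal sketch of what counting inside the "if" condition could look like. This is not the patch's code: the class name, the grabTasks helper and the Predicate standing in for grabTask are all hypothetical, and no ZooKeeper calls are made.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical sketch: only the loop shape mirrors the quoted snippet.
final class GrabInIfSketch {
  // grabTask is modeled as a Predicate so the example runs without ZooKeeper.
  static int grabTasks(List<String> paths, int offset, Predicate<String> grabTask) {
    int taskGrabbed = 0;
    for (int i = 0; i < paths.size(); i++) {
      // Start at a rotating offset, as in the quoted code.
      int idx = (i + offset) % paths.size();
      // Count directly in the condition instead of "+= cond ? 1 : 0".
      if (grabTask.test(paths.get(idx))) {
        taskGrabbed++;
      }
    }
    return taskGrabbed;
  }

  public static void main(String[] args) {
    List<String> paths = Arrays.asList("wal-a", "wal-b", "wal-c");
    // Pretend every task except "wal-b" can be grabbed.
    System.out.println(grabTasks(paths, 1, p -> !p.equals("wal-b")));
  }
}
```

The behavior is the same as the ternary form; the "if" just makes the increment easier to read.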

bq. taskReadySeq.wait may not execute because it has condition.
That while condition is just to handle spurious wakeups.  See Object#wait. You can definitely
remove the second sleep (unless there's a concrete reason not to).
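
To make the spurious-wakeup point concrete, here is a minimal sketch of the guarded-wait idiom described in the Object#wait javadoc. It is an assumption-laden illustration, not the HBase code: the class, the lock object and the method names are hypothetical, and taskReadySeq is modeled as a plain int guarded by a lock.

```java
// Hypothetical sketch of the guarded-wait idiom; not the actual worker code.
final class WaitLoopSketch {
  private final Object lock = new Object();
  private int taskReadySeq = 0; // bumped whenever new work may be available

  void signalTaskReady() {
    synchronized (lock) {
      taskReadySeq++;
      lock.notifyAll();
    }
  }

  // Blocks until taskReadySeq differs from seenSeq, then returns the new value.
  int awaitChange(int seenSeq) {
    synchronized (lock) {
      // The loop re-checks the condition after every wakeup, so a spurious
      // wakeup simply goes back to waiting. wait() still runs whenever the
      // condition holds on entry.
      while (taskReadySeq == seenSeq) {
        try {
          lock.wait();
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt(); // preserve the interrupt and stop waiting
          break;
        }
      }
      return taskReadySeq;
    }
  }

  public static void main(String[] args) {
    WaitLoopSketch s = new WaitLoopSketch();
    new Thread(s::signalTaskReady).start();
    System.out.println(s.awaitChange(0));
  }
}
```

Because the loop already blocks until the sequence changes, an extra sleep after it adds latency without reducing requests.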

> Reduce zk request when doing split log
> --------------------------------------
>                 Key: HBASE-19290
>                 URL: https://issues.apache.org/jira/browse/HBASE-19290
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: binlijin
>            Assignee: binlijin
>         Attachments: HBASE-19290.master.001.patch, HBASE-19290.master.002.patch
> We observe that once the cluster has 1000+ nodes and hundreds of nodes abort and start splitting logs, the split is very slow, and we find that the regionserver and master wait on the ZooKeeper response, so we need to reduce ZooKeeper requests and pressure for big clusters.
> (1) Reduce requests to rsZNode. Every time calculateAvailableSplitters is called, it gets rsZNode's children from ZooKeeper; when the cluster is huge, this is heavy. This patch reduces these requests.

> (2) When the regionserver has the maximum number of split tasks running, it may still try to grab tasks and issue ZooKeeper requests; we should sleep and wait until we can grab tasks again.

This message was sent by Atlassian JIRA
