Date: Tue, 17 Jun 2014 03:18:02 +0000 (UTC)
From: "Liang Xie (JIRA)"
To: issues@hbase.apache.org
Subject: [jira] [Commented] (HBASE-11355) a couple of callQueue related improvements

    [ https://issues.apache.org/jira/browse/HBASE-11355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14033389#comment-14033389 ]

Liang Xie commented on HBASE-11355:
-----------------------------------

I don't have a normal 0.94 patch; it's a preliminary hack. Other hotspots include responseQueuesSizeThrottler, rpcMetrics, scannerReadPoints, etc. The minor change to callQueue is like below (we had already separated the original callQueue into readCallQueue and writeCallQueue):

{code}
- protected BlockingQueue<Call> readCallQueue; // read queued calls
+ protected List<BlockingQueue<Call>> readCallQueues; // read queued calls
...
- boolean success = readCallQueue.offer(call);
+ boolean success = readCallQueues.get(rand.nextInt(readHandlerCount)).offer(call);
...
- this.readCallQueue = new LinkedBlockingQueue<Call>(readQueueLength);
+ this.readHandlerCount = Math.round(readQueueRatio * handlerCount);
+ this.readCallQueues = new LinkedList<BlockingQueue<Call>>();
+ for (int i = 0; i < readHandlerCount; i++) {
+   readCallQueues.add(new LinkedBlockingQueue<Call>(readQueueLength));
+ }
{code}

Every handler thread consumes its own queue, which eliminates the severe contention. If correctness (per-client ordering) or extra resource consumption is a concern, another call queue sharding solution is to introduce a queue-count setting (I just reused the handler count for simplicity, to get a raw perf number) and to always route all requests from the same client to the same queue; a sketch of that idea follows below.
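As an illustration only, here is a minimal, self-contained sketch of that per-client sharding idea; the class name ShardedCallQueue, the shard-count parameter, and hashing on a client key are assumptions made for this example, not the actual 0.94 hack:

{code}
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

/**
 * Hypothetical sketch of a sharded call queue: N independent bounded queues so
 * handler threads do not all contend on one queue head, with all requests from
 * the same client routed to the same shard to preserve per-client ordering.
 */
public class ShardedCallQueue<C> {
  private final List<BlockingQueue<C>> shards;

  public ShardedCallQueue(int numShards, int perQueueLength) {
    this.shards = new ArrayList<BlockingQueue<C>>(numShards);
    for (int i = 0; i < numShards; i++) {
      shards.add(new LinkedBlockingQueue<C>(perQueueLength));
    }
  }

  /** Route by client identity so one client always hits the same shard. */
  public boolean offer(Object clientKey, C call) {
    int idx = (clientKey.hashCode() & Integer.MAX_VALUE) % shards.size();
    return shards.get(idx).offer(call); // non-blocking; false when the shard is full
  }

  /** Each handler thread drains only its own shard, eliminating the shared hotspot. */
  public C take(int handlerIndex) throws InterruptedException {
    return shards.get(handlerIndex % shards.size()).take();
  }
}
{code}

The trade-off of hashing on the client is fairness: one hot client always lands on the same shard, so a tunable shard count (rather than blindly reusing the handler count) leaves room to balance contention against skew.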
> a couple of callQueue related improvements
> ------------------------------------------
>
>                 Key: HBASE-11355
>                 URL: https://issues.apache.org/jira/browse/HBASE-11355
>             Project: HBase
>          Issue Type: Improvement
>          Components: IPC/RPC
>    Affects Versions: 0.99.0, 0.94.20
>            Reporter: Liang Xie
>            Assignee: Matteo Bertozzi
>
> In one of my in-memory, read-only tests (100% get requests), one of the top scalability bottlenecks was the single callQueue. A tentative sharding of this callQueue by the RPC handler count showed a big throughput improvement in a YCSB read-only scenario: the original get() throughput was around 60k qps, and after this change plus other hotspot tuning I got 220k get() qps on the same single region server.
> Another thing we can do is separate the queue into a read call queue and a write call queue. We have done this in our internal branch; it helps during some outages by preventing all-read or all-write traffic from exhausting every handler thread.
> One more thing is changing the current blocking behavior once the callQueue is full. A full callQueue almost always means the backend processing is slow, so failing fast here would be more reasonable if we use HBase as a low-latency processing system; see "callQueue.put(call)" and the sketch below.
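For the last two points, here is a minimal, hypothetical sketch of a read/write queue split with fail-fast rejection when a queue is full; RWCallDispatcher, dispatch(), and CallQueueTooBigException are names invented for this example, not the actual HBase patch:

{code}
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

/**
 * Hypothetical sketch: separate bounded read and write call queues, plus
 * fail-fast rejection instead of blocking when a queue is full.
 */
public class RWCallDispatcher<C> {
  private final BlockingQueue<C> readCallQueue;
  private final BlockingQueue<C> writeCallQueue;

  public RWCallDispatcher(int readQueueLength, int writeQueueLength) {
    this.readCallQueue = new LinkedBlockingQueue<C>(readQueueLength);
    this.writeCallQueue = new LinkedBlockingQueue<C>(writeQueueLength);
  }

  /**
   * offer() is non-blocking: a false return means the queue is full, so the
   * call is rejected right away instead of parking in callQueue.put(call).
   */
  public void dispatch(C call, boolean isRead) throws CallQueueTooBigException {
    BlockingQueue<C> queue = isRead ? readCallQueue : writeCallQueue;
    if (!queue.offer(call)) {
      throw new CallQueueTooBigException("call queue is full, rejecting request");
    }
  }

  /** Read and write handler pools each drain only their own queue. */
  public C takeRead() throws InterruptedException { return readCallQueue.take(); }
  public C takeWrite() throws InterruptedException { return writeCallQueue.take(); }

  /** Exception type invented for the fail-fast path in this sketch. */
  public static class CallQueueTooBigException extends Exception {
    public CallQueueTooBigException(String msg) { super(msg); }
  }
}
{code}

The key difference from the current behavior is offer() versus put(): put() parks the reader thread when the queue is full, while offer() returns false immediately so the server can send an error back and let the client retry or back off.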