Message-ID: <12288755.98881295588143671.JavaMail.jira@thor>
In-Reply-To: <3135185.253581292970601222.JavaMail.jira@thor>
Date: Fri, 21 Jan 2011 00:35:43 -0500 (EST)
From: "ryan rawson (JIRA)"
To: issues@hbase.apache.org
Subject: [jira] Commented: (HBASE-3382) Make HBase client work better under concurrent clients

    [ https://issues.apache.org/jira/browse/HBASE-3382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12984576#action_12984576 ]

ryan rawson commented on HBASE-3382:
------------------------------------

So it's pretty clear that to improve performance under load, we should be using multiple sockets. Here is a rough block diagram of how the client works:

HTable -- calls --> HConnectionImplementation -- calls --> HBaseRPC.waitForProxy()

In waitForProxy, an HBaseClient object is grabbed and associated with the proxy via the embedded Invoker object. Let's call this 'client' (as the code does):

HCI -- calls --> ProxyObject (anonymous) --> client.call()

Now a few notes:
- The HCI will reuse the same proxy object a few times, if not a LOT of times.
- The proxy object holds 1 reference to 1 HBaseClient object.
- The HBaseClient object has 1 socket/connection per regionserver. Multiple threads interleave their requests and replies (in any order; out-of-order replies are fine) on that single socket.

So there are a few different approaches. In HBASE-2939, a patch lets every new call grab a different connection off a pool, with different pool types; the disadvantage is that each extra socket to a regionserver needs its own thread. Another solution is to change the Connection object and its thread to do async I/O over multiple sockets, keeping 1 thread per regionserver but with multiple sockets under it. A third solution is to use an NIO framework for this instead of doing raw NIO programming.
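For illustration only, here is a minimal sketch of the pool idea, using hypothetical class names rather than the actual HBASE-2939 patch: each regionserver gets a small, fixed-size pool of sockets, and callers are spread across them round-robin. In the current blocking design every extra socket would still need its own reader thread, which is the drawback noted above.

    import java.io.Closeable;
    import java.io.IOException;
    import java.net.InetSocketAddress;
    import java.net.Socket;
    import java.util.List;
    import java.util.concurrent.CopyOnWriteArrayList;
    import java.util.concurrent.atomic.AtomicInteger;

    /**
     * Hypothetical sketch of a per-regionserver connection pool: concurrent
     * callers are spread round-robin over a handful of sockets instead of
     * all multiplexing onto one. Not the actual HBaseClient or HBASE-2939 code.
     */
    class RegionServerConnectionPool implements Closeable {
      private final InetSocketAddress regionServer;
      private final int poolSize;
      private final List<Socket> sockets = new CopyOnWriteArrayList<Socket>();
      private final AtomicInteger next = new AtomicInteger();

      RegionServerConnectionPool(InetSocketAddress regionServer, int poolSize) {
        this.regionServer = regionServer;
        this.poolSize = poolSize;
      }

      /** Lazily opens up to poolSize sockets, then hands them out round-robin. */
      synchronized Socket getConnection() throws IOException {
        if (sockets.size() < poolSize) {
          Socket s = new Socket();
          s.setTcpNoDelay(true);   // RPC-style traffic: don't delay small writes
          s.connect(regionServer);
          sockets.add(s);
          return s;
        }
        // Mask to keep the index non-negative even after counter overflow.
        return sockets.get((next.getAndIncrement() & Integer.MAX_VALUE) % poolSize);
      }

      @Override
      public void close() throws IOException {
        for (Socket s : sockets) {
          s.close();
        }
      }
    }

The client would keep one such pool per regionserver address and size it from a client-side configuration knob; both of those details are assumptions here, not necessarily what the attached patches do.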
> Make HBase client work better under concurrent clients
> ------------------------------------------------------
>
>                 Key: HBASE-3382
>                 URL: https://issues.apache.org/jira/browse/HBASE-3382
>             Project: HBase
>          Issue Type: Bug
>          Components: performance
>            Reporter: ryan rawson
>            Assignee: ryan rawson
>         Attachments: HBASE-3382-nio.txt, HBASE-3382.txt
>
>
> The HBase client uses 1 socket per regionserver for communication. This is good for socket control but potentially bad for latency. How bad? I did a simple YCSB test that had this config:
>
> readproportion=0
> updateproportion=0
> scanproportion=1
> insertproportion=0
> fieldlength=10
> fieldcount=100
> requestdistribution=zipfian
> scanlength=300
> scanlengthdistribution=zipfian
>
> I ran this with 1 and 10 threads. The summary is as follows:
>
> 1 thread:
> [SCAN] Operations 1000
> [SCAN] AverageLatency(ms) 35.871
>
> 10 threads:
> [SCAN] Operations 1000
> [SCAN] AverageLatency(ms) 228.576
>
> We are taking a 6.5x latency hit in our client. But why?
>
> The first step was to move the deserialization out of the Connection thread. This seemed like it could be a big win, since an analogous change on the server side got a 20% performance improvement (already committed as HBASE-2941). I did this and got about a 20% improvement again, with that 228 ms number going to about 190 ms.
>
> So I then wrote a high-performance, nanosecond-resolution tracing utility. Clients can flag an API call, and we get tracing and numbers through the client pipeline. What I found is that a lot of time is being spent in receiving the response from the network. The code block is like so:
>
> NanoProfiler.split(id, "receiveResponse");
> if (LOG.isDebugEnabled())
>   LOG.debug(getName() + " got value #" + id);
> Call call = calls.get(id);
> size -= 4; // 4 byte off for id because we already read it.
> ByteBuffer buf = ByteBuffer.allocate(size);
> IOUtils.readFully(in, buf.array(), buf.arrayOffset(), size);
> buf.limit(size);
> buf.rewind();
> NanoProfiler.split(id, "setResponse", "Data size: " + size);
>
> I came up with some numbers:
>
> 11726 (receiveResponse) split: 64991689 overall: 133562895 Data size: 4288937
> 12163 (receiveResponse) split: 32743954 overall: 103787420 Data size: 1606273
> 12561 (receiveResponse) split: 3517940 overall: 83346740 Data size: 4
> 12136 (receiveResponse) split: 64448701 overall: 203872573 Data size: 3570569
>
> The first number is the internal counter for keeping requests unique from HTable on down. The times are in ns, the data size is in bytes.
>
> Doing some simple calculations, we see for the first line we were reading at about 31 MB/sec. The second one is even worse. Other calls are like:
>
> 26 (receiveResponse) split: 7985400 overall: 21546226 Data size: 850429
>
> which is 107 MB/sec, pretty close to the maximum of gigE. In my setup, the YCSB client ran on the master node and HAD to use the network to talk to regionservers.
>
> Even at full line rate, we could still see unacceptable hold-ups of unrelated calls that just happen to need to talk to the same regionserver.
>
> This issue is about these findings, what to do about them, and how to improve.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
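As a back-of-the-envelope footnote to the quoted description, the MB/sec figures can be reproduced by dividing each profiler line's payload size by its reported time. The sketch below is editorial, not code from the attached patches, and the choice of the overall vs. split interval for each call is my assumption.

    /**
     * Back-of-the-envelope check of the throughput figures quoted above: each
     * profiler line reports times in nanoseconds and a payload size in bytes,
     * so dividing size by time gives the effective read rate off the socket.
     */
    public class ReceiveThroughput {

      /** Megabytes per second (decimal MB) for a read of the given size and duration. */
      static double mbPerSec(long bytes, long nanos) {
        return (bytes / 1e6) / (nanos / 1e9);
      }

      public static void main(String[] args) {
        // Call 11726, using the overall time: 4288937 bytes / 133562895 ns
        // -> roughly 32 MB/sec, in line with the "about 31 MB/sec" figure above.
        System.out.printf("call 11726: %.1f MB/sec%n", mbPerSec(4288937L, 133562895L));

        // Call 26, using the split time: 850429 bytes / 7985400 ns
        // -> roughly 107 MB/sec, close to gigabit-Ethernet line rate.
        System.out.printf("call 26:    %.1f MB/sec%n", mbPerSec(850429L, 7985400L));
      }
    }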