hbase-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "ryan rawson (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HBASE-3382) Make HBase client work better under concurrent clients
Date Fri, 28 Jan 2011 02:06:43 GMT

    [ https://issues.apache.org/jira/browse/HBASE-3382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12987883#action_12987883

ryan rawson commented on HBASE-3382:

Doing more testing with HBASE-2939 I ran some tests using YCSB, it was very confusing at first
because I wasnt getting the performance boost I was hoping for.

So with a configuration that does scan only load, I am seeing a base line performance of about
50-60ms for 1 thread.  Upping this to 10 threads the performance gets much worse, up to 400ms
or so.  Doing some custom tracing in our client code it revealed that the source of the slowness
was waiting for other responses to be streamed to the client.  That is thread1 asks for a
big fat reply, but it takes 100ms to read off the wire, thread2 which did a little-itty-bitty
request (close scanner for example), must wait or that 100ms thus being unnecessarily slowed

So I tried this patch with ThreadLocal, and while I see improvement I am not seeing enough
improvement, with lines like this:
[SCAN], AverageLatency(ms), 363.44
[SCAN], AverageLatency(ms), 448.31
[SCAN], AverageLatency(ms), 426.53

The data size is small enough and fully cached, and I added logging that verifies that we
are CREATING multiple connections (1 per thread it seems).

But the "call_wait" profile time (the time spent between sending the request and when the
connection code starts to receive our response) is pretty high, in previous tests I saw something
like this;
cat new_1thr.txt | perl -ne 'if(/call_wait\) split: (\d+?) /) { print $1/1000000, "\n";}'
| perl -ne '$sum += $_; $count++; END {print $sum/$count, "\n"}'
cat new_10thr.txt | perl -ne 'if(/call_wait\) split: (\d+?) /) { print $1/1000000, "\n";}'
| perl -ne '$sum += $_; $count++; END {print $sum/$count, "\n"}'

As you can see going from an average wait time of 3ms to an average wait time of 56ms is pretty

But using the work to add ThreadLocal connections I did not get as much boost as I hoped for,
instead I saw call_wait time like:

cat 10_thr.txt| perl -ne 'if(/call_wait\) split: (\d+?) /) { print $1/1000000, "\n";}' | perl
-ne '$sum += $_; $count++; END {print $sum/$count, "\n"}' 

while 19ms < 56ms that is still a lot of ms of wait.

At this point we might be also seeing server side slowness.  I think the next step is to extend
the NanoProfiler code into the server side so we can have extensive tracing between both the
server and client. 

This result suggests we are seeing server-side slowness under concurrency, which is reasonable,
but I wasn't seeing in previous profiler runs, but a lot of performance code has been committed
in the mean time.

> Make HBase client work better under concurrent clients
> ------------------------------------------------------
>                 Key: HBASE-3382
>                 URL: https://issues.apache.org/jira/browse/HBASE-3382
>             Project: HBase
>          Issue Type: Bug
>          Components: performance
>            Reporter: ryan rawson
>            Assignee: ryan rawson
>         Attachments: HBASE-3382-nio.txt, HBASE-3382.txt
> The HBase client uses 1 socket per regionserver for communication.  This is good for
socket control but potentially bad for latency.  How bad?  I did a simple YCSB test that had
this config:
>  readproportion=0
>  updateproportion=0
>  scanproportion=1
>  insertproportion=0
>  fieldlength=10
>  fieldcount=100
>  requestdistribution=zipfian
>  scanlength=300
>  scanlengthdistribution=zipfian
> I ran this with 1 and 10 threads.  The summary is as so:
> 1 thread:
> [SCAN]	 Operations	1000
> [SCAN]	 AverageLatency(ms)	35.871
> 10 threads:
> [SCAN]	 Operations	1000
> [SCAN]	 AverageLatency(ms)	228.576
> We are taking a 6.5x latency hit in our client.  But why?
> First step was to move the deserialization out of the Connection thread, this seemed
like it could have a big win, an analog change on the server side got a 20% performance improvement
(already commited as HBASE-2941).  I did this and got about a 20% improvement again, with
that 228ms number going to about 190 ms.  
> So I then wrote a high performance nanosecond resolution tracing utility.  Clients can
flag an API call, and we get tracing and numbers through the client pipeline.  What I found
is that a lot of time is being spent in receiving the response from the network.  The code
block is like so:
>         NanoProfiler.split(id, "receiveResponse");
>         if (LOG.isDebugEnabled())
>           LOG.debug(getName() + " got value #" + id);
>         Call call = calls.get(id);
>         size -= 4;  // 4 byte off for id because we already read it.
>         ByteBuffer buf = ByteBuffer.allocate(size);
>         IOUtils.readFully(in, buf.array(), buf.arrayOffset(), size);
>         buf.limit(size);
>         buf.rewind();
>         NanoProfiler.split(id, "setResponse", "Data size: " + size);
> I came up with some numbers:
> 11726 (receiveResponse) split: 64991689 overall: 133562895 Data size: 4288937
> 12163 (receiveResponse) split: 32743954 overall: 103787420 Data size: 1606273
> 12561 (receiveResponse) split: 3517940 overall: 83346740 Data size: 4
> 12136 (receiveResponse) split: 64448701 overall: 203872573 Data size: 3570569
> The first number is the internal counter for keeping requests unique from HTable on down.
 The numbers are in ns, the data size is in bytes.
> Doing some simple calculations, we see for the first line we were reading at about 31
MB/sec.  The second one is even worse.  Other calls are like:
> 26 (receiveResponse) split: 7985400 overall: 21546226 Data size: 850429
> which is 107 MB/sec which is pretty close to the maximum of gige.  In my set up, the
ycsb client ran on the master node and HAD to use network to talk to regionservers.
> Even at full line rate, we could still see unacceptable hold ups of unrelated calls that
just happen to need to talk to the same regionserver.
> This issue is about these findings, what to do, how to improve. 

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.

View raw message