Mailing-List: contact issues-help@hbase.apache.org; run by ezmlm
Precedence: bulk
Date: Fri, 5 Feb 2016 08:41:40 +0000 (UTC)
From: "stack (JIRA)" <jira@apache.org>
To: issues@hbase.apache.org
Message-ID: <JIRA.12936660.1454577860000.309445.1454661700021@Atlassian.JIRA>
In-Reply-To: <JIRA.12936660.1454577860000@Atlassian.JIRA>
References: <JIRA.12936660.1454577860000@Atlassian.JIRA>
 <JIRA.12936660.1454577860299@arcas>
Subject: [jira] [Commented] (HBASE-15213) Fix increment performance
 regression caused by HBASE-8763 on branch-1.0
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit


    [ https://issues.apache.org/jira/browse/HBASE-15213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15133842#comment-15133842 ] 

stack commented on HBASE-15213:
-------------------------------

Ok. I got the numbers below. This patch is great because it does away w/ the need for hbase.increment.fast.but.narrow.consistency on branch-1.0 and branch-1.1 (and branch-1.2, though it seems it is not needed there anyway). It is also better because it addresses appends and checkAndPut (I'm pretty sure; I will verify). Let me apply this patch in the morning. I want to update the release notes to sing the praises of this fix and to do proper messaging; it'll be easy for users to get confused. Let me try to take care of that.

Nice one [~junegunn]. You sure there ain't other places you'd like to go digging in? (smile)

Before HBASE-8763: d6cc2fb
{code}
2016-02-05 00:18:58,665 WARN  [pool-1-thread-78] hbase.HBaseConfiguration: hbase.regionserver.global.memstore.upperLimit is deprecated by hbase.regionserver.global.memstore.size
2016-02-05 00:19:55,814 INFO  [main] hbase.IncrementPerformanceTest: 75th=5.329, 95th=9.355449999999998, 99th=19.42884
{code}

Tip of 1.0 branch
{code}
2016-02-05 00:22:44,475 INFO  [pool-1-thread-78-SendThread(localhost:2181)] zookeeper.ClientCnxn: Session establishment complete on server localhost/0:0:0:0:0:0:0:1:2181, sessionid = 0x152b085e9ae0007, negotiated timeout = 40000
2016-02-05 00:26:02,829 INFO  [main] hbase.IncrementPerformanceTest: 75th=26.22475, 95th=36.996399999999994, 99th=44.89627000000002
{code}

Tip of 1.0 branch with hbase.increment.fast.but.narrow.consistency
{code}
2016-02-05 00:28:57,215 INFO  [pool-1-thread-4-SendThread(localhost:2181)] zookeeper.ClientCnxn: Session establishment complete on server localhost/127.0.0.1:2181, sessionid = 0x152b08c36410006, negotiated timeout = 40000
2016-02-05 00:29:50,879 INFO  [main] hbase.IncrementPerformanceTest: 75th=5.1594999999999995, 95th=11.920049999999998, 99th=34.12252000000004
{code}

Tip of 1.0 with patch applied
{code}
2016-02-05 00:35:27,268 INFO  [pool-1-thread-64-SendThread(localhost:2181)] zookeeper.ClientCnxn: Session establishment complete on server localhost/0:0:0:0:0:0:0:1:2181, sessionid = 0x152b091faec0006, negotiated timeout = 40000
2016-02-05 00:36:20,006 INFO  [main] hbase.IncrementPerformanceTest: 75th=5.039, 95th=9.919499999999992, 99th=20.12136000000002
{code}


> Fix increment performance regression caused by HBASE-8763 on branch-1.0
> -----------------------------------------------------------------------
>
>                 Key: HBASE-15213
>                 URL: https://issues.apache.org/jira/browse/HBASE-15213
>             Project: HBase
>          Issue Type: Bug
>          Components: Performance
>            Reporter: Junegunn Choi
>            Assignee: Junegunn Choi
>         Attachments: HBASE-15213-increment.png, HBASE-15213.branch-1.0.patch
>
>
> This is an attempt to fix the increment performance regression caused by HBASE-8763 on branch-1.0.
> I'm aware that hbase.increment.fast.but.narrow.consistency was added to branch-1.0 (HBASE-15031) to address the issue and that separate work is ongoing on the master branch, but anyway, this is my take on the problem.
> I read through HBASE-14460 and HBASE-8763, and although it wasn't clear to me what caused the slowdown, I could indeed reproduce the performance regression.
> Test setup:
> - Server: 4-core Xeon 2.4GHz Linux server running mini cluster (100 handlers, JDK 1.7)
> - Client: Another box of the same spec
> - Increments on random 10k records on a single-region table, recreated every time
> Increment throughput (TPS):
> || Num threads || Before HBASE-8763 (d6cc2fb) || branch-1.0 || branch-1.0 (narrow-consistency) ||
> || 1            | 2661                         | 2486        | 2359  |
> || 2            | 5048                         | 5064        | 4867  |
> || 4            | 7503                         | 8071        | 8690  |
> || 8            | 10471                        | 10886       | 13980 |
> || 16           | 15515                        | 9418        | 18601 |
> || 32           | 17699                        | 5421        | 20540 |
> || 64           | 20601                        | 4038        | 25591 |
> || 96           | 19177                        | 3891        | 26017 |
> We can clearly observe that the throughput of branch-1.0 degrades as we increase the number of concurrent requests, which led me to believe that there's severe context-switching overhead. I could indirectly confirm that suspicion with the cs column in vmstat output: branch-1.0 shows a much higher number of context switches even with much lower throughput.
> Here are the observations:
> - A WriteEntry in the writeQueue can only be removed by the very handler that put it there, and only when it is at the front of the queue and marked complete.
> - Since a WriteEntry is marked complete after the wait-loop, only one entry can be removed at a time.
> - This stringent condition causes O(N^2) context switches, where N is the number of concurrent handlers processing requests.
> So what I tried here is to mark the WriteEntry complete before we go into the wait-loop. With the change, multiple WriteEntries can be shifted at a time without context switches. I changed writeQueue to a LinkedHashSet since a fast containment check is needed now that a WriteEntry can be removed by any handler.
> The numbers look good; they're virtually identical to the pre-HBASE-8763 era.
> || Num threads || branch-1.0 with fix ||
> || 1            | 2459                 |
> || 2            | 4976                 |
> || 4            | 8033                 |
> || 8            | 12292                |
> || 16           | 15234                |
> || 32           | 16601                |
> || 64           | 19994                |
> || 96           | 20052                |
> So what do you think about it? Please let me know if I'm missing anything.
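
For readers following along, the idea in the quoted description (mark the WriteEntry complete before entering the wait-loop, so that whichever handler reaches the queue can shift every completed entry at the head) can be sketched roughly as below. This is a toy model and not the code from the attached patch: the class name WriteQueueSketch and the methods begin, complete, waitUntilVisible and readPoint are made up for illustration; only the names WriteEntry, writeQueue and the choice of LinkedHashSet come from the description.

```java
import java.util.Iterator;
import java.util.LinkedHashSet;

// Toy model only: hypothetical names, not the HBase MVCC implementation.
final class WriteQueueSketch {
    static final class WriteEntry {
        final long writeNumber;
        boolean completed; // guarded by the WriteQueueSketch monitor
        WriteEntry(long writeNumber) { this.writeNumber = writeNumber; }
    }

    // Insertion-ordered set: keeps queue order and gives a fast contains().
    private final LinkedHashSet<WriteEntry> writeQueue = new LinkedHashSet<>();
    private long nextWriteNumber = 1;
    private long readPoint = 0;

    synchronized WriteEntry begin() {
        WriteEntry e = new WriteEntry(nextWriteNumber++);
        writeQueue.add(e);
        return e;
    }

    // Marks e complete first, then shifts every completed entry at the head
    // of the queue, whoever created it. Returns how many entries were shifted.
    synchronized int complete(WriteEntry e) {
        e.completed = true;
        int shifted = 0;
        Iterator<WriteEntry> it = writeQueue.iterator();
        while (it.hasNext()) {
            WriteEntry head = it.next();
            if (!head.completed) {
                break; // an earlier write is still in flight
            }
            readPoint = head.writeNumber;
            it.remove();
            shifted++;
        }
        notifyAll(); // wake handlers whose entries may have just been shifted
        return shifted;
    }

    // Wait-loop: returns once e has been shifted out of the queue by any handler.
    synchronized void waitUntilVisible(WriteEntry e) throws InterruptedException {
        while (writeQueue.contains(e)) {
            wait();
        }
    }

    synchronized long readPoint() { return readPoint; }

    public static void main(String[] args) throws InterruptedException {
        WriteQueueSketch q = new WriteQueueSketch();
        WriteEntry a = q.begin();
        WriteEntry b = q.begin();
        WriteEntry c = q.begin();
        System.out.println("shifted by c: " + q.complete(c));
        System.out.println("shifted by b: " + q.complete(b));
        System.out.println("shifted by a: " + q.complete(a));
        q.waitUntilVisible(c); // returns immediately: a's handler already shifted c
        System.out.println("read point: " + q.readPoint());
    }
}
```

In this sketch the handlers for b and c finish out of order and shift nothing, and the handler for a then shifts all three entries in one pass, which is the batching effect the description credits for the reduced context switching. Whether the real patch structures its locking and notification this way is an assumption on my part.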

