Date: Mon, 14 Dec 2015 05:55:47 +0000 (UTC)
From: "stack (JIRA)"
To: issues@hbase.apache.org
Subject: [jira] [Updated] (HBASE-14460) [Perf Regression] Merge of MVCC and SequenceId (HBASE-HBASE-8763) slowed Increments, CheckAndPuts, batch operations

     [ https://issues.apache.org/jira/browse/HBASE-14460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-14460:
--------------------------
    Attachment: 0.98.test.patch
                m.test.patch
                0.94.test.patch
                flamegraph-26636.094.100.svg
                flamegraph-28767.098.100.svg
                flamegraph-31647.master.100.svg

If I run a test that has 100 threads each updating their own rows -- i.e. no contention on a row -- then I see master branch completing before 0.94 does; i.e. master is faster.
This is in spite of the thread dump resembling the one reported as problematic at the top of this issue.

In 0.94, all threads are stuck waiting on the WAL syncer to come in:

{code}
"50" #74 daemon prio=5 os_prio=0 tid=0x00007f7a78661000 nid=0x3364 waiting for monitor entry [0x00007f7a30ecd000]
   java.lang.Thread.State: BLOCKED (on object monitor)
        at org.apache.hadoop.hbase.regionserver.wal.HLog.syncer(HLog.java:1334)
        - waiting to lock <0x00000004cde22390> (a java.lang.Object)
        at org.apache.hadoop.hbase.regionserver.wal.HLog.sync(HLog.java:1476)
        at org.apache.hadoop.hbase.regionserver.HRegion.syncOrDefer(HRegion.java:6160)
        at org.apache.hadoop.hbase.regionserver.HRegion.increment(HRegion.java:5571)
        at org.apache.hadoop.hbase.regionserver.HRegion.increment(HRegion.java:5454)
        at org.apache.hadoop.hbase.regionserver.TestIncrement$SingleCellIncrementer.run(TestIncrement.java:84)
{code}

In master they are stuck here:

{code}
"17" #55 daemon prio=5 os_prio=0 tid=0x00007f0374c6d000 nid=0x3a0b in Object.wait() [0x00007f030c346000]
   java.lang.Thread.State: BLOCKED (on object monitor)
        at java.lang.Object.wait(Native Method)
        at org.apache.hadoop.hbase.regionserver.MultiVersionConcurrencyControl.waitForRead(MultiVersionConcurrencyControl.java:218)
        - locked <0x00000004d2e26208> (a java.lang.Object)
        at org.apache.hadoop.hbase.regionserver.MultiVersionConcurrencyControl.completeAndWait(MultiVersionConcurrencyControl.java:149)
        at org.apache.hadoop.hbase.regionserver.MultiVersionConcurrencyControl.await(MultiVersionConcurrencyControl.java:137)
        at org.apache.hadoop.hbase.regionserver.HRegion.increment(HRegion.java:7360)
        at org.apache.hadoop.hbase.regionserver.HRegion.increment(HRegion.java:7315)
        at org.apache.hadoop.hbase.regionserver.TestIncrement$SingleCellIncrementer.run(TestIncrement.java:86)
{code}

The flame graphs show basically the same profile across all versions (master spends a bit less time appending, which I suppose is expected).
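
For readers unfamiliar with why uncontended writers can still park in an MVCC wait, here is a minimal sketch of the idea. It is not HBase's MultiVersionConcurrencyControl; the class SimpleMvcc and its methods begin, completeAndWait, getReadPoint and runUncontended are hypothetical names made up for illustration. The assumption is that each write takes a number in order, and a writer may only proceed once the shared read point has advanced to its own number, which requires every older write to have completed first.

```java
import java.util.TreeSet;

/** Hypothetical, simplified stand-in; NOT HBase's MultiVersionConcurrencyControl. */
final class SimpleMvcc {
    private final TreeSet<Long> pending = new TreeSet<>(); // begun but not yet completed
    private long nextWrite = 1; // next write number to hand out
    private long readPoint = 0; // every write <= readPoint is complete

    /** Start a write and get its number; numbers are handed out in order. */
    synchronized long begin() {
        long w = nextWrite++;
        pending.add(w);
        return w;
    }

    /** Mark write w complete, then block until the read point has caught up to w. */
    synchronized void completeAndWait(long w) {
        pending.remove(w);
        // The read point may only move past writes with no older write still pending.
        long candidate = pending.isEmpty() ? nextWrite - 1 : pending.first() - 1;
        if (candidate > readPoint) {
            readPoint = candidate;
            notifyAll(); // wake writers parked in the loop below
        }
        while (readPoint < w) {
            try {
                wait(); // parked until all older writes have completed
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted waiting for read point", e);
            }
        }
    }

    synchronized long getReadPoint() {
        return readPoint;
    }

    /** Each thread increments only its own slot ("row"), yet all share one SimpleMvcc. */
    static long runUncontended(int threads, int perThread) {
        SimpleMvcc mvcc = new SimpleMvcc();
        long[] rows = new long[threads];
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final int row = t;
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    long w = mvcc.begin();
                    rows[row]++;             // no other thread touches this slot
                    mvcc.completeAndWait(w); // but everyone waits on the shared read point
                }
            });
            workers[t].start();
        }
        try {
            for (Thread worker : workers) {
                worker.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted waiting for workers", e);
        }
        long total = 0;
        for (long v : rows) {
            total += v;
        }
        if (mvcc.getReadPoint() != total) {
            throw new IllegalStateException("read point did not catch up");
        }
        return total;
    }
}
```

Calling SimpleMvcc.runUncontended(100, 1000) mirrors the shape of the 100-thread test above: there is no contention on any row, but a thread dump taken mid-run would likely show most workers parked in wait() inside completeAndWait. The sketch says nothing about relative throughput between branches.
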

> [Perf Regression] Merge of MVCC and SequenceId (HBASE-HBASE-8763) slowed Increments, CheckAndPuts, batch operations
> -------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-14460
>                 URL: https://issues.apache.org/jira/browse/HBASE-14460
>             Project: HBase
>          Issue Type: Bug
>          Components: Performance
>            Reporter: stack
>            Assignee: stack
>            Priority: Critical
>         Attachments: 0.94.test.patch, 0.98.test.patch, 14460.txt, flamegraph-13120.svg.master.singlecell.svg, flamegraph-26636.094.100.svg, flamegraph-28066.098.singlecell.svg, flamegraph-28767.098.100.svg, flamegraph-31647.master.100.svg, flamegraph-9466.094.singlecell.svg, m.test.patch, region_lock.png, testincrement.094.patch, testincrement.098.patch, testincrement.master.patch
>
>
> As reported by 鈴木俊裕 up on the mailing list -- see "Performance degradation between CDH5.3.1(HBase0.98.6) and CDH5.4.5(HBase1.0.0)" -- our unification of sequenceid and MVCC slows Increments (and other ops) as the mvcc needs to 'catch up' to our current point before we can read the last Increment value that we need to update.
> We can say that our Increment is just done wrong, we should just be writing Increments and summing on read, but checkAndPut as well as batching operations have the same issue. Fix.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)