hbase-issues mailing list archives

From "stack (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HBASE-14460) [Perf Regression] Merge of MVCC and SequenceId (HBASE-8763) slowed Increments, CheckAndPuts, batch operations
Date Mon, 14 Dec 2015 05:55:47 GMT

     [ https://issues.apache.org/jira/browse/HBASE-14460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel

stack updated HBASE-14460:
    Attachment: 0.98.test.patch

If I run a test that has 100 threads each updating their own rows -- i.e. no contention on
a row -- then master completes before 0.94 does; i.e. master is faster. This is in spite of
the master thread dump resembling the one reported as problematic at the top of this issue.
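
For readers without the attached patches, the shape of that uncontended run can be sketched
in plain Java. This is not the TestIncrement code from the patches: the class name, the
"row-N" keys and the in-memory map standing in for a region are all assumptions made for
illustration. It only shows the "one thread per row, no shared cell" setup.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for the test shape described above; not HBase code.
final class UncontendedIncrementSketch {

    // Starts `threads` workers, each incrementing only its own row, and
    // returns the sum of all cells once every worker has finished.
    static long run(int threads, int incrementsPerThread) throws InterruptedException {
        ConcurrentHashMap<String, AtomicLong> rows = new ConcurrentHashMap<>();
        List<Thread> workers = new ArrayList<>();
        for (int i = 0; i < threads; i++) {
            final String row = "row-" + i; // assumed key scheme: one row per thread
            rows.put(row, new AtomicLong());
            Thread t = new Thread(() -> {
                AtomicLong cell = rows.get(row);
                for (int n = 0; n < incrementsPerThread; n++) {
                    cell.incrementAndGet(); // no other thread touches this cell
                }
            }, Integer.toString(i));
            t.setDaemon(true);
            workers.add(t);
        }
        for (Thread t : workers) {
            t.start();
        }
        for (Thread t : workers) {
            t.join();
        }
        long total = 0;
        for (AtomicLong cell : rows.values()) {
            total += cell.get();
        }
        return total;
    }

    public static void main(String[] args) throws InterruptedException {
        long start = System.nanoTime();
        long total = run(100, 10_000);
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.println("total=" + total + " elapsedMs=" + elapsedMs);
    }
}
```

The contended variant (the "singlecell" attachments) would presumably point every thread at
the same row instead.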

In 0.94, all are stuck waiting on the WAL syncer to come in:
"50" #74 daemon prio=5 os_prio=0 tid=0x00007f7a78661000 nid=0x3364 waiting for monitor entry
   java.lang.Thread.State: BLOCKED (on object monitor)
        at org.apache.hadoop.hbase.regionserver.wal.HLog.syncer(HLog.java:1334)
        - waiting to lock <0x00000004cde22390> (a java.lang.Object)
        at org.apache.hadoop.hbase.regionserver.wal.HLog.sync(HLog.java:1476)
        at org.apache.hadoop.hbase.regionserver.HRegion.syncOrDefer(HRegion.java:6160)
        at org.apache.hadoop.hbase.regionserver.HRegion.increment(HRegion.java:5571)
        at org.apache.hadoop.hbase.regionserver.HRegion.increment(HRegion.java:5454)
        at org.apache.hadoop.hbase.regionserver.TestIncrement$SingleCellIncrementer.run(TestIncrement.java:84)

In master they are stuck here:
"17" #55 daemon prio=5 os_prio=0 tid=0x00007f0374c6d000 nid=0x3a0b in Object.wait() [0x00007f030c346000]
   java.lang.Thread.State: BLOCKED (on object monitor)
        at java.lang.Object.wait(Native Method)
        at org.apache.hadoop.hbase.regionserver.MultiVersionConcurrencyControl.waitForRead(MultiVersionConcurrencyControl.java:218)
        - locked <0x00000004d2e26208> (a java.lang.Object)
        at org.apache.hadoop.hbase.regionserver.MultiVersionConcurrencyControl.completeAndWait(MultiVersionConcurrencyControl.java:149)
        at org.apache.hadoop.hbase.regionserver.MultiVersionConcurrencyControl.await(MultiVersionConcurrencyControl.java:137)
        at org.apache.hadoop.hbase.regionserver.HRegion.increment(HRegion.java:7360)
        at org.apache.hadoop.hbase.regionserver.HRegion.increment(HRegion.java:7315)
        at org.apache.hadoop.hbase.regionserver.TestIncrement$SingleCellIncrementer.run(TestIncrement.java:86)

The flame graphs show basically the same profile across all versions (master spends a bit
less time appending, which I suppose is expected).

> [Perf Regression] Merge of MVCC and SequenceId (HBASE-8763) slowed Increments, CheckAndPuts, batch operations
> -------------------------------------------------------------------------------------------------------------------
>                 Key: HBASE-14460
>                 URL: https://issues.apache.org/jira/browse/HBASE-14460
>             Project: HBase
>          Issue Type: Bug
>          Components: Performance
>            Reporter: stack
>            Assignee: stack
>            Priority: Critical
>         Attachments: 0.94.test.patch, 0.98.test.patch, 14460.txt, flamegraph-13120.svg.master.singlecell.svg,
flamegraph-26636.094.100.svg, flamegraph-28066.098.singlecell.svg, flamegraph-28767.098.100.svg,
flamegraph-31647.master.100.svg, flamegraph-9466.094.singlecell.svg, m.test.patch, region_lock.png,
testincrement.094.patch, testincrement.098.patch, testincrement.master.patch
> As reported by 鈴木俊裕 up on the mailing list -- see "Performance degradation between
CDH5.3.1(HBase0.98.6) and CDH5.4.5(HBase1.0.0)" -- our unification of sequenceid and MVCC
slows Increments (and other ops) as the mvcc needs to 'catch up' to our current point before
we can read the last Increment value that we need to update.
> We could say that our Increment is just done wrong and that we should just be writing
Increments and summing on read, but checkAndPut and batching operations have the same issue. Fix.
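
The 'catch up' described above can be sketched in a few lines of Java. This is a simplified
model, not the HBase implementation: the method names begin/complete/completeAndWait echo
the frames in the master thread dump, but the class name, the queue and the locking are
assumptions made for illustration. The point is that the read point only advances past a
write once every earlier write has completed, so a writer waiting on its own entry also
waits on everything ahead of it.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical, simplified model of an MVCC read point; not HBase code.
final class MvccCatchUpSketch {

    static final class WriteEntry {
        final long number;   // sequence number assigned at begin()
        boolean completed;   // guarded by the enclosing sketch's lock

        WriteEntry(long number) {
            this.number = number;
        }
    }

    private final Object lock = new Object();
    private final Deque<WriteEntry> pending = new ArrayDeque<>();
    private long writePoint = 0;
    private long readPoint = 0;

    // Registers a new write and hands back its entry.
    WriteEntry begin() {
        synchronized (lock) {
            WriteEntry entry = new WriteEntry(++writePoint);
            pending.addLast(entry);
            return entry;
        }
    }

    // Marks the write done; the read point moves only over a completed prefix.
    void complete(WriteEntry entry) {
        synchronized (lock) {
            entry.completed = true;
            while (!pending.isEmpty() && pending.peekFirst().completed) {
                readPoint = pending.pollFirst().number;
            }
            lock.notifyAll();
        }
    }

    // Completes the write, then blocks until the read point has caught up to it.
    void completeAndWait(WriteEntry entry) throws InterruptedException {
        complete(entry);
        synchronized (lock) {
            while (readPoint < entry.number) {
                lock.wait();
            }
        }
    }

    long readPoint() {
        synchronized (lock) {
            return readPoint;
        }
    }
}
```

In this model, if write 2 completes while write 1 is still outstanding, the read point stays
at 0 and the thread behind write 2 sits in lock.wait() until write 1 finishes.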
