hbase-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Stack <st...@duboce.net>
Subject Re: Performance degradation between CDH5.3.1(HBase0.98.6) and CDH5.4.5(HBase1.0.0)
Date Sun, 13 Sep 2015 05:11:06 GMT
Thank you for the below reasoning (with accompanying helpful diagram).
Makes sense. Let me hack up a test case to help with the illustration. It
is as though the mvcc should be scoped to a row only... Writes against
other rows should not hold up my read of my row. Tag an mvcc with a 'row'
scope so we can see which on-going writes pertain to current operation?

You need to read back the increment and have it be 'correct' at increment
time?

(This is a good one)

Thank you Toshihiro Suzuki
St.Ack

On Sat, Sep 12, 2015 at 8:09 AM, 鈴木俊裕 <brfrn169@gmail.com> wrote:

> St.Ack,
>
> Thank you for your response.
>
> Why I make out that "A region lock (not a row lock) seems to occur in
> waitForPreviousTransactionsComplete()" is as follows:
>
> A increment operation has 3 procedures for MVCC.
>
> 1. mvcc.waitForPreviousTransactionsComplete();
>
>
> https://github.com/cloudera/hbase/blob/cdh5.4.5-release/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java#L6712
>
> 2. w = mvcc.beginMemstoreInsertWithSeqNum(mvccNum);
>
>
> https://github.com/cloudera/hbase/blob/cdh5.4.5-release/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java#L6721
>
> 3. mvcc.completeMemstoreInsertWithSeqNum(w, walKey);
>
>
> https://github.com/cloudera/hbase/blob/cdh5.4.5-release/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java#L6893
>
>
> I think that MultiVersionConsistencyControl's writeQueue can cause a region
> lock.
>
>
> https://github.com/cloudera/hbase/blob/cdh5.4.5-release/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MultiVersionConsistencyControl.java#L42-L43
>
>
> Step 2 adds to a WriteEntry to writeQueue.
>
>
> https://github.com/cloudera/hbase/blob/cdh5.4.5-release/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MultiVersionConsistencyControl.java#L102-L108
>
> Step 3 removes the WriteEntry from writeQueue.
>
> completeMemstoreInsertWithSeqNum(w, walKey) ->
> waitForPreviousTransactionsComplete(e) -> advanceMemstore(w)
>
>
> https://github.com/cloudera/hbase/blob/cdh5.4.5-release/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MultiVersionConsistencyControl.java#L127
>
> https://github.com/cloudera/hbase/blob/cdh5.4.5-release/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MultiVersionConsistencyControl.java#L235
>
> https://github.com/cloudera/hbase/blob/cdh5.4.5-release/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MultiVersionConsistencyControl.java#L160
>
> Step 1 adds a WriteEntry w in beginMemstoreInsert() to writeQueue and waits
> until writeQueue is empty or writeQueue.getFirst() == w.
>
>
> https://github.com/cloudera/hbase/blob/cdh5.4.5-release/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MultiVersionConsistencyControl.java#L201-L204
>
> https://github.com/cloudera/hbase/blob/cdh5.4.5-release/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MultiVersionConsistencyControl.java#L206-L241
>
>
> I think when a handler thread is processing between step 2 and step 3, the
> other handler threads can wait at step 1 until the thread completes step 3
> This is depicted as follows:
>
>
> https://gist.githubusercontent.com/brfrn169/cb4f2c157129330cd932/raw/86d6aae5667b0fe006b16fed80f1b0c4945c7fd0/region_lock.png
>
>
> Actually, in the thread dump of our region server, many handler threads
> (RW.default.writeRpcServer.handler) wait at Step 1
> (waitForPreviousTransactionsComplete()).
>
>
> https://gist.githubusercontent.com/brfrn169/cb4f2c157129330cd932/raw/86d6aae5667b0fe006b16fed80f1b0c4945c7fd0/thread_dump.txt
>
> Many handler threads wait at this:
>
>
> https://github.com/cloudera/hbase/blob/cdh5.4.5-release/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MultiVersionConsistencyControl.java#L224
>
>
> > Is it possible you are contending on a counter post-upgrade?  Is it
> > possible that all these threads are trying to get to the same row to
> update
> > it? Could the app behavior have changed?  Or are you thinking increment
> > itself has slowed significantly?
> We have just upgraded HBase, not changed the app behavior. We are thinking
> increment itself has slowed significantly.
> Before upgrading HBase, it was good throughput and latency.
> Currently, to cope with this problem, we split the regions finely.
>
> Thanks,
>
> Toshihiro Suzuki
>
>
> 2015-09-09 15:29 GMT+09:00 Stack <stack@duboce.net>:
>
> > On Tue, Sep 8, 2015 at 10:22 PM, 鈴木俊裕 <brfrn169@gmail.com> wrote:
> >
> > > Ted,
> > >
> > > Thank you for your response.
> > >
> > > I uploaded the complete stack trace to Gist.
> > >
> > > https://gist.github.com/brfrn169/cb4f2c157129330cd932
> > >
> > >
> > > I think that increment operation works as follows:
> > >
> > > 1. get row lock
> > > 2. mvcc.waitForPreviousTransactionsComplete() // wait for all prior
> MVCC
> > > transactions to finish
> > > 3. mvcc.beginMemstoreInsertWithSeqNum() // start a transaction
> > > 4. get previous values
> > > 5. create KVs
> > > 6. write to Memstore
> > > 7. write to WAL
> > > 8. release row lock
> > > 9. mvcc.completeMemstoreInsertWithSeqNum() // complete the transaction
> > >
> > > A instance of MultiVersionConsistencyControl has a pending queue of
> > writes
> > > named writeQueue.
> > > Step 2 puts a WriteEntry w to writeQueue and waits until writeQueue is
> > > empty or writeQueue.getFirst() == w.
> > > Step 3 puts a WriteEntry to writeQueue and step 9 removes the
> WriteEntry
> > > from writeQueue.
> > >
> > > I think that when a handler thread is processing between step 2 and
> step
> > 9,
> > > the other handler threads can wait until the thread completes step 9.
> > >
> > >
> > That is right. We need to read, after all outstanding updates are done...
> > because we need to read the latest update before we go to
> modify/increment
> > it.
> >
> > How do you make out this?
> >
> > "A region lock (not a row lock) seems to occur in
> > waitForPreviousTransactionsComplete()."
> >
> > In 0.98.x we did this:
> >
> > mvcc.completeMemstoreInsert(mvcc.beginMemstoreInsert());
> >
> > ... and in 1.0 we do this:
> >
> > mvcc.waitForPreviousTransactionsComplete() which is this....
> >
> > +  public void waitForPreviousTransactionsComplete() {
> > +    WriteEntry w = beginMemstoreInsert();
> > +    waitForPreviousTransactionsComplete(w);
> > +  }
> >
> > The mvcc and region sequenceid were merged in 1.0 (
> > https://issues.apache.org/jira/browse/HBASE-8763). Previous mvcc and
> > region
> > sequenceid would spin independent of each other. Perhaps this responsible
> > for some slow down.
> >
> > That said, looking in your thread dump, we seem to be down in the Get. If
> > you do a bunch of thread dumps in a row, where is the lock-holding
> thread?
> > In Get or writing Increment... or waiting on sequence id?
> >
> > Is it possible you are contending on a counter post-upgrade?  Is it
> > possible that all these threads are trying to get to the same row to
> update
> > it? Could the app behavior have changed?  Or are you thinking increment
> > itself has slowed significantly?
> >
> > St.Ack
> >
> >
> >
> >
> > > Thanks,
> > >
> > > Toshihiro Suzuki
> > >
> > >
> > > 2015-09-09 0:05 GMT+09:00 Ted Yu <yuzhihong@gmail.com>:
> > >
> > > > In HRegion#increment(), we lock the row (not region):
> > > >
> > > >     try {
> > > >       rowLock = getRowLock(row);
> > > >
> > > > Can you pastebin the complete stack trace ?
> > > >
> > > > Thanks
> > > >
> > > > On Tue, Sep 8, 2015 at 2:01 AM, 鈴木俊裕 <brfrn169@gmail.com>
wrote:
> > > >
> > > > > Hi,
> > > > >
> > > > > We upgraded our cluster from CDH5.3.1(HBase0.98.6) to
> > > > CDH5.4.5(HBase1.0.0)
> > > > > and we experience slowdown in increment operation.
> > > > >
> > > > > Here's an extract from thread dump of the RegionServer of our
> > cluster:
> > > > >
> > > > > Thread 68
> (RW.default.writeRpcServer.handler=15,queue=5,port=60020):
> > > > >   State: BLOCKED
> > > > >   Blocked count: 21689888
> > > > >   Waited count: 39828360
> > > > >   Blocked on java.util.LinkedList@3474e4b2
> > > > >   Blocked by 63
> > > (RW.default.writeRpcServer.handler=10,queue=0,port=60020)
> > > > >   Stack:
> > > > >     java.lang.Object.wait(Native Method)
> > > > >
> > > > >
> > > > >
> > > >
> > >
> >
> org.apache.hadoop.hbase.regionserver.MultiVersionConsistencyControl.waitForPreviousTransactionsComplete(MultiVersionConsistencyControl.java:224)
> > > > >
> > > > >
> > > > >
> > > >
> > >
> >
> org.apache.hadoop.hbase.regionserver.MultiVersionConsistencyControl.waitForPreviousTransactionsComplete(MultiVersionConsistencyControl.java:203)
> > > > >
> > > > >
> > >
> org.apache.hadoop.hbase.regionserver.HRegion.increment(HRegion.java:6712)
> > > > >
> > > > >
> > > > >
> > > >
> > >
> >
> org.apache.hadoop.hbase.regionserver.RSRpcServices.increment(RSRpcServices.java:501)
> > > > >
> > > > >
> > > > >
> > > >
> > >
> >
> org.apache.hadoop.hbase.regionserver.RSRpcServices.doNonAtomicRegionMutation(RSRpcServices.java:570)
> > > > >
> > > > >
> > > > >
> > > >
> > >
> >
> org.apache.hadoop.hbase.regionserver.RSRpcServices.multi(RSRpcServices.java:1901)
> > > > >
> > > > >
> > > > >
> > > >
> > >
> >
> org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:31451)
> > > > >     org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2035)
> > > > >     org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:107)
> > > > >
> > > > >
> > > >
> > >
> >
> org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:130)
> > > > >
> >  org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:107)
> > > > >     java.lang.Thread.run(Thread.java:745)
> > > > >
> > > > > There are many similar threads in the thread dump.
> > > > >
> > > > > I read the source code and I think this is caused by changes of
> > > > > MultiVersionConsistencyControl.
> > > > > A region lock (not a row lock) seems to occur in
> > > > > waitForPreviousTransactionsComplete().
> > > > >
> > > > >
> > > > > Also we wrote performance test code for increment operation that
> > > included
> > > > > 100 threads and ran it in local mode.
> > > > >
> > > > > The result is shown below:
> > > > >
> > > > > CDH5.3.1(HBase0.98.6)
> > > > > Throughput(op/s): 12757, Latency(ms): 7.975072509210629
> > > > >
> > > > > CDH5.4.5(HBase1.0.0)
> > > > > Throughput(op/s): 2027, Latency(ms): 49.11840157868772
> > > > >
> > > > >
> > > > > Thanks,
> > > > >
> > > > > Toshihiro Suzuki
> > > > >
> > > >
> > >
> >
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message