hbase-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Bryan Beaudreault <bbeaudrea...@hubspot.com>
Subject Re: Performance degradation between CDH5.3.1(HBase0.98.6) and CDH5.4.5(HBase1.0.0)
Date Tue, 01 Dec 2015 01:05:37 GMT
Those log lines have settled down, they may have been related to a
cluster-wide forced restart at the time.

On Mon, Nov 30, 2015 at 7:59 PM Bryan Beaudreault <bbeaudreault@hubspot.com>
wrote:

> We've been doing more debugging of this and have set up the read vs write
> handlers to try to at least segment this away so reads can work. We have
> pretty beefy servers, and are running wiht the following settings:
>
> hbase.regionserver.handler.count=1000
> hbase.ipc.server.read.threadpool.size=50
> hbase.ipc.server.callqueue.handler.factor=0.025
> hbase.ipc.server.callqueue.read.ratio=0.6
> hbase.ipc.server.callqueue.scan.ratio=0.5
>
> We are seeing all 400 write handlers taken up by row locks for the most
> part. The read handlers are mostly idle. We're thinking of changing the
> ratio here, but are not sure it will help if they are all blocked on a row
> lock.  We just enabled DEBUG logging on all our servers and notice the
> following:
>
> 2015-12-01 00:56:09,015 DEBUG
> org.apache.hadoop.hbase.regionserver.ServerNonceManager: Conflict detected
> by nonce: [-687451119961178644:7664336281906118656], [state 0, hasWait
> false, activity 00:54:36.240]
> 2015-12-01 00:56:09,015 DEBUG
> org.apache.hadoop.hbase.regionserver.ServerNonceManager: Conflict detected
> by nonce: [-687451119961178644:-7119840249342174227], [state 0, hasWait
> false, activity 00:54:36.256]
> 2015-12-01 00:56:09,268 DEBUG
> org.apache.hadoop.hbase.regionserver.ServerNonceManager: Conflict detected
> by nonce: [-5946137511131403479:2112661701888365489], [state 0, hasWait
> false, activity 00:55:01.259]
> 2015-12-01 00:56:09,279 DEBUG
> org.apache.hadoop.hbase.regionserver.ServerNonceManager: Conflict detected
> by nonce: [4165332617675853029:6256955295384472057], [state 0, hasWait
> false, activity 00:53:58.151]
> 2015-12-01 00:56:09,279 DEBUG
> org.apache.hadoop.hbase.regionserver.ServerNonceManager: Conflict detected
> by nonce: [4165332617675853029:4961178013070912522], [state 0, hasWait
> false, activity 00:53:58.162]
>
>
> On Mon, Nov 30, 2015 at 6:11 PM Bryan Beaudreault <
> bbeaudreault@hubspot.com> wrote:
>
>> Sorry the second link should be
>> https://gist.github.com/bbeaudreault/2994a748da83d9f75085#file-gistfile1-txt-L579
>>
>> On Mon, Nov 30, 2015 at 6:10 PM Bryan Beaudreault <
>> bbeaudreault@hubspot.com> wrote:
>>
>>> https://gist.github.com/bbeaudreault/2994a748da83d9f75085
>>>
>>> An active handler:
>>> https://gist.github.com/bbeaudreault/2994a748da83d9f75085#file-gistfile1-txt-L286
>>> One that is locked:
>>> https://git.hubteam.com/gist/jwilliams/80f37999bfdf55119588#file-gistfile1-txt-L579
>>>
>>> The difference between pre-rollback and post is that previously we were
>>> seeing things blocked in mvcc.  Now we are seeing them blocked on the
>>> upsert.
>>>
>>> It always follows the same pattern, of 1 active handler in the upsert
>>> and the rest blocked waiting for it.
>>>
>>> On Mon, Nov 30, 2015 at 6:05 PM Stack <stack@duboce.net> wrote:
>>>
>>>> On Mon, Nov 30, 2015 at 2:31 PM, Bryan Beaudreault <
>>>> bbeaudreault@hubspot.com
>>>> > wrote:
>>>>
>>>> > The rollback seems to have mostly solved the issue for one of our
>>>> clusters,
>>>> > but another one is still seeing long increment times:
>>>> >
>>>> > "slowIncrementCount": 52080,
>>>> > "Increment_num_ops": 325236,"Increment_min": 1,"Increment_max": 6162,"
>>>> > Increment_mean": 465.68678129112396,"Increment_median": 216,"
>>>> > Increment_75th_percentile": 450.25,"Increment_95th_percentile":
>>>> > 1052.6499999999999,"Increment_99th_percentile": 1635.2399999999998
>>>> >
>>>> >
>>>> > Any ideas if there are other changes that may be causing a performance
>>>> > regression for increments between CDH4.7.1 and CDH5.3.8?
>>>> >
>>>> >
>>>> >
>>>> No.
>>>>
>>>> Post a thread dump Bryan and it might prompt something.
>>>>
>>>> St.Ack
>>>>
>>>>
>>>>
>>>>
>>>> >
>>>> > On Mon, Nov 30, 2015 at 4:13 PM Stack <stack@duboce.net> wrote:
>>>> >
>>>> > > On Mon, Nov 30, 2015 at 12:54 PM, Bryan Beaudreault <
>>>> > > bbeaudreault@hubspot.com> wrote:
>>>> > >
>>>> > > > Should this be added as a known issue in the CDH or hbase
>>>> > documentation?
>>>> > > It
>>>> > > > was a severe performance hit for us, all of our regionservers were
>>>> > > sitting
>>>> > > > at a few thousand queued requests.
>>>> > > >
>>>> > > >
>>>> > > Let me take care of that.
>>>> > > St.Ack
>>>> > >
>>>> > >
>>>> > >
>>>> > > > On Mon, Nov 30, 2015 at 3:53 PM Bryan Beaudreault <
>>>> > > > bbeaudreault@hubspot.com>
>>>> > > > wrote:
>>>> > > >
>>>> > > > > Yea, they are all over the place and called from client and
>>>> > coprocessor
>>>> > > > > code. We ended up having no other option but to rollback, and
>>>> aside
>>>> > > from
>>>> > > > a
>>>> > > > > few NoSuchMethodErrors due to API changes (Put#add vs
>>>> Put#addColumn),
>>>> > > it
>>>> > > > > seems to be working and fixing our problem.
>>>> > > > >
>>>> > > > > On Mon, Nov 30, 2015 at 3:47 PM Stack <stack@duboce.net> wrote:
>>>> > > > >
>>>> > > > >> Rollback is untested. No fix in 5.5. I was going to work on
>>>> this
>>>> > now.
>>>> > > > >> Where
>>>> > > > >> are your counters Bryan? In their own column family or
>>>> scattered
>>>> > about
>>>> > > > in
>>>> > > > >> a
>>>> > > > >> row with other Cell types?
>>>> > > > >> St.Ack
>>>> > > > >>
>>>> > > > >> On Mon, Nov 30, 2015 at 10:28 AM, Bryan Beaudreault <
>>>> > > > >> bbeaudreault@hubspot.com> wrote:
>>>> > > > >>
>>>> > > > >> > Is there any update to this? We just upgraded all of our
>>>> > production
>>>> > > > >> > clusters from CDH4 to CDH5.4.7 and, not seeing this JIRA
>>>> listed in
>>>> > > the
>>>> > > > >> > known issues, did not not about this.  Now we are seeing
>>>> > perfomance
>>>> > > > >> issues
>>>> > > > >> > across all clusters, as we make heavy use of increments.
>>>> > > > >> >
>>>> > > > >> > Can we roll forward to CDH5.5 to fix? Or is our only hope to
>>>> roll
>>>> > > back
>>>> > > > >> to
>>>> > > > >> > CDH 5.3.1 (if that is possible)?
>>>> > > > >> >
>>>> > > > >> >
>>>> > > > >> > On Thu, Sep 24, 2015 at 5:06 AM 鈴木俊裕 <brfrn169@gmail.com>
>>>> wrote:
>>>> > > > >> >
>>>> > > > >> > > Thank you St.Ack!
>>>> > > > >> > >
>>>> > > > >> > > I would like to follow the ticket.
>>>> > > > >> > >
>>>> > > > >> > > Toshihiro Suzuki
>>>> > > > >> > >
>>>> > > > >> > > 2015-09-22 14:14 GMT+09:00 Stack <stack@duboce.net>:
>>>> > > > >> > >
>>>> > > > >> > > > Back to this problem. Simple tests confirm that as is,
>>>> the
>>>> > > > >> > > > single-queue-backed MVCC instance can slow Region ops if
>>>> some
>>>> > > > other
>>>> > > > >> row
>>>> > > > >> > > is
>>>> > > > >> > > > slow to complete. In particular Increment, checkAndPut,
>>>> and
>>>> > > batch
>>>> > > > >> > > mutations
>>>> > > > >> > > > are effected. I opened HBASE-14460 to start in on a fix
>>>> up.
>>>> > Lets
>>>> > > > >> see if
>>>> > > > >> > > we
>>>> > > > >> > > > can somehow scope mvcc to row or at least shard mvcc so
>>>> not
>>>> > all
>>>> > > > >> Region
>>>> > > > >> > > ops
>>>> > > > >> > > > are paused.
>>>> > > > >> > > >
>>>> > > > >> > > > St.Ack
>>>> > > > >> > > >
>>>> > > > >> > > >
>>>> > > > >> > > > On Mon, Sep 14, 2015 at 4:15 AM, 鈴木俊裕 <
>>>> brfrn169@gmail.com>
>>>> > > wrote:
>>>> > > > >> > > >
>>>> > > > >> > > > > > Thank you for the below reasoning (with accompanying
>>>> > helpful
>>>> > > > >> > > diagram).
>>>> > > > >> > > > > > Makes sense. Let me hack up a test case to help with
>>>> the
>>>> > > > >> > > illustration.
>>>> > > > >> > > > It
>>>> > > > >> > > > > > is as though the mvcc should be scoped to a row
>>>> only...
>>>> > > Writes
>>>> > > > >> > > against
>>>> > > > >> > > > > > other rows should not hold up my read of my row. Tag
>>>> an
>>>> > mvcc
>>>> > > > >> with a
>>>> > > > >> > > > 'row'
>>>> > > > >> > > > > > scope so we can see which on-going writes pertain to
>>>> > current
>>>> > > > >> > > operation?
>>>> > > > >> > > > > Thank you St.Ack! I think this approach would work.
>>>> > > > >> > > > >
>>>> > > > >> > > > > > You need to read back the increment and have it be
>>>> > 'correct'
>>>> > > > at
>>>> > > > >> > > > increment
>>>> > > > >> > > > > > time?
>>>> > > > >> > > > > Yes, we need it.
>>>> > > > >> > > > >
>>>> > > > >> > > > > I would like to help if there is anything I can do.
>>>> > > > >> > > > >
>>>> > > > >> > > > > Thanks,
>>>> > > > >> > > > > Toshihiro Suzuki
>>>> > > > >> > > > >
>>>> > > > >> > > > >
>>>> > > > >> > > > > 2015-09-13 14:11 GMT+09:00 Stack <stack@duboce.net>:
>>>> > > > >> > > > >
>>>> > > > >> > > > > > Thank you for the below reasoning (with accompanying
>>>> > helpful
>>>> > > > >> > > diagram).
>>>> > > > >> > > > > > Makes sense. Let me hack up a test case to help with
>>>> the
>>>> > > > >> > > illustration.
>>>> > > > >> > > > It
>>>> > > > >> > > > > > is as though the mvcc should be scoped to a row
>>>> only...
>>>> > > Writes
>>>> > > > >> > > against
>>>> > > > >> > > > > > other rows should not hold up my read of my row. Tag
>>>> an
>>>> > mvcc
>>>> > > > >> with a
>>>> > > > >> > > > 'row'
>>>> > > > >> > > > > > scope so we can see which on-going writes pertain to
>>>> > current
>>>> > > > >> > > operation?
>>>> > > > >> > > > > >
>>>> > > > >> > > > > > You need to read back the increment and have it be
>>>> > 'correct'
>>>> > > > at
>>>> > > > >> > > > increment
>>>> > > > >> > > > > > time?
>>>> > > > >> > > > > >
>>>> > > > >> > > > > > (This is a good one)
>>>> > > > >> > > > > >
>>>> > > > >> > > > > > Thank you Toshihiro Suzuki
>>>> > > > >> > > > > > St.Ack
>>>> > > > >> > > > > >
>>>> > > > >> > > > > > On Sat, Sep 12, 2015 at 8:09 AM, 鈴木俊裕 <
>>>> brfrn169@gmail.com
>>>> > >
>>>> > > > >> wrote:
>>>> > > > >> > > > > >
>>>> > > > >> > > > > > > St.Ack,
>>>> > > > >> > > > > > >
>>>> > > > >> > > > > > > Thank you for your response.
>>>> > > > >> > > > > > >
>>>> > > > >> > > > > > > Why I make out that "A region lock (not a row lock)
>>>> > seems
>>>> > > to
>>>> > > > >> > occur
>>>> > > > >> > > in
>>>> > > > >> > > > > > > waitForPreviousTransactionsComplete()" is as
>>>> follows:
>>>> > > > >> > > > > > >
>>>> > > > >> > > > > > > A increment operation has 3 procedures for MVCC.
>>>> > > > >> > > > > > >
>>>> > > > >> > > > > > > 1. mvcc.waitForPreviousTransactionsComplete();
>>>> > > > >> > > > > > >
>>>> > > > >> > > > > > >
>>>> > > > >> > > > > > >
>>>> > > > >> > > > > >
>>>> > > > >> > > > >
>>>> > > > >> > > >
>>>> > > > >> > >
>>>> > > > >> >
>>>> > > > >>
>>>> > > >
>>>> > >
>>>> >
>>>> https://github.com/cloudera/hbase/blob/cdh5.4.5-release/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java#L6712
>>>> > > > >> > > > > > >
>>>> > > > >> > > > > > > 2. w = mvcc.beginMemstoreInsertWithSeqNum(mvccNum);
>>>> > > > >> > > > > > >
>>>> > > > >> > > > > > >
>>>> > > > >> > > > > > >
>>>> > > > >> > > > > >
>>>> > > > >> > > > >
>>>> > > > >> > > >
>>>> > > > >> > >
>>>> > > > >> >
>>>> > > > >>
>>>> > > >
>>>> > >
>>>> >
>>>> https://github.com/cloudera/hbase/blob/cdh5.4.5-release/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java#L6721
>>>> > > > >> > > > > > >
>>>> > > > >> > > > > > > 3. mvcc.completeMemstoreInsertWithSeqNum(w,
>>>> walKey);
>>>> > > > >> > > > > > >
>>>> > > > >> > > > > > >
>>>> > > > >> > > > > > >
>>>> > > > >> > > > > >
>>>> > > > >> > > > >
>>>> > > > >> > > >
>>>> > > > >> > >
>>>> > > > >> >
>>>> > > > >>
>>>> > > >
>>>> > >
>>>> >
>>>> https://github.com/cloudera/hbase/blob/cdh5.4.5-release/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java#L6893
>>>> > > > >> > > > > > >
>>>> > > > >> > > > > > >
>>>> > > > >> > > > > > > I think that MultiVersionConsistencyControl's
>>>> writeQueue
>>>> > > can
>>>> > > > >> > cause
>>>> > > > >> > > a
>>>> > > > >> > > > > > region
>>>> > > > >> > > > > > > lock.
>>>> > > > >> > > > > > >
>>>> > > > >> > > > > > >
>>>> > > > >> > > > > > >
>>>> > > > >> > > > > >
>>>> > > > >> > > > >
>>>> > > > >> > > >
>>>> > > > >> > >
>>>> > > > >> >
>>>> > > > >>
>>>> > > >
>>>> > >
>>>> >
>>>> https://github.com/cloudera/hbase/blob/cdh5.4.5-release/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MultiVersionConsistencyControl.java#L42-L43
>>>> > > > >> > > > > > >
>>>> > > > >> > > > > > >
>>>> > > > >> > > > > > > Step 2 adds to a WriteEntry to writeQueue.
>>>> > > > >> > > > > > >
>>>> > > > >> > > > > > >
>>>> > > > >> > > > > > >
>>>> > > > >> > > > > >
>>>> > > > >> > > > >
>>>> > > > >> > > >
>>>> > > > >> > >
>>>> > > > >> >
>>>> > > > >>
>>>> > > >
>>>> > >
>>>> >
>>>> https://github.com/cloudera/hbase/blob/cdh5.4.5-release/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MultiVersionConsistencyControl.java#L102-L108
>>>> > > > >> > > > > > >
>>>> > > > >> > > > > > > Step 3 removes the WriteEntry from writeQueue.
>>>> > > > >> > > > > > >
>>>> > > > >> > > > > > > completeMemstoreInsertWithSeqNum(w, walKey) ->
>>>> > > > >> > > > > > > waitForPreviousTransactionsComplete(e) ->
>>>> > > advanceMemstore(w)
>>>> > > > >> > > > > > >
>>>> > > > >> > > > > > >
>>>> > > > >> > > > > > >
>>>> > > > >> > > > > >
>>>> > > > >> > > > >
>>>> > > > >> > > >
>>>> > > > >> > >
>>>> > > > >> >
>>>> > > > >>
>>>> > > >
>>>> > >
>>>> >
>>>> https://github.com/cloudera/hbase/blob/cdh5.4.5-release/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MultiVersionConsistencyControl.java#L127
>>>> > > > >> > > > > > >
>>>> > > > >> > > > > > >
>>>> > > > >> > > > > >
>>>> > > > >> > > > >
>>>> > > > >> > > >
>>>> > > > >> > >
>>>> > > > >> >
>>>> > > > >>
>>>> > > >
>>>> > >
>>>> >
>>>> https://github.com/cloudera/hbase/blob/cdh5.4.5-release/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MultiVersionConsistencyControl.java#L235
>>>> > > > >> > > > > > >
>>>> > > > >> > > > > > >
>>>> > > > >> > > > > >
>>>> > > > >> > > > >
>>>> > > > >> > > >
>>>> > > > >> > >
>>>> > > > >> >
>>>> > > > >>
>>>> > > >
>>>> > >
>>>> >
>>>> https://github.com/cloudera/hbase/blob/cdh5.4.5-release/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MultiVersionConsistencyControl.java#L160
>>>> > > > >> > > > > > >
>>>> > > > >> > > > > > > Step 1 adds a WriteEntry w in
>>>> beginMemstoreInsert() to
>>>> > > > >> writeQueue
>>>> > > > >> > > and
>>>> > > > >> > > > > > waits
>>>> > > > >> > > > > > > until writeQueue is empty or writeQueue.getFirst()
>>>> == w.
>>>> > > > >> > > > > > >
>>>> > > > >> > > > > > >
>>>> > > > >> > > > > > >
>>>> > > > >> > > > > >
>>>> > > > >> > > > >
>>>> > > > >> > > >
>>>> > > > >> > >
>>>> > > > >> >
>>>> > > > >>
>>>> > > >
>>>> > >
>>>> >
>>>> https://github.com/cloudera/hbase/blob/cdh5.4.5-release/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MultiVersionConsistencyControl.java#L201-L204
>>>> > > > >> > > > > > >
>>>> > > > >> > > > > > >
>>>> > > > >> > > > > >
>>>> > > > >> > > > >
>>>> > > > >> > > >
>>>> > > > >> > >
>>>> > > > >> >
>>>> > > > >>
>>>> > > >
>>>> > >
>>>> >
>>>> https://github.com/cloudera/hbase/blob/cdh5.4.5-release/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MultiVersionConsistencyControl.java#L206-L241
>>>> > > > >> > > > > > >
>>>> > > > >> > > > > > >
>>>> > > > >> > > > > > > I think when a handler thread is processing between
>>>> > step 2
>>>> > > > and
>>>> > > > >> > step
>>>> > > > >> > > > 3,
>>>> > > > >> > > > > > the
>>>> > > > >> > > > > > > other handler threads can wait at step 1 until the
>>>> > thread
>>>> > > > >> > completes
>>>> > > > >> > > > > step
>>>> > > > >> > > > > > 3
>>>> > > > >> > > > > > > This is depicted as follows:
>>>> > > > >> > > > > > >
>>>> > > > >> > > > > > >
>>>> > > > >> > > > > > >
>>>> > > > >> > > > > >
>>>> > > > >> > > > >
>>>> > > > >> > > >
>>>> > > > >> > >
>>>> > > > >> >
>>>> > > > >>
>>>> > > >
>>>> > >
>>>> >
>>>> https://gist.githubusercontent.com/brfrn169/cb4f2c157129330cd932/raw/86d6aae5667b0fe006b16fed80f1b0c4945c7fd0/region_lock.png
>>>> > > > >> > > > > > >
>>>> > > > >> > > > > > >
>>>> > > > >> > > > > > > Actually, in the thread dump of our region server,
>>>> many
>>>> > > > >> handler
>>>> > > > >> > > > threads
>>>> > > > >> > > > > > > (RW.default.writeRpcServer.handler) wait at Step 1
>>>> > > > >> > > > > > > (waitForPreviousTransactionsComplete()).
>>>> > > > >> > > > > > >
>>>> > > > >> > > > > > >
>>>> > > > >> > > > > > >
>>>> > > > >> > > > > >
>>>> > > > >> > > > >
>>>> > > > >> > > >
>>>> > > > >> > >
>>>> > > > >> >
>>>> > > > >>
>>>> > > >
>>>> > >
>>>> >
>>>> https://gist.githubusercontent.com/brfrn169/cb4f2c157129330cd932/raw/86d6aae5667b0fe006b16fed80f1b0c4945c7fd0/thread_dump.txt
>>>> > > > >> > > > > > >
>>>> > > > >> > > > > > > Many handler threads wait at this:
>>>> > > > >> > > > > > >
>>>> > > > >> > > > > > >
>>>> > > > >> > > > > > >
>>>> > > > >> > > > > >
>>>> > > > >> > > > >
>>>> > > > >> > > >
>>>> > > > >> > >
>>>> > > > >> >
>>>> > > > >>
>>>> > > >
>>>> > >
>>>> >
>>>> https://github.com/cloudera/hbase/blob/cdh5.4.5-release/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MultiVersionConsistencyControl.java#L224
>>>> > > > >> > > > > > >
>>>> > > > >> > > > > > >
>>>> > > > >> > > > > > > > Is it possible you are contending on a counter
>>>> > > > post-upgrade?
>>>> > > > >> > Is
>>>> > > > >> > > it
>>>> > > > >> > > > > > > > possible that all these threads are trying to
>>>> get to
>>>> > the
>>>> > > > >> same
>>>> > > > >> > row
>>>> > > > >> > > > to
>>>> > > > >> > > > > > > update
>>>> > > > >> > > > > > > > it? Could the app behavior have changed?  Or are
>>>> you
>>>> > > > >> thinking
>>>> > > > >> > > > > increment
>>>> > > > >> > > > > > > > itself has slowed significantly?
>>>> > > > >> > > > > > > We have just upgraded HBase, not changed the app
>>>> > behavior.
>>>> > > > We
>>>> > > > >> are
>>>> > > > >> > > > > > thinking
>>>> > > > >> > > > > > > increment itself has slowed significantly.
>>>> > > > >> > > > > > > Before upgrading HBase, it was good throughput and
>>>> > > latency.
>>>> > > > >> > > > > > > Currently, to cope with this problem, we split the
>>>> > regions
>>>> > > > >> > finely.
>>>> > > > >> > > > > > >
>>>> > > > >> > > > > > > Thanks,
>>>> > > > >> > > > > > >
>>>> > > > >> > > > > > > Toshihiro Suzuki
>>>> > > > >> > > > > > >
>>>> > > > >> > > > > > >
>>>> > > > >> > > > > > > 2015-09-09 15:29 GMT+09:00 Stack <stack@duboce.net
>>>> >:
>>>> > > > >> > > > > > >
>>>> > > > >> > > > > > > > On Tue, Sep 8, 2015 at 10:22 PM, 鈴木俊裕 <
>>>> > > brfrn169@gmail.com
>>>> > > > >
>>>> > > > >> > > wrote:
>>>> > > > >> > > > > > > >
>>>> > > > >> > > > > > > > > Ted,
>>>> > > > >> > > > > > > > >
>>>> > > > >> > > > > > > > > Thank you for your response.
>>>> > > > >> > > > > > > > >
>>>> > > > >> > > > > > > > > I uploaded the complete stack trace to Gist.
>>>> > > > >> > > > > > > > >
>>>> > > > >> > > > > > > > >
>>>> > https://gist.github.com/brfrn169/cb4f2c157129330cd932
>>>> > > > >> > > > > > > > >
>>>> > > > >> > > > > > > > >
>>>> > > > >> > > > > > > > > I think that increment operation works as
>>>> follows:
>>>> > > > >> > > > > > > > >
>>>> > > > >> > > > > > > > > 1. get row lock
>>>> > > > >> > > > > > > > > 2. mvcc.waitForPreviousTransactionsComplete()
>>>> //
>>>> > wait
>>>> > > > for
>>>> > > > >> all
>>>> > > > >> > > > prior
>>>> > > > >> > > > > > > MVCC
>>>> > > > >> > > > > > > > > transactions to finish
>>>> > > > >> > > > > > > > > 3. mvcc.beginMemstoreInsertWithSeqNum() //
>>>> start a
>>>> > > > >> > transaction
>>>> > > > >> > > > > > > > > 4. get previous values
>>>> > > > >> > > > > > > > > 5. create KVs
>>>> > > > >> > > > > > > > > 6. write to Memstore
>>>> > > > >> > > > > > > > > 7. write to WAL
>>>> > > > >> > > > > > > > > 8. release row lock
>>>> > > > >> > > > > > > > > 9. mvcc.completeMemstoreInsertWithSeqNum() //
>>>> > complete
>>>> > > > the
>>>> > > > >> > > > > > transaction
>>>> > > > >> > > > > > > > >
>>>> > > > >> > > > > > > > > A instance of MultiVersionConsistencyControl
>>>> has a
>>>> > > > pending
>>>> > > > >> > > queue
>>>> > > > >> > > > of
>>>> > > > >> > > > > > > > writes
>>>> > > > >> > > > > > > > > named writeQueue.
>>>> > > > >> > > > > > > > > Step 2 puts a WriteEntry w to writeQueue and
>>>> waits
>>>> > > until
>>>> > > > >> > > > writeQueue
>>>> > > > >> > > > > > is
>>>> > > > >> > > > > > > > > empty or writeQueue.getFirst() == w.
>>>> > > > >> > > > > > > > > Step 3 puts a WriteEntry to writeQueue and
>>>> step 9
>>>> > > > removes
>>>> > > > >> the
>>>> > > > >> > > > > > > WriteEntry
>>>> > > > >> > > > > > > > > from writeQueue.
>>>> > > > >> > > > > > > > >
>>>> > > > >> > > > > > > > > I think that when a handler thread is
>>>> processing
>>>> > > between
>>>> > > > >> > step 2
>>>> > > > >> > > > and
>>>> > > > >> > > > > > > step
>>>> > > > >> > > > > > > > 9,
>>>> > > > >> > > > > > > > > the other handler threads can wait until the
>>>> thread
>>>> > > > >> completes
>>>> > > > >> > > > step
>>>> > > > >> > > > > 9.
>>>> > > > >> > > > > > > > >
>>>> > > > >> > > > > > > > >
>>>> > > > >> > > > > > > > That is right. We need to read, after all
>>>> outstanding
>>>> > > > >> updates
>>>> > > > >> > are
>>>> > > > >> > > > > > done...
>>>> > > > >> > > > > > > > because we need to read the latest update before
>>>> we go
>>>> > > to
>>>> > > > >> > > > > > > modify/increment
>>>> > > > >> > > > > > > > it.
>>>> > > > >> > > > > > > >
>>>> > > > >> > > > > > > > How do you make out this?
>>>> > > > >> > > > > > > >
>>>> > > > >> > > > > > > > "A region lock (not a row lock) seems to occur in
>>>> > > > >> > > > > > > > waitForPreviousTransactionsComplete()."
>>>> > > > >> > > > > > > >
>>>> > > > >> > > > > > > > In 0.98.x we did this:
>>>> > > > >> > > > > > > >
>>>> > > > >> > > > > > > >
>>>> > mvcc.completeMemstoreInsert(mvcc.beginMemstoreInsert());
>>>> > > > >> > > > > > > >
>>>> > > > >> > > > > > > > ... and in 1.0 we do this:
>>>> > > > >> > > > > > > >
>>>> > > > >> > > > > > > > mvcc.waitForPreviousTransactionsComplete() which
>>>> is
>>>> > > > this....
>>>> > > > >> > > > > > > >
>>>> > > > >> > > > > > > > +  public void
>>>> waitForPreviousTransactionsComplete() {
>>>> > > > >> > > > > > > > +    WriteEntry w = beginMemstoreInsert();
>>>> > > > >> > > > > > > > +    waitForPreviousTransactionsComplete(w);
>>>> > > > >> > > > > > > > +  }
>>>> > > > >> > > > > > > >
>>>> > > > >> > > > > > > > The mvcc and region sequenceid were merged in
>>>> 1.0 (
>>>> > > > >> > > > > > > > https://issues.apache.org/jira/browse/HBASE-8763
>>>> ).
>>>> > > > Previous
>>>> > > > >> > mvcc
>>>> > > > >> > > > and
>>>> > > > >> > > > > > > > region
>>>> > > > >> > > > > > > > sequenceid would spin independent of each other.
>>>> > Perhaps
>>>> > > > >> this
>>>> > > > >> > > > > > responsible
>>>> > > > >> > > > > > > > for some slow down.
>>>> > > > >> > > > > > > >
>>>> > > > >> > > > > > > > That said, looking in your thread dump, we seem
>>>> to be
>>>> > > down
>>>> > > > >> in
>>>> > > > >> > the
>>>> > > > >> > > > > Get.
>>>> > > > >> > > > > > If
>>>> > > > >> > > > > > > > you do a bunch of thread dumps in a row, where
>>>> is the
>>>> > > > >> > > lock-holding
>>>> > > > >> > > > > > > thread?
>>>> > > > >> > > > > > > > In Get or writing Increment... or waiting on
>>>> sequence
>>>> > > id?
>>>> > > > >> > > > > > > >
>>>> > > > >> > > > > > > > Is it possible you are contending on a counter
>>>> > > > post-upgrade?
>>>> > > > >> > Is
>>>> > > > >> > > it
>>>> > > > >> > > > > > > > possible that all these threads are trying to
>>>> get to
>>>> > the
>>>> > > > >> same
>>>> > > > >> > row
>>>> > > > >> > > > to
>>>> > > > >> > > > > > > update
>>>> > > > >> > > > > > > > it? Could the app behavior have changed?  Or are
>>>> you
>>>> > > > >> thinking
>>>> > > > >> > > > > increment
>>>> > > > >> > > > > > > > itself has slowed significantly?
>>>> > > > >> > > > > > > >
>>>> > > > >> > > > > > > > St.Ack
>>>> > > > >> > > > > > > >
>>>> > > > >> > > > > > > >
>>>> > > > >> > > > > > > >
>>>> > > > >> > > > > > > >
>>>> > > > >> > > > > > > > > Thanks,
>>>> > > > >> > > > > > > > >
>>>> > > > >> > > > > > > > > Toshihiro Suzuki
>>>> > > > >> > > > > > > > >
>>>> > > > >> > > > > > > > >
>>>> > > > >> > > > > > > > > 2015-09-09 0:05 GMT+09:00 Ted Yu <
>>>> > yuzhihong@gmail.com
>>>> > > >:
>>>> > > > >> > > > > > > > >
>>>> > > > >> > > > > > > > > > In HRegion#increment(), we lock the row (not
>>>> > > region):
>>>> > > > >> > > > > > > > > >
>>>> > > > >> > > > > > > > > >     try {
>>>> > > > >> > > > > > > > > >       rowLock = getRowLock(row);
>>>> > > > >> > > > > > > > > >
>>>> > > > >> > > > > > > > > > Can you pastebin the complete stack trace ?
>>>> > > > >> > > > > > > > > >
>>>> > > > >> > > > > > > > > > Thanks
>>>> > > > >> > > > > > > > > >
>>>> > > > >> > > > > > > > > > On Tue, Sep 8, 2015 at 2:01 AM, 鈴木俊裕 <
>>>> > > > >> brfrn169@gmail.com>
>>>> > > > >> > > > wrote:
>>>> > > > >> > > > > > > > > >
>>>> > > > >> > > > > > > > > > > Hi,
>>>> > > > >> > > > > > > > > > >
>>>> > > > >> > > > > > > > > > > We upgraded our cluster from
>>>> > CDH5.3.1(HBase0.98.6)
>>>> > > > to
>>>> > > > >> > > > > > > > > > CDH5.4.5(HBase1.0.0)
>>>> > > > >> > > > > > > > > > > and we experience slowdown in increment
>>>> > operation.
>>>> > > > >> > > > > > > > > > >
>>>> > > > >> > > > > > > > > > > Here's an extract from thread dump of the
>>>> > > > >> RegionServer of
>>>> > > > >> > > our
>>>> > > > >> > > > > > > > cluster:
>>>> > > > >> > > > > > > > > > >
>>>> > > > >> > > > > > > > > > > Thread 68
>>>> > > > >> > > > > > >
>>>> > (RW.default.writeRpcServer.handler=15,queue=5,port=60020):
>>>> > > > >> > > > > > > > > > >   State: BLOCKED
>>>> > > > >> > > > > > > > > > >   Blocked count: 21689888
>>>> > > > >> > > > > > > > > > >   Waited count: 39828360
>>>> > > > >> > > > > > > > > > >   Blocked on java.util.LinkedList@3474e4b2
>>>> > > > >> > > > > > > > > > >   Blocked by 63
>>>> > > > >> > > > > > > > >
>>>> > > > (RW.default.writeRpcServer.handler=10,queue=0,port=60020)
>>>> > > > >> > > > > > > > > > >   Stack:
>>>> > > > >> > > > > > > > > > >     java.lang.Object.wait(Native Method)
>>>> > > > >> > > > > > > > > > >
>>>> > > > >> > > > > > > > > > >
>>>> > > > >> > > > > > > > > > >
>>>> > > > >> > > > > > > > > >
>>>> > > > >> > > > > > > > >
>>>> > > > >> > > > > > > >
>>>> > > > >> > > > > > >
>>>> > > > >> > > > > >
>>>> > > > >> > > > >
>>>> > > > >> > > >
>>>> > > > >> > >
>>>> > > > >> >
>>>> > > > >>
>>>> > > >
>>>> > >
>>>> >
>>>> org.apache.hadoop.hbase.regionserver.MultiVersionConsistencyControl.waitForPreviousTransactionsComplete(MultiVersionConsistencyControl.java:224)
>>>> > > > >> > > > > > > > > > >
>>>> > > > >> > > > > > > > > > >
>>>> > > > >> > > > > > > > > > >
>>>> > > > >> > > > > > > > > >
>>>> > > > >> > > > > > > > >
>>>> > > > >> > > > > > > >
>>>> > > > >> > > > > > >
>>>> > > > >> > > > > >
>>>> > > > >> > > > >
>>>> > > > >> > > >
>>>> > > > >> > >
>>>> > > > >> >
>>>> > > > >>
>>>> > > >
>>>> > >
>>>> >
>>>> org.apache.hadoop.hbase.regionserver.MultiVersionConsistencyControl.waitForPreviousTransactionsComplete(MultiVersionConsistencyControl.java:203)
>>>> > > > >> > > > > > > > > > >
>>>> > > > >> > > > > > > > > > >
>>>> > > > >> > > > > > > > >
>>>> > > > >> > > > > > >
>>>> > > > >> > > > >
>>>> > > > >> > >
>>>> > > > >>
>>>> > > >
>>>> >
>>>> org.apache.hadoop.hbase.regionserver.HRegion.increment(HRegion.java:6712)
>>>> > > > >> > > > > > > > > > >
>>>> > > > >> > > > > > > > > > >
>>>> > > > >> > > > > > > > > > >
>>>> > > > >> > > > > > > > > >
>>>> > > > >> > > > > > > > >
>>>> > > > >> > > > > > > >
>>>> > > > >> > > > > > >
>>>> > > > >> > > > > >
>>>> > > > >> > > > >
>>>> > > > >> > > >
>>>> > > > >> > >
>>>> > > > >> >
>>>> > > > >>
>>>> > > >
>>>> > >
>>>> >
>>>> org.apache.hadoop.hbase.regionserver.RSRpcServices.increment(RSRpcServices.java:501)
>>>> > > > >> > > > > > > > > > >
>>>> > > > >> > > > > > > > > > >
>>>> > > > >> > > > > > > > > > >
>>>> > > > >> > > > > > > > > >
>>>> > > > >> > > > > > > > >
>>>> > > > >> > > > > > > >
>>>> > > > >> > > > > > >
>>>> > > > >> > > > > >
>>>> > > > >> > > > >
>>>> > > > >> > > >
>>>> > > > >> > >
>>>> > > > >> >
>>>> > > > >>
>>>> > > >
>>>> > >
>>>> >
>>>> org.apache.hadoop.hbase.regionserver.RSRpcServices.doNonAtomicRegionMutation(RSRpcServices.java:570)
>>>> > > > >> > > > > > > > > > >
>>>> > > > >> > > > > > > > > > >
>>>> > > > >> > > > > > > > > > >
>>>> > > > >> > > > > > > > > >
>>>> > > > >> > > > > > > > >
>>>> > > > >> > > > > > > >
>>>> > > > >> > > > > > >
>>>> > > > >> > > > > >
>>>> > > > >> > > > >
>>>> > > > >> > > >
>>>> > > > >> > >
>>>> > > > >> >
>>>> > > > >>
>>>> > > >
>>>> > >
>>>> >
>>>> org.apache.hadoop.hbase.regionserver.RSRpcServices.multi(RSRpcServices.java:1901)
>>>> > > > >> > > > > > > > > > >
>>>> > > > >> > > > > > > > > > >
>>>> > > > >> > > > > > > > > > >
>>>> > > > >> > > > > > > > > >
>>>> > > > >> > > > > > > > >
>>>> > > > >> > > > > > > >
>>>> > > > >> > > > > > >
>>>> > > > >> > > > > >
>>>> > > > >> > > > >
>>>> > > > >> > > >
>>>> > > > >> > >
>>>> > > > >> >
>>>> > > > >>
>>>> > > >
>>>> > >
>>>> >
>>>> org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:31451)
>>>> > > > >> > > > > > > > > > >
>>>> > > > >> > > > > >
>>>> > > > org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2035)
>>>> > > > >> > > > > > > > > > >
>>>> > > > >> > > > > >
>>>> > > > org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:107)
>>>> > > > >> > > > > > > > > > >
>>>> > > > >> > > > > > > > > > >
>>>> > > > >> > > > > > > > > >
>>>> > > > >> > > > > > > > >
>>>> > > > >> > > > > > > >
>>>> > > > >> > > > > > >
>>>> > > > >> > > > > >
>>>> > > > >> > > > >
>>>> > > > >> > > >
>>>> > > > >> > >
>>>> > > > >> >
>>>> > > > >>
>>>> > > >
>>>> > >
>>>> >
>>>> org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:130)
>>>> > > > >> > > > > > > > > > >
>>>> > > > >> > > > > > > >
>>>> > > > >> > > >
>>>> > > >
>>>> org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:107)
>>>> > > > >> > > > > > > > > > >     java.lang.Thread.run(Thread.java:745)
>>>> > > > >> > > > > > > > > > >
>>>> > > > >> > > > > > > > > > > There are many similar threads in the
>>>> thread
>>>> > dump.
>>>> > > > >> > > > > > > > > > >
>>>> > > > >> > > > > > > > > > > I read the source code and I think this is
>>>> > caused
>>>> > > by
>>>> > > > >> > > changes
>>>> > > > >> > > > of
>>>> > > > >> > > > > > > > > > > MultiVersionConsistencyControl.
>>>> > > > >> > > > > > > > > > > A region lock (not a row lock) seems to
>>>> occur in
>>>> > > > >> > > > > > > > > > > waitForPreviousTransactionsComplete().
>>>> > > > >> > > > > > > > > > >
>>>> > > > >> > > > > > > > > > >
>>>> > > > >> > > > > > > > > > > Also we wrote performance test code for
>>>> > increment
>>>> > > > >> > operation
>>>> > > > >> > > > > that
>>>> > > > >> > > > > > > > > included
>>>> > > > >> > > > > > > > > > > 100 threads and ran it in local mode.
>>>> > > > >> > > > > > > > > > >
>>>> > > > >> > > > > > > > > > > The result is shown below:
>>>> > > > >> > > > > > > > > > >
>>>> > > > >> > > > > > > > > > > CDH5.3.1(HBase0.98.6)
>>>> > > > >> > > > > > > > > > > Throughput(op/s): 12757, Latency(ms):
>>>> > > > >> 7.975072509210629
>>>> > > > >> > > > > > > > > > >
>>>> > > > >> > > > > > > > > > > CDH5.4.5(HBase1.0.0)
>>>> > > > >> > > > > > > > > > > Throughput(op/s): 2027, Latency(ms):
>>>> > > > 49.11840157868772
>>>> > > > >> > > > > > > > > > >
>>>> > > > >> > > > > > > > > > >
>>>> > > > >> > > > > > > > > > > Thanks,
>>>> > > > >> > > > > > > > > > >
>>>> > > > >> > > > > > > > > > > Toshihiro Suzuki
>>>> > > > >> > > > > > > > > > >
>>>> > > > >> > > > > > > > > >
>>>> > > > >> > > > > > > > >
>>>> > > > >> > > > > > > >
>>>> > > > >> > > > > > >
>>>> > > > >> > > > > >
>>>> > > > >> > > > >
>>>> > > > >> > > >
>>>> > > > >> > >
>>>> > > > >> >
>>>> > > > >>
>>>> > > > >
>>>> > > >
>>>> > >
>>>> >
>>>>
>>>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message