From: Bryan Beaudreault <bbeaudreault@hubspot.com>
Date: Mon, 30 Nov 2015 23:11:49 +0000
Subject: Re: Performance degradation between CDH5.3.1(HBase0.98.6) and CDH5.4.5(HBase1.0.0)
To: HBase Dev List
Cc: Hbase-User

Sorry, the second link should be
https://gist.github.com/bbeaudreault/2994a748da83d9f75085#file-gistfile1-txt-L579

On Mon, Nov 30, 2015 at 6:10 PM Bryan Beaudreault wrote:

> https://gist.github.com/bbeaudreault/2994a748da83d9f75085
>
> An active handler:
> https://gist.github.com/bbeaudreault/2994a748da83d9f75085#file-gistfile1-txt-L286
> One that is locked:
> https://git.hubteam.com/gist/jwilliams/80f37999bfdf55119588#file-gistfile1-txt-L579
>
> The difference between pre-rollback and post is that previously we were
> seeing things blocked in mvcc. Now we are seeing them blocked on the
> upsert.
>
> It always follows the same pattern: one active handler in the upsert and
> the rest blocked waiting for it.

On Mon, Nov 30, 2015 at 6:05 PM Stack wrote:

> On Mon, Nov 30, 2015 at 2:31 PM, Bryan Beaudreault
> <bbeaudreault@hubspot.com> wrote:
>
>> The rollback seems to have mostly solved the issue for one of our
>> clusters, but another one is still seeing long increment times:
>>
>> "slowIncrementCount": 52080,
>> "Increment_num_ops": 325236, "Increment_min": 1, "Increment_max": 6162,
>> "Increment_mean": 465.68678129112396, "Increment_median": 216,
>> "Increment_75th_percentile": 450.25,
>> "Increment_95th_percentile": 1052.6499999999999,
>> "Increment_99th_percentile": 1635.2399999999998
>>
>> Any ideas if there are other changes that may be causing a performance
>> regression for increments between CDH4.7.1 and CDH5.3.8?
>
> No.
>
> Post a thread dump Bryan and it might prompt something.
>
> St.Ack

On Mon, Nov 30, 2015 at 4:13 PM Stack wrote:

> On Mon, Nov 30, 2015 at 12:54 PM, Bryan Beaudreault
> <bbeaudreault@hubspot.com> wrote:
>
>> Should this be added as a known issue in the CDH or hbase
>> documentation? It was a severe performance hit for us; all of our
>> regionservers were sitting at a few thousand queued requests.
>
> Let me take care of that.
> St.Ack

On Mon, Nov 30, 2015 at 3:53 PM Bryan Beaudreault wrote:

> Yea, they are all over the place and called from client and coprocessor
> code. We ended up having no other option but to roll back, and aside
> from a few NoSuchMethodErrors due to API changes (Put#add vs
> Put#addColumn), it seems to be working and fixing our problem.
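For reference, the Put#add vs Put#addColumn rename behind those
NoSuchMethodErrors, as a minimal sketch (the class name and the
row/family/qualifier values here are placeholders):

    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.util.Bytes;

    public class PutApiChange {
      public static void main(String[] args) {
        Put p = new Put(Bytes.toBytes("row1"));
        // HBase 1.0 style -- what code compiled against CDH5.4 calls:
        p.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("q"), Bytes.toBytes("v"));
        // HBase 0.98 style -- addColumn does not exist there, so 1.0-compiled
        // callers fail with NoSuchMethodError after rolling the jars back:
        p.add(Bytes.toBytes("cf"), Bytes.toBytes("q"), Bytes.toBytes("v"));
      }
    }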
On Mon, Nov 30, 2015 at 3:47 PM Stack wrote:

> Rollback is untested. No fix in 5.5. I was going to work on this now.
> Where are your counters Bryan? In their own column family or scattered
> about in a row with other Cell types?
> St.Ack

On Mon, Nov 30, 2015 at 10:28 AM, Bryan Beaudreault
<bbeaudreault@hubspot.com> wrote:

> Is there any update to this? We just upgraded all of our production
> clusters from CDH4 to CDH5.4.7 and, not seeing this JIRA listed in the
> known issues, did not know about it. Now we are seeing performance
> issues across all clusters, as we make heavy use of increments.
>
> Can we roll forward to CDH5.5 to fix? Or is our only hope to roll back
> to CDH5.3.1 (if that is possible)?

On Thu, Sep 24, 2015 at 5:06 AM Toshihiro Suzuki wrote:

> Thank you St.Ack!
>
> I would like to follow the ticket.
>
> Toshihiro Suzuki

2015-09-22 14:14 GMT+09:00 Stack:

> Back to this problem. Simple tests confirm that, as is, the
> single-queue-backed MVCC instance can slow Region ops if some other row
> is slow to complete. In particular, Increment, checkAndPut, and batch
> mutations are affected. I opened HBASE-14460 to start in on a fix. Let's
> see if we can somehow scope mvcc to a row, or at least shard mvcc, so
> not all Region ops are paused.
>
> St.Ack
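To make the "shard mvcc" idea concrete, a purely hypothetical sketch (not
the HBASE-14460 patch; the class and types are stand-ins). Rows hash to one
of N independent queues, so a slow transaction only stalls writers whose
rows land in the same shard:

    import java.util.Arrays;
    import java.util.LinkedList;

    public class ShardedMvccSketch {
      // One independent write queue per shard, instead of a single
      // region-wide queue that serializes every write in the region.
      private final LinkedList<Object>[] queues;

      @SuppressWarnings("unchecked")
      public ShardedMvccSketch(int shardCount) {
        queues = new LinkedList[shardCount];
        for (int i = 0; i < shardCount; i++) {
          queues[i] = new LinkedList<>();
        }
      }

      // All MVCC waits for a row go through its shard's queue only.
      LinkedList<Object> queueFor(byte[] row) {
        return queues[(Arrays.hashCode(row) & 0x7fffffff) % queues.length];
      }
    }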
On Mon, Sep 14, 2015 at 4:15 AM, Toshihiro Suzuki wrote:

>> Thank you for the below reasoning (with accompanying helpful diagram).
>> Makes sense. Let me hack up a test case to help with the illustration.
>> It is as though the mvcc should be scoped to a row only... Writes
>> against other rows should not hold up my read of my row. Tag an mvcc
>> with a 'row' scope so we can see which on-going writes pertain to the
>> current operation?
>
> Thank you St.Ack! I think this approach would work.
>
>> You need to read back the increment and have it be 'correct' at
>> increment time?
>
> Yes, we need it.
>
> I would like to help if there is anything I can do.
>
> Thanks,
> Toshihiro Suzuki

2015-09-13 14:11 GMT+09:00 Stack:

> Thank you for the below reasoning (with accompanying helpful diagram).
> Makes sense. Let me hack up a test case to help with the illustration.
> It is as though the mvcc should be scoped to a row only... Writes
> against other rows should not hold up my read of my row. Tag an mvcc
> with a 'row' scope so we can see which on-going writes pertain to the
> current operation?
>
> You need to read back the increment and have it be 'correct' at
> increment time?
>
> (This is a good one)
>
> Thank you, Toshihiro Suzuki
> St.Ack

On Sat, Sep 12, 2015 at 8:09 AM, Toshihiro Suzuki <brfrn169@gmail.com>
wrote:

> St.Ack,
>
> Thank you for your response.
>
> The reason I concluded that "A region lock (not a row lock) seems to
> occur in waitForPreviousTransactionsComplete()" is as follows.
>
> An increment operation has three MVCC procedures:
>
> 1. mvcc.waitForPreviousTransactionsComplete();
> https://github.com/cloudera/hbase/blob/cdh5.4.5-release/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java#L6712
>
> 2. w = mvcc.beginMemstoreInsertWithSeqNum(mvccNum);
> https://github.com/cloudera/hbase/blob/cdh5.4.5-release/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java#L6721
>
> 3. mvcc.completeMemstoreInsertWithSeqNum(w, walKey);
> https://github.com/cloudera/hbase/blob/cdh5.4.5-release/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java#L6893
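Pulled together, the increment path above has roughly this shape (a
stand-in skeleton with simplified types, not the real HRegion code; the
Get, memstore write, and WAL append between the MVCC calls are elided):

    final class IncrementShape {
      interface Mvcc {
        void waitForPreviousTransactionsComplete();       // step 1
        // step 2; returns a WriteEntry in the real class:
        long beginMemstoreInsertWithSeqNum(long mvccNum);
        void completeMemstoreInsertWithSeqNum(long w, Object walKey); // step 3
      }

      static void increment(Mvcc mvcc, long mvccNum, Object walKey) {
        mvcc.waitForPreviousTransactionsComplete(); // blocks on the shared writeQueue
        long w = mvcc.beginMemstoreInsertWithSeqNum(mvccNum);
        // ... read current cells, build the incremented cells, write them ...
        mvcc.completeMemstoreInsertWithSeqNum(w, walKey);
      }
    }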
> I think that MultiVersionConsistencyControl's writeQueue can cause a
> region lock:
> https://github.com/cloudera/hbase/blob/cdh5.4.5-release/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MultiVersionConsistencyControl.java#L42-L43
>
> Step 2 adds a WriteEntry to writeQueue:
> https://github.com/cloudera/hbase/blob/cdh5.4.5-release/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MultiVersionConsistencyControl.java#L102-L108
>
> Step 3 removes the WriteEntry from writeQueue:
>
> completeMemstoreInsertWithSeqNum(w, walKey) ->
> waitForPreviousTransactionsComplete(e) -> advanceMemstore(w)
>
> https://github.com/cloudera/hbase/blob/cdh5.4.5-release/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MultiVersionConsistencyControl.java#L127
> https://github.com/cloudera/hbase/blob/cdh5.4.5-release/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MultiVersionConsistencyControl.java#L235
> https://github.com/cloudera/hbase/blob/cdh5.4.5-release/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MultiVersionConsistencyControl.java#L160
>
> Step 1 adds a WriteEntry w in beginMemstoreInsert() to writeQueue and
> waits until writeQueue is empty or writeQueue.getFirst() == w:
> https://github.com/cloudera/hbase/blob/cdh5.4.5-release/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MultiVersionConsistencyControl.java#L201-L204
> https://github.com/cloudera/hbase/blob/cdh5.4.5-release/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MultiVersionConsistencyControl.java#L206-L241
>
> I think when a handler thread is processing between step 2 and step 3,
> the other handler threads can wait at step 1 until the thread completes
> step 3. This is depicted as follows:
> https://gist.githubusercontent.com/brfrn169/cb4f2c157129330cd932/raw/86d6aae5667b0fe006b16fed80f1b0c4945c7fd0/region_lock.png
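As an illustration of the queue discipline described above, a minimal
stand-in model (not the HBase class, though the method names mirror it): a
single shared FIFO where every write enqueues an entry, and the step 1 wait
blocks until that entry reaches the head, i.e. until every earlier write in
the region, for any row, has completed.

    import java.util.LinkedList;

    public class WriteQueueModel {
      static final class WriteEntry {
        boolean completed = false;
      }

      private final LinkedList<WriteEntry> writeQueue = new LinkedList<>();

      // Enqueue a new entry for this write (model of beginMemstoreInsert).
      public WriteEntry beginMemstoreInsert() {
        synchronized (writeQueue) {
          WriteEntry e = new WriteEntry();
          writeQueue.add(e);
          return e;
        }
      }

      // Block until w is at the head of the queue, i.e. all earlier
      // transactions (for ANY row in the region) have completed.
      public void waitForPreviousTransactionsComplete(WriteEntry w)
          throws InterruptedException {
        synchronized (writeQueue) {
          while (!writeQueue.isEmpty() && writeQueue.getFirst() != w) {
            writeQueue.wait(); // handlers park here regardless of which row they touch
          }
        }
        advanceMemstore(w);
      }

      // Mark w complete and drop completed entries off the head of the queue.
      public void advanceMemstore(WriteEntry w) {
        synchronized (writeQueue) {
          w.completed = true;
          while (!writeQueue.isEmpty() && writeQueue.getFirst().completed) {
            writeQueue.removeFirst();
          }
          writeQueue.notifyAll();
        }
      }
    }

A small harness showing the resulting stall appears below in the thread.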
> Actually, in the thread dump of our region server, many handler threads
> (RW.default.writeRpcServer.handler) wait at Step 1
> (waitForPreviousTransactionsComplete()):
> https://gist.githubusercontent.com/brfrn169/cb4f2c157129330cd932/raw/86d6aae5667b0fe006b16fed80f1b0c4945c7fd0/thread_dump.txt
>
> Many handler threads wait at this:
> https://github.com/cloudera/hbase/blob/cdh5.4.5-release/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MultiVersionConsistencyControl.java#L224
>
>> Is it possible you are contending on a counter post-upgrade? Is it
>> possible that all these threads are trying to get to the same row to
>> update it? Could the app behavior have changed? Or are you thinking
>> increment itself has slowed significantly?
>
> We have just upgraded HBase, not changed the app behavior. We are
> thinking increment itself has slowed significantly. Before upgrading
> HBase, throughput and latency were good. Currently, to cope with this
> problem, we split the regions finely.
>
> Thanks,
>
> Toshihiro Suzuki

2015-09-09 15:29 GMT+09:00 Stack:

> On Tue, Sep 8, 2015 at 10:22 PM, Toshihiro Suzuki <brfrn169@gmail.com>
> wrote:
>
>> Ted,
>>
>> Thank you for your response.
>>
>> I uploaded the complete stack trace to Gist:
>> https://gist.github.com/brfrn169/cb4f2c157129330cd932
>>
>> I think that the increment operation works as follows:
>>
>> 1. get row lock
>> 2. mvcc.waitForPreviousTransactionsComplete() // wait for all prior
>>    MVCC transactions to finish
>> 3. mvcc.beginMemstoreInsertWithSeqNum() // start a transaction
>> 4. get previous values
>> 5. create KVs
>> 6. write to Memstore
>> 7. write to WAL
>> 8. release row lock
>> 9. mvcc.completeMemstoreInsertWithSeqNum() // complete the transaction
>>
>> An instance of MultiVersionConsistencyControl has a pending queue of
>> writes named writeQueue.
>> Step 2 puts a WriteEntry w to writeQueue and waits until writeQueue is
>> empty or writeQueue.getFirst() == w.
>> Step 3 puts a WriteEntry to writeQueue and step 9 removes the
>> WriteEntry from writeQueue.
>>
>> I think that when a handler thread is processing between step 2 and
>> step 9, the other handler threads can wait until the thread completes
>> step 9.
>>
>> Thanks,
>> Toshihiro Suzuki
>
> That is right. We need to read, after all outstanding updates are
> done... because we need to read the latest update before we go to
> modify/increment it.
>
> How do you make that out?
>
> "A region lock (not a row lock) seems to occur in
> waitForPreviousTransactionsComplete()."
>
> In 0.98.x we did this:
>
> mvcc.completeMemstoreInsert(mvcc.beginMemstoreInsert());
>
> ... and in 1.0 we do this:
>
> mvcc.waitForPreviousTransactionsComplete(), which is this:
>
> +  public void waitForPreviousTransactionsComplete() {
> +    WriteEntry w = beginMemstoreInsert();
> +    waitForPreviousTransactionsComplete(w);
> +  }
>
> The mvcc and region sequenceid were merged in 1.0
> (https://issues.apache.org/jira/browse/HBASE-8763). Previously, mvcc and
> region sequenceid would spin independent of each other. Perhaps this is
> responsible for some of the slowdown.
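The stall being described can be reproduced against the stand-in
WriteQueueModel sketched earlier in the thread (a hypothetical harness; the
one-second sleep stands in for a slow transaction such as a pending WAL
sync):

    public class StallDemo {
      public static void main(String[] args) throws InterruptedException {
        WriteQueueModel mvcc = new WriteQueueModel();
        // Handler A begins an insert and does not complete it yet.
        WriteQueueModel.WriteEntry slow = mvcc.beginMemstoreInsert();

        Thread b = new Thread(() -> {
          try {
            // Handler B works on a DIFFERENT row, yet still parks behind A.
            WriteQueueModel.WriteEntry w = mvcc.beginMemstoreInsert();
            mvcc.waitForPreviousTransactionsComplete(w);
            System.out.println("B unblocked");
          } catch (InterruptedException ignored) {
          }
        });
        b.start();

        Thread.sleep(1000);         // B stays parked for the whole second
        mvcc.advanceMemstore(slow); // A completes; the queue drains; B proceeds
        b.join();
      }
    }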
> That said, looking in your thread dump, we seem to be down in the Get.
> If you do a bunch of thread dumps in a row, where is the lock-holding
> thread? In Get or writing Increment... or waiting on sequence id?
>
> Is it possible you are contending on a counter post-upgrade? Is it
> possible that all these threads are trying to get to the same row to
> update it? Could the app behavior have changed? Or are you thinking
> increment itself has slowed significantly?
>
> St.Ack

2015-09-09 0:05 GMT+09:00 Ted Yu <yuzhihong@gmail.com>:

> In HRegion#increment(), we lock the row (not the region):
>
>   try {
>     rowLock = getRowLock(row);
>
> Can you pastebin the complete stack trace?
>
> Thanks
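For contrast with the region-wide writeQueue, a stand-in model of what
per-row locking provides (again a sketch, not the HBase implementation):
increments on different rows never contend here, which is why the observed
pile-up points at the shared MVCC wait taken after the row lock is
acquired.

    import java.nio.charset.StandardCharsets;
    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.locks.ReentrantLock;

    public class RowLocks {
      // One lock per row key; only writers of the SAME row contend here.
      private final ConcurrentHashMap<String, ReentrantLock> locks =
          new ConcurrentHashMap<>();

      public ReentrantLock getRowLock(byte[] row) {
        ReentrantLock lock = locks.computeIfAbsent(
            new String(row, StandardCharsets.UTF_8), k -> new ReentrantLock());
        lock.lock();
        return lock;
      }

      // Callers pair getRowLock() with release in a finally block.
      public void releaseRowLock(ReentrantLock lock) {
        lock.unlock();
      }
    }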
On Tue, Sep 8, 2015 at 2:01 AM, Toshihiro Suzuki <brfrn169@gmail.com>
wrote:

> Hi,
>
> We upgraded our cluster from CDH5.3.1 (HBase 0.98.6) to CDH5.4.5
> (HBase 1.0.0) and we are experiencing a slowdown in the increment
> operation.
>
> Here's an extract from a thread dump of the RegionServer of our cluster:
>
> Thread 68 (RW.default.writeRpcServer.handler=15,queue=5,port=60020):
>   State: BLOCKED
>   Blocked count: 21689888
>   Waited count: 39828360
>   Blocked on java.util.LinkedList@3474e4b2
>   Blocked by 63 (RW.default.writeRpcServer.handler=10,queue=0,port=60020)
>   Stack:
>     java.lang.Object.wait(Native Method)
>     org.apache.hadoop.hbase.regionserver.MultiVersionConsistencyControl.waitForPreviousTransactionsComplete(MultiVersionConsistencyControl.java:224)
>     org.apache.hadoop.hbase.regionserver.MultiVersionConsistencyControl.waitForPreviousTransactionsComplete(MultiVersionConsistencyControl.java:203)
>     org.apache.hadoop.hbase.regionserver.HRegion.increment(HRegion.java:6712)
>     org.apache.hadoop.hbase.regionserver.RSRpcServices.increment(RSRpcServices.java:501)
>     org.apache.hadoop.hbase.regionserver.RSRpcServices.doNonAtomicRegionMutation(RSRpcServices.java:570)
>     org.apache.hadoop.hbase.regionserver.RSRpcServices.multi(RSRpcServices.java:1901)
>     org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:31451)
>     org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2035)
>     org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:107)
>     org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:130)
>     org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:107)
>     java.lang.Thread.run(Thread.java:745)
>
> There are many similar threads in the thread dump.
>
> I read the source code and I think this is caused by changes to
> MultiVersionConsistencyControl. A region lock (not a row lock) seems to
> occur in waitForPreviousTransactionsComplete().
>
> Also, we wrote performance test code for the increment operation that
> used 100 threads and ran it in local mode.
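The original test code was not posted; a minimal sketch of a test with that
shape might look like the following (table, family, and qualifier names are
placeholders; the HBase 1.0 client API is assumed):

    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.TimeUnit;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.Connection;
    import org.apache.hadoop.hbase.client.ConnectionFactory;
    import org.apache.hadoop.hbase.client.Table;
    import org.apache.hadoop.hbase.util.Bytes;

    public class IncrementBench {
      public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        int threads = 100, perThread = 1000;
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        long start = System.currentTimeMillis();
        try (Connection conn = ConnectionFactory.createConnection(conf)) {
          for (int t = 0; t < threads; t++) {
            final int id = t;
            pool.submit(() -> {
              // Each thread increments its own counter row as fast as it can.
              try (Table table = conn.getTable(TableName.valueOf("bench"))) {
                byte[] row = Bytes.toBytes("row-" + id);
                for (int i = 0; i < perThread; i++) {
                  table.incrementColumnValue(
                      row, Bytes.toBytes("f"), Bytes.toBytes("q"), 1L);
                }
              } catch (Exception e) {
                e.printStackTrace();
              }
            });
          }
          pool.shutdown();
          pool.awaitTermination(1, TimeUnit.HOURS);
        }
        long ms = System.currentTimeMillis() - start;
        System.out.printf("Throughput(op/s): %.0f%n",
            threads * perThread * 1000.0 / ms);
      }
    }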
> The results are shown below:
>
> CDH5.3.1 (HBase 0.98.6)
> Throughput (op/s): 12757, Latency (ms): 7.975072509210629
>
> CDH5.4.5 (HBase 1.0.0)
> Throughput (op/s): 2027, Latency (ms): 49.11840157868772
>
> Thanks,
>
> Toshihiro Suzuki