Mailing-List: contact issues-help@hbase.apache.org; run by ezmlm
Precedence: bulk
Date: Fri, 7 Dec 2012 02:59:21 +0000 (UTC)
From: "Andrew Purtell (JIRA)" <jira@apache.org>
To: issues@hbase.apache.org
Message-ID: <JIRA.12618620.1354563221481.1434.1354849161869@arcas>
In-Reply-To: <JIRA.12618620.1354563221481@arcas>
References: <JIRA.12618620.1354563221481@arcas>
Subject: [jira] [Commented] (HBASE-7263) Investigate more fine grained
 locking for checkAndPut/append/increment
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: quoted-printable


    [ https://issues.apache.org/jira/browse/HBASE-7263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13526100#comment-13526100 ]

Andrew Purtell commented on HBASE-7263:
---------------------------------------

I like this thinking.

Now is the time (0.96 "singularity") to make changes like dropping user row locks.

We need to guard against too many major changes destabilizing the code at once. That said, I would vote for this one because removing or mitigating contention points will be increasingly important as some storage shifts away from spinning media.
> Investigate more fine grained locking for checkAndPut/append/increment
> ----------------------------------------------------------------------
>
>                 Key: HBASE-7263
>                 URL: https://issues.apache.org/jira/browse/HBASE-7263
>             Project: HBase
>          Issue Type: Improvement
>          Components: Transactions/MVCC
>            Reporter: Gregory Chanan
>            Assignee: Gregory Chanan
>            Priority: Minor
>
> HBASE-7051 lists 3 options for fixing an ACID-violation wrt checkAndPut:
> {quote}
> 1) Waiting for the MVCC to advance for read/updates: the downside is that you have to wait for updates on other rows.
> 2) Have an MVCC per-row (table configuration): this avoids the unnecessary contention of 1)
> 3) Transform the read/updates to write-only with rollup on read.. E.g. an increment would just have the number of values to increment.
> {quote}
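To make option 3 concrete, here is a minimal sketch of "write-only with rollup on read". The class name and the in-memory queue are assumptions made for illustration, not HBase code: an increment only appends a delta, and the current value is computed when someone reads.

```java
import java.util.concurrent.ConcurrentLinkedQueue;

/** Hypothetical illustration of option 3; not an HBase class. */
class RollupCounterSketch {
    // Each increment is recorded as an independent write; nothing is read first.
    private final ConcurrentLinkedQueue<Long> deltas = new ConcurrentLinkedQueue<>();

    /** Write-only increment: append the delta, no read-modify-write cycle. */
    void increment(long delta) {
        deltas.add(delta);
    }

    /** Rollup on read: sum every delta recorded so far. */
    long read() {
        long sum = 0;
        for (long d : deltas) {
            sum += d;
        }
        return sum;
    }
}
```

The trade-off in this sketch is that reads get more expensive as deltas accumulate, so a real implementation would presumably need to compact them at some point.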
> HBASE-7051 and HBASE-4583 implement option #1.  The downside, as mentioned, is that you have to wait for updates on other rows, since MVCC is per-region rather than per-row.
> Another option occurred to me that I think is worth investigating: rely on a row-level read/write lock rather than MVCC.
> Here is pseudo-code for what exists today for read/updates like checkAndPut:
> {code}
> (1)  Acquire RowLock
> (1a) BeginMVCC + Finish MVCC
> (2)  Begin MVCC
> (3)  Do work
> (4)  Release RowLock
> (5)  Append to WAL
> (6)  Finish MVCC
> {code}
> Write-only operations (e.g. puts) are the same, just without step 1a.
> Now, consider the following instead:
> {code}
> (1)  Acquire RowLock
> (1a) Grab+Release RowWriteLock (instead of BeginMVCC + Finish MVCC)
> (1b) Grab RowReadLock (new step!)
> (2)  Begin MVCC
> (3)  Do work
> (4)  Release RowLock
> (5)  Append to WAL
> (6)  Finish MVCC
> (7)  Release RowReadLock (new step!)
> {code}
> As before, write-only operations are the same, just without step 1a.
> The difference here is that writes grab a row-level read lock and hold it until the MVCC is completed.  The nice property that this gives you is that read/updates can tell when the MVCC is done on a per-row basis, because they can just try to acquire the write lock, which will block until the MVCC is completed for that row and the read lock is released in step 7.
> There is overhead for acquiring the read lock that I need to measure, but it should be small, since there will never be any blocking on acquiring the row-level read lock.  This is because the read lock can only block if someone else holds the write lock, but both the write and read lock are only acquired under the row lock.
> I ran a quick test of this approach over a region (this directly interacts with HRegion, so no client effects):
> - 30 threads
> - 5000 increments per thread
> - 30 columns per increment
> - Each increment uniformly distributed over 500,000 rows
> - 5 trials
> Better-Than-Theoretical-Max: (No locking or MVCC on step 1a): 10362.2 ms
> Today: 13950 ms
> The locking approach: 10877 ms
> So it looks like an improvement, at least wrt increment.  As mentioned, I need to measure the overhead of acquiring the read lock for puts.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira