Mailing-List: contact issues-help@hbase.apache.org; run by ezmlm
Date: Fri, 7 Dec 2012 02:59:21 +0000 (UTC)
From: "Andrew Purtell (JIRA)"
To: issues@hbase.apache.org
Subject: [jira] [Commented] (HBASE-7263) Investigate more fine grained locking for checkAndPut/append/increment


    [ https://issues.apache.org/jira/browse/HBASE-7263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13526100#comment-13526100 ]

Andrew Purtell commented on HBASE-7263:
---------------------------------------

I like this thinking.

Now is the time (0.96 "singularity") to make changes like dropping user row locks.

We need to manage against having too many major changes destabilizing code at once.
That said, I would vote for this one because removing or mitigating contention points will be increasingly important as some storage shifts away from spinning media.

> Investigate more fine grained locking for checkAndPut/append/increment
> ----------------------------------------------------------------------
>
>                 Key: HBASE-7263
>                 URL: https://issues.apache.org/jira/browse/HBASE-7263
>             Project: HBase
>          Issue Type: Improvement
>          Components: Transactions/MVCC
>            Reporter: Gregory Chanan
>            Assignee: Gregory Chanan
>            Priority: Minor
>
> HBASE-7051 lists 3 options for fixing an ACID-violation wrt checkAndPut:
> {quote}
> 1) Waiting for the MVCC to advance for read/updates: the downside is that you have to wait for updates on other rows.
> 2) Have an MVCC per-row (table configuration): this avoids the unnecessary contention of 1)
> 3) Transform the read/updates to write-only with rollup on read. E.g. an increment would just have the number of values to increment.
> {quote}
> HBASE-7051 and HBASE-4583 implement option #1. The downside, as mentioned, is that you have to wait for updates on other rows, since MVCC is per-region rather than per-row.
> Another option occurred to me that I think is worth investigating: rely on a row-level read/write lock rather than MVCC.
> Here is pseudo-code for what exists today for read/updates like checkAndPut:
> {code}
> (1)  Acquire RowLock
> (1a) BeginMVCC + Finish MVCC
> (2)  Begin MVCC
> (3)  Do work
> (4)  Release RowLock
> (5)  Append to WAL
> (6)  Finish MVCC
> {code}
> Write-only operations (e.g. puts) are the same, just without step 1a.
> Now, consider the following instead:
> {code}
> (1)  Acquire RowLock
> (1a) Grab+Release RowWriteLock (instead of BeginMVCC + Finish MVCC)
> (1b) Grab RowReadLock (new step!)
> (2)  Begin MVCC
> (3)  Do work
> (4)  Release RowLock
> (5)  Append to WAL
> (6)  Finish MVCC
> (7)  Release RowReadLock (new step!)
> {code}
> As before, write-only operations are the same, just without step 1a.
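
The second listing above (steps 1 through 7) can be sketched in plain Java as follows. This is only an illustration under stated assumptions, not HBase code: RowLockSketch, RowState, committed, increment and get are hypothetical names, the WAL append and MVCC begin are reduced to comments, and "Finish MVCC" is modelled as publishing the new value to a volatile field. The only library classes used are java.util.concurrent's ReentrantLock and ReentrantReadWriteLock.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class RowLockSketch {
    /** Hypothetical per-row state; not an HBase class. */
    static final class RowState {
        final ReentrantLock rowLock = new ReentrantLock();
        final ReentrantReadWriteLock rowRwLock = new ReentrantReadWriteLock();
        // Stand-in for the value that becomes readable once "MVCC" has finished.
        volatile long committed;
    }

    private final ConcurrentHashMap<String, RowState> rows = new ConcurrentHashMap<>();

    /** Read/update operation following steps (1) to (7) of the proposal. */
    public long increment(String row, long delta) {
        RowState s = rows.computeIfAbsent(row, r -> new RowState());
        boolean readHeld = false;
        s.rowLock.lock();                          // (1) Acquire RowLock
        try {
            long result;
            try {
                // (1a) Grab+Release RowWriteLock: blocks until every earlier
                // writer on this row has released its read lock in step (7).
                s.rowRwLock.writeLock().lock();
                s.rowRwLock.writeLock().unlock();
                // (1b) Grab RowReadLock: cannot block, because the write lock
                // is only ever held briefly under the row lock we hold now.
                s.rowRwLock.readLock().lock();
                readHeld = true;
                // (2) Begin MVCC: omitted in this sketch.
                result = s.committed + delta;      // (3) Do work
            } finally {
                s.rowLock.unlock();                // (4) Release RowLock
            }
            // (5) Append to WAL: omitted in this sketch.
            s.committed = result;                  // (6) Finish MVCC (publish)
            return result;
        } finally {
            if (readHeld) {
                s.rowRwLock.readLock().unlock();   // (7) Release RowReadLock
            }
        }
    }

    /** Returns the last published value for a row, or 0 if the row is unknown. */
    public long get(String row) {
        RowState s = rows.get(row);
        return s == null ? 0L : s.committed;
    }

    public static void main(String[] args) throws InterruptedException {
        RowLockSketch region = new RowLockSketch();
        Thread[] workers = new Thread[8];
        for (int i = 0; i < workers.length; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < 10_000; j++) {
                    region.increment("row-1", 1);
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            t.join();
        }
        // With step (1a) in place no increment should be lost.
        System.out.println(region.get("row-1"));
    }
}
```

In this sketch, removing the two writeLock() calls in step (1a) would let a second thread read the old value between another thread's steps (4) and (6), which is the lost-update case the proposal is meant to prevent. A write-only operation would follow the same path without step (1a).
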
> The difference here is that writes grab a row-level read lock and hold it until the MVCC is completed. The nice property that this gives you is that read/updates can tell when the MVCC is done on a per-row basis, because they can just try to acquire the write lock, which will block until the MVCC is completed for that row and the read lock is released in step 7.
> There is overhead for acquiring the read lock that I need to measure, but it should be small, since there will never be any blocking on acquiring the row-level read lock. This is because the read lock can only block if someone else holds the write lock, but both the write and read lock are only acquired under the row lock.
> I ran a quick test of this approach over a region (this directly interacts with HRegion, so no client effects):
> - 30 threads
> - 5000 increments per thread
> - 30 columns per increment
> - Each increment uniformly distributed over 500,000 rows
> - 5 trials
> Better-Than-Theoretical-Max: (No locking or MVCC on step 1a): 10362.2 ms
> Today: 13950 ms
> The locking approach: 10877 ms
> So it looks like an improvement, at least wrt increment. As mentioned, I need to measure the overhead of acquiring the read lock for puts.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira