hbase-dev mailing list archives

From lars hofhansl <la...@apache.org>
Subject Re: HBase reads, isolation levels and RegionScanner internal locking
Date Sun, 14 Sep 2014 05:10:54 GMT
Thanks Michael, for your "RDBMS school[ing]". (Did I mention I used to work at various RDBMS
companies before I came to HBase?)

Vladimir, to answer your question:
- HBase *always* locks a row for writes. Other writes to the same row will queue behind this lock.
- READ_[UN]COMMITTED here only refers to whether one can see the results of prior in-flight
MVCC transactions. It does not affect the need for the per-row write lock.
- The MVCC transactions in HBase are strictly serialized (which allows for a really simple
and elegant implementation that is valid as long as each individual transaction is short).
- READ_UNCOMMITTED will allow a client to see a partially updated row. It has no performance
benefit as such; you can just see the results of other transactions earlier.
- HBase also has various region-level internal (JVM level) read and write locks that never
outlive an RPC request, such as HRegion.lock and HRegion.updatesLock
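
To illustrate the visibility point in the bullets above, here is a minimal toy model in plain Java. It is not HBase code: the Cell class, the write numbers, the read point, and the visibleRow helper are all hypothetical names made up for this sketch. It assumes each cell version carries the sequence number of the transaction that wrote it, that a READ_COMMITTED reader only sees versions at or below its read point, and that a READ_UNCOMMITTED reader simply ignores the read point.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class MvccVisibilitySketch {
    enum Isolation { READ_COMMITTED, READ_UNCOMMITTED }

    // Hypothetical cell version: a column value tagged with the write number that produced it.
    static final class Cell {
        final String column;
        final String value;
        final long writeNumber;

        Cell(String column, String value, long writeNumber) {
            this.column = column;
            this.value = value;
            this.writeNumber = writeNumber;
        }
    }

    // Returns the newest visible value per column for the given isolation level.
    static Map<String, String> visibleRow(List<Cell> cells, long readPoint, Isolation level) {
        // Assumption of this sketch: READ_UNCOMMITTED means "no upper bound on write numbers".
        long effective = (level == Isolation.READ_UNCOMMITTED) ? Long.MAX_VALUE : readPoint;
        Map<String, Cell> newest = new TreeMap<>();
        for (Cell c : cells) {
            if (c.writeNumber > effective) {
                continue; // written by a transaction the reader may not see yet
            }
            Cell prev = newest.get(c.column);
            if (prev == null || prev.writeNumber < c.writeNumber) {
                newest.put(c.column, c);
            }
        }
        Map<String, String> row = new TreeMap<>();
        for (Cell c : newest.values()) {
            row.put(c.column, c.value);
        }
        return row;
    }

    // One row: transaction 5 wrote both columns; transaction 6 is in flight and has written only "a".
    static List<Cell> sampleCells() {
        return Arrays.asList(
            new Cell("a", "old", 5L),
            new Cell("b", "old", 5L),
            new Cell("a", "new", 6L));
    }

    public static void main(String[] args) {
        long readPoint = 5L; // last completed transaction in this toy model
        System.out.println(visibleRow(sampleCells(), readPoint, Isolation.READ_COMMITTED));
        System.out.println(visibleRow(sampleCells(), readPoint, Isolation.READ_UNCOMMITTED));
    }
}
```

In this model the READ_COMMITTED reader sees the row as transaction 5 left it, while the READ_UNCOMMITTED reader sees a mix of old and new columns, i.e. a partially updated row. Neither path takes a lock, which matches the point that the isolation level changes visibility rather than locking.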

I assume you are referring to the latter, the region-level locking...?

All updates (put, append, increment, delete, etc.) take a *read* lock on the updatesLock to
guard against concurrent flushes (which take out a write lock). You want this one.
Whenever a region operation is started we take out a read lock on HRegion.lock to guard against
concurrent bulk file operations on that region. This might be a lock we can remove with some
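
The updatesLock pattern described above (many updaters sharing a read lock, a flush taking the write lock) can be sketched with java.util.concurrent alone. This is only an illustration of the locking shape, not HRegion's actual code: the class name, the memstore queue, and the put/snapshotForFlush/runDemo methods are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class UpdatesLockSketch {
    private final ReentrantReadWriteLock updatesLock = new ReentrantReadWriteLock();
    // Hypothetical in-memory buffer of edits; swapped out on every flush.
    private Queue<String> memstore = new ConcurrentLinkedQueue<>();

    // Updates share the read lock, so they do not block each other at this level.
    void put(String edit) {
        updatesLock.readLock().lock();
        try {
            memstore.add(edit);
        } finally {
            updatesLock.readLock().unlock();
        }
    }

    // A flush takes the write lock: it waits for in-flight updates to finish and
    // holds off new ones only for the short time needed to swap the buffer.
    List<String> snapshotForFlush() {
        updatesLock.writeLock().lock();
        try {
            List<String> snapshot = new ArrayList<>(memstore);
            memstore = new ConcurrentLinkedQueue<>();
            return snapshot;
        } finally {
            updatesLock.writeLock().unlock();
        }
    }

    // Runs concurrent writers against a racing flush and returns the number of edits flushed.
    static int runDemo() {
        UpdatesLockSketch region = new UpdatesLockSketch();
        int writers = 4;
        int perWriter = 1000;
        Thread[] threads = new Thread[writers];
        for (int i = 0; i < writers; i++) {
            final int id = i;
            threads[i] = new Thread(() -> {
                for (int j = 0; j < perWriter; j++) {
                    region.put("w" + id + "-" + j);
                }
            });
            threads[i].start();
        }
        int total = region.snapshotForFlush().size(); // flush while writers are still running
        try {
            for (Thread t : threads) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        total += region.snapshotForFlush().size(); // final flush picks up the rest
        return total;
    }

    public static void main(String[] args) {
        System.out.println("edits flushed: " + runDemo());
    }
}
```

Because every edit is added under the read lock and the buffer is only swapped under the write lock, each edit lands in exactly one snapshot, so no update is lost or flushed twice even when a flush races with writers.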

HBase never locks a row for read. (It does take out some internal locks for the duration of
an RPC for internal management, but a row itself is never locked for read. And certainly not
across RPC requests.)

Does that make sense?

-- Lars

----- Original Message -----
From: Michael Segel <michael_segel@hotmail.com>
To: dev@hbase.apache.org
Sent: Friday, September 12, 2014 10:17 AM
Subject: Re: HBase reads, isolation levels and RegionScanner internal locking


I understand. 
However several of the HBase committers aren’t really schooled in RDBMS design.

And again, the older (going back to 0.23 ) use of the term RLL isn’t relational RLL, and
when you start to talk about isolation you’re getting into the RDBMS RLL.

So you really need to define what you mean when you say RLL. I don’t want to assume one
thing when you meant another. 

Just like talking about salts.  ;-) 

On Sep 12, 2014, at 5:53 PM, Vladimir Rodionov <vladrodionov@gmail.com> wrote:

> Michael, this is HBase developers mailing list.
> -Vladimir
> On Fri, Sep 12, 2014 at 12:08 AM, Michael Segel <michael_segel@hotmail.com>
> wrote:
>> Silly question…
>> HBase uses the term RLL (row level locking) to make the writes to a row
>> atomic.
>> When you start to get into isolation, RLL takes on a different meaning.
>> So now you have to better define what you mean by locking. Are you
>> talking about HBase RLL,
>> or are you talking about Transactional RLL ( RDBMS RLL) ?
>> On Sep 11, 2014, at 11:58 PM, Vladimir Rodionov <vladrodionov@gmail.com>
>> wrote:
>>> Hi, all
>>> We now have two isolation levels in Query (they used to be in Scan). See:
>>> https://issues.apache.org/jira/browse/HBASE-11936
>>> I moved the isolation level API from Scan up to the Query class. The reason:
>>> this API was not available for Get operations. The rationale? Improve
>>> performance of gets and multi-gets over the same region.
>>> As many of you are aware, RegionScannerImpl is heavily synchronized on the
>>> region's internal lock.  Now some questions:
>>> 1. Is it safe to bypass this locking (in next() call) in READ_UNCOMMITTED
>>> mode?
>>> We will do all necessary checks, of course, before calling nextRaw().
>>> 2. What was the reason for this locking in the first place for reads in
>>> READ_COMMITTED mode? Apart from the obvious (no dirty reads allowed), can someone
>>> tell me what else bad can happen?
>>> -Vladimir
