hbase-user mailing list archives

From lars hofhansl <lhofha...@yahoo.com>
Subject Re: Atomicity questions
Date Fri, 02 Dec 2011 00:23:36 GMT
ZK is mostly for orchestrating between the master and regionservers.



----- Original Message -----
From: Mohit Anchlia <mohitanchlia@gmail.com>
To: user@hbase.apache.org; lars hofhansl <lhofhansl@yahoo.com>
Cc: 
Sent: Thursday, December 1, 2011 3:57 PM
Subject: Re: Atomicity questions

Thanks, that makes it clearer. I also looked at the mvcc code as you pointed out.

So I am wondering where ZK is used specifically.

On Thu, Dec 1, 2011 at 3:37 PM, lars hofhansl <lhofhansl@yahoo.com> wrote:
> Nope, not using ZK, that would not scale down to the cell level.
> You'll probably have to stare at the code in MultiVersionConsistencyControl for a while (I know I had to).
>
> The basic flow of a write operation is this:
> 1. lock the row
>
> 2. persist change to the write ahead log
> 3. get a "writenumber" from mvcc (this is basically a timestamp)
>
> 4. apply change to the memstore (using that write number).
> 5. advance the readpoint (maximum timestamp of changes that reads will see) -- this is the point where readers see the change
> 6. unlock the row
>
> (7. when the memstore is full, flush it to a new disk file; this is done asynchronously and is not really important here, although it has some complicated implications when the flush happens while there are readers reading from an old read point)
>
>
> The above is relaxed sometimes for idempotent operations.
>
> -- Lars
>
>
> ----- Original Message -----
> From: Mohit Anchlia <mohitanchlia@gmail.com>
> To: user@hbase.apache.org; lars hofhansl <lhofhansl@yahoo.com>
> Cc:
> Sent: Thursday, December 1, 2011 3:03 PM
> Subject: Re: Atomicity questions
>
> Thanks. I'll try and take a look, but I haven't worked with zookeeper
> before. Does it use zookeeper for any of ACID functionality?
>
> On Thu, Dec 1, 2011 at 2:55 PM, lars hofhansl <lhofhansl@yahoo.com> wrote:
>> Hi Mohit,
>>
>> the best way to study this is to look at MultiVersionConsistencyControl.java (since you are asking how this is handled internally).
>>
>> In a nutshell, this ensures that read operations don't see writes that are not completed, by (1) defining a thread read point that is rolled forward only after a completed operation and (2) assigning a special timestamp (not the timestamp that you set from the client API) to all KeyValues.
>>
>> -- Lars
>>
>>
>> ----- Original Message -----
>> From: Mohit Anchlia <mohitanchlia@gmail.com>
>> To: user@hbase.apache.org
>> Cc:
>> Sent: Thursday, December 1, 2011 2:22 PM
>> Subject: Atomicity questions
>>
>> I have some questions about ACID after reading this page,
>> http://hbase.apache.org/acid-semantics.html
>>
>> - Atomicity point 5 : row must either be "a=1,b=1,c=1" or
>> "a=2,b=2,c=2" and must not be something like "a=1,b=2,c=1".
>>
>> How is this handled internally in hbase such that the above is guaranteed?
>>
>>
>
>
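
The six-step write flow and the read point described in this thread can be sketched roughly as follows. This is a minimal single-row model in Java, not HBase's actual MultiVersionConsistencyControl code: the class MvccRowSketch, its methods (stage, commit, putRow, getRow), and the in-memory list standing in for the write ahead log are all hypothetical names chosen for illustration. It assumes Java 9 or later (for Map.of) and that writes to the row are serialized by the row lock, so advancing the read point to the latest write number is safe.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical, simplified model of one row; NOT the real HBase implementation.
class MvccRowSketch {
    // A cell value tagged with the internal write number (not the client timestamp).
    private static final class Version {
        final long writeNumber;
        final String value;
        Version(long writeNumber, String value) {
            this.writeNumber = writeNumber;
            this.value = value;
        }
    }

    private final ReentrantLock rowLock = new ReentrantLock();
    private final List<String> writeAheadLog = new ArrayList<>(); // stand-in for the WAL
    private final Map<String, List<Version>> memstore = new ConcurrentHashMap<>();
    private long nextWriteNumber = 1;   // only touched while holding rowLock
    private volatile long readPoint = 0; // highest write number readers may see

    // Steps 1-4: the change is in the memstore but not yet visible to readers.
    long stage(Map<String, String> columns) {
        rowLock.lock();                               // 1. lock the row
        try {
            writeAheadLog.add(columns.toString());    // 2. persist to the (fake) WAL
            long writeNumber = nextWriteNumber++;     // 3. get a write number
            for (Map.Entry<String, String> c : columns.entrySet()) {
                memstore.computeIfAbsent(c.getKey(), k -> new CopyOnWriteArrayList<>())
                        .add(new Version(writeNumber, c.getValue())); // 4. apply to memstore
            }
            return writeNumber;
        } catch (RuntimeException e) {
            rowLock.unlock(); // sketch only: no rollback of partial memstore changes
            throw e;
        }
    }

    // Steps 5-6: must be called by the thread that called stage().
    void commit(long writeNumber) {
        readPoint = writeNumber;                      // 5. advance the read point
        rowLock.unlock();                             // 6. unlock the row
    }

    void putRow(Map<String, String> columns) {
        commit(stage(columns));
    }

    // Readers fix a read point once and ignore any version written after it.
    Map<String, String> getRow() {
        long snapshot = readPoint;
        Map<String, String> result = new TreeMap<>();
        for (Map.Entry<String, List<Version>> e : memstore.entrySet()) {
            String latest = null;
            for (Version v : e.getValue()) {          // versions are in write-number order
                if (v.writeNumber <= snapshot) {
                    latest = v.value;
                }
            }
            if (latest != null) {
                result.put(e.getKey(), latest);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        MvccRowSketch row = new MvccRowSketch();
        row.putRow(Map.of("a", "1", "b", "1", "c", "1"));
        long w = row.stage(Map.of("a", "2", "b", "2", "c", "2"));
        System.out.println(row.getRow()); // {a=1, b=1, c=1}: staged write is invisible
        row.commit(w);
        System.out.println(row.getRow()); // {a=2, b=2, c=2}
    }
}
```

Splitting the write into stage and commit is only there to make the intermediate state observable: between the two calls the new values already sit in the memstore, yet a reader still gets "a=1,b=1,c=1" because its read point has not moved. A mixed row such as "a=1,b=2,c=1" cannot be returned, since all cells of one write share a single write number.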

