hbase-user mailing list archives

From Jean-Daniel Cryans <jdcry...@apache.org>
Subject Re: When does a row become highly available?
Date Fri, 11 Dec 2009 18:47:33 GMT
On Fri, Dec 11, 2009 at 10:35 AM, Seth Ladd <sethladd@gmail.com> wrote:
>> You are talking about durability, not HA.
>
> Good point, thanks.  I meant HA for the data, but data durability
> makes more sense.
>
>> To have a better understanding I recommend reading our architecture
>> page http://wiki.apache.org/hadoop/Hbase/HbaseArchitecture and the
>> Bigtable paper.
>
> Thanks, I've been studying that today.
>
>> In short, when you write a row it goes into the write-ahead-log and
>> then right after that in MemStore. Once the MemStore is full (64MB) or
>> for some other reasons, it is flushed to disk where the file is
>> replicated (transparently).
>
> Each RegionStore has its own WAL, yes?  From the Architecture page:

Each Region Server has its own WAL. I don't know what RegionStore is ;)

>
> When a write request is received, it is first written to a write-ahead
> log called a HLog. All write requests for every region the region
> server is serving are written to the same log. Once the request has
> been written to the HLog, it is stored in an in-memory cache called
> the Memcache. There is one Memcache for each HStore.
>
> Which confuses me, if the write goes straight to a RegionServer, but
> then the RegionServer fails before the MemStore is flushed, did I just
> lose data?

No, preventing that is the goal of the write-ahead log (WAL).

>
>> If the node fails, the Master will process the WAL so that you don't
>
> So do all writes go through the Master?  Clearly I'm a bit confused here :)

No. The Region Server logs every write in its WAL. If the machine
fails, whatever is in that WAL is replayed by the Master, because the
Master is the one that notices the failure. It then redistributes the
parts of the WAL to the other region servers that get assigned the
regions from the dead node.
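
To make the sequence concrete, here is a toy sketch in Python. It is
not HBase code: RegionServer, split_wal and recover are hypothetical
names made up for illustration, and the sketch simply assumes the log
survives the crash on durable storage.

```python
from collections import defaultdict


class RegionServer:
    """Toy model: one WAL shared by all regions, one MemStore per region."""

    def __init__(self, name):
        self.name = name
        self.wal = []                       # append-only log for every region
        self.memstores = defaultdict(dict)  # region -> {row: value}, memory only

    def put(self, region, row, value):
        self.wal.append((region, row, value))  # 1. log the edit first
        self.memstores[region][row] = value    # 2. then apply it in memory


def split_wal(wal):
    """Group a dead server's log entries by region, keeping their order."""
    per_region = defaultdict(list)
    for region, row, value in wal:
        per_region[region].append((row, value))
    return per_region


def recover(dead_wal, assignment):
    """Replay each region's edits on the server now assigned that region."""
    for region, edits in split_wal(dead_wal).items():
        target = assignment[region]
        for row, value in edits:
            target.put(region, row, value)


rs1 = RegionServer("rs1")
rs1.put("region-a", "row1", "v1")
rs1.put("region-b", "row9", "v2")
rs1.put("region-a", "row1", "v3")

surviving_wal = list(rs1.wal)  # assumption: the log is on durable storage
rs1.memstores.clear()          # crash: unflushed MemStore contents are gone

rs2, rs3 = RegionServer("rs2"), RegionServer("rs3")
recover(surviving_wal, {"region-a": rs2, "region-b": rs3})

print(rs2.memstores["region-a"])  # {'row1': 'v3'}
print(rs3.memstores["region-b"])  # {'row9': 'v2'}
```

Note that the client only ever talks to a region server in put(); the
recovery path is the only place where something other than the region
server touches the log.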

>
>> lose rows in the MemStore. Prior to Hadoop 0.21 (unreleased), the
>
> Moral of the story is to upgrade to 0.21 ASAP. :)

Yes.

>
> Thanks!
>
> Seth
>
