hbase-user mailing list archives

From Seth Ladd <sethl...@gmail.com>
Subject Re: When does a row become highly available?
Date Fri, 11 Dec 2009 21:16:20 GMT
Thanks for the open and informative reply. Looking forward to testing 0.21 when available!

On Dec 11, 2009, at 11:36 AM, Andrew Purtell <apurtell@apache.org> wrote:

> Currently HDFS does not guarantee that a write is fully replicated before
> a sync() call completes. The problem is that the write appears to complete
> from the client's perspective -- HBase completes the write RPC -- when
> really it should block for some further period of time. The client gets no
> failure indication when it should, so it cannot know it must retry the
> write. There are configuration options which can narrow this window, but
> until HDFS has a working sync() they cannot close it shut tight.
>
> HBase is a "special" client of HDFS in many respects, so while this is
> obviously really important for us, it is not so for the majority of HDFS
> users, who run mapreduce jobs on it. For them, HDFS-level failures leading
> to data loss result in task retries and recreation of any lost temporary
> data, no harm done. So this fix has been some time coming. Getting a
> working sync() in Hadoop 0.21 is finally going to happen for us.
>
>   - Andy
>
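The window Andy describes can be sketched with a toy model. This is plain Java with invented names (ToyPipeline, writeAndSync, and friends are illustrative, not HDFS APIs): the client's sync() returns as soon as one replica holds the bytes, so losing that node before replication finishes silently loses data the client believes is durable.

```java
// Toy model of the failure window: the write "completes" (the RPC
// returns) once the first datanode has the block, but full replication
// happens later. A failure inside that window loses data without the
// client ever seeing an error it could retry on.
import java.util.ArrayList;
import java.util.List;

class ToyPipeline {
    final List<List<String>> datanodes = new ArrayList<>();
    final int replication;

    ToyPipeline(int nodes, int replication) {
        for (int i = 0; i < nodes; i++) datanodes.add(new ArrayList<>());
        this.replication = replication;
    }

    /** Returns as soon as ONE replica holds the block -- like a sync()
     *  that does not wait for the full pipeline. */
    void writeAndSync(String block) {
        datanodes.get(0).add(block);  // first replica: client sees success
        // replication to the other (replication - 1) nodes is still pending
    }

    void finishReplication(String block) {
        for (int i = 1; i < replication; i++) datanodes.get(i).add(block);
    }

    /** Is the block still readable if the given node dies? */
    boolean survivesLossOf(int node, String block) {
        for (int i = 0; i < datanodes.size(); i++)
            if (i != node && datanodes.get(i).contains(block)) return true;
        return false;
    }

    public static void main(String[] args) {
        ToyPipeline hdfs = new ToyPipeline(3, 3);
        hdfs.writeAndSync("wal-edit");          // client: "durable"
        // Node 0 dies before replication finishes: the edit is gone.
        System.out.println(hdfs.survivesLossOf(0, "wal-edit")); // false
        hdfs.finishReplication("wal-edit");     // with a working sync(),
        System.out.println(hdfs.survivesLossOf(0, "wal-edit")); // true
    }
}
```

A working sync() corresponds to not returning from writeAndSync until finishReplication has run; the configuration options mentioned above only shrink how long the pending state lasts.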
> ________________________________
> From: Jean-Daniel Cryans <jdcryans@apache.org>
> To: hbase-user@hadoop.apache.org
> Sent: Fri, December 11, 2009 10:59:55 AM
> Subject: Re: When does a row become highly available?
>
> That's the not-so-working HDFS append feature showing its ugly face:
> small amounts of data can be lost (a configurable maximum of ~62MB).
>
> J-D
>
> On Fri, Dec 11, 2009 at 10:55 AM, Seth Ladd <sethladd@gmail.com> wrote:
>>>> Which confuses me: if the write goes straight to a RegionServer, but
>>>> then the RegionServer fails before the MemStore is flushed, did I
>>>> just lose data?
>>>
>>> No, preventing that is the goal of the write-ahead log (WAL).
>>
>> Here's the scenario I just tested on my EC2 cluster: 3 ZooKeeper
>> instances, 1 master, and 3 slaves.
>>
>> I created a table, and inserted a single row.
>> I performed a read (get) to test the insert, and sure enough the row
>> was returned.
>> I then noted which slave held the table, and terminated the slave via
>> the AWS management console.
>> I then waited approximately 30 seconds.
>> I used the web interfaces (ports 60030 and 60010) to note that the
>> region was indeed moved to another slave.
>> I performed a read on the same row, but did *not* find the row.
>>
>> So it looks like the region for the table was moved, but no data was
>> moved over.
>>
>> Was that a valid test?  I would expect the row to get moved with the
>> region.
>>
>> Thanks,
>> Seth
>>
>
>
>
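Seth's failover scenario ties the two halves of the thread together: on reassignment, the new region server can only recover what actually reached the WAL, and with a non-working sync() the WAL edit never durably landed. A toy sketch of this in plain Java (ToyRegionServer and its methods are invented for illustration, not HBase code):

```java
// Toy model: a region server buffers edits in a memstore (lost on crash)
// and appends them to a WAL on a shared "filesystem". On failover, a new
// server replays the WAL. If WAL writes were not durable -- the no-sync()
// case discussed in the thread -- replay recovers nothing and the row
// vanishes, matching the EC2 test above.
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class ToyRegionServer {
    final Map<String, String> memstore = new HashMap<>();
    final List<String[]> wal;     // shared WAL on the "filesystem"
    final boolean walDurable;     // did sync() really persist the edit?

    ToyRegionServer(List<String[]> wal, boolean walDurable) {
        this.wal = wal;
        this.walDurable = walDurable;
    }

    void put(String row, String value) {
        if (walDurable) wal.add(new String[]{row, value}); // survives crash
        memstore.put(row, value);                          // lost on crash
    }

    /** A fresh server picks up the region and replays the WAL. */
    static ToyRegionServer failover(List<String[]> wal, boolean durable) {
        ToyRegionServer fresh = new ToyRegionServer(wal, durable);
        for (String[] edit : wal) fresh.memstore.put(edit[0], edit[1]);
        return fresh;
    }

    public static void main(String[] args) {
        // Without a working sync(), the WAL edit never reaches disk:
        List<String[]> lostWal = new ArrayList<>();
        ToyRegionServer rs1 = new ToyRegionServer(lostWal, false);
        rs1.put("row1", "value1");
        ToyRegionServer rs2 = failover(lostWal, false);
        System.out.println("no sync(): " + rs2.memstore.get("row1"));

        // With durable WAL writes, replay recovers the row:
        List<String[]> goodWal = new ArrayList<>();
        ToyRegionServer rs3 = new ToyRegionServer(goodWal, true);
        rs3.put("row1", "value1");
        ToyRegionServer rs4 = failover(goodWal, true);
        System.out.println("durable WAL: " + rs4.memstore.get("row1"));
    }
}
```

The first failover prints null (the row is gone, as in Seth's test); the second recovers "value1", which is what WAL replay is supposed to deliver once HDFS has a working sync().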
