hbase-dev mailing list archives

From Jean-Daniel Cryans <jdcry...@apache.org>
Subject Re: roadmap: data integrity
Date Fri, 07 Aug 2009 10:25:19 GMT
https://issues.apache.org/jira/browse/HADOOP-4539

This issue was closed long ago, but Steve Loughran just said on the
Hadoop mailing list that the new NN has to come up with the same
IP/hostname as the failed one.

J-D

On Fri, Aug 7, 2009 at 2:37 AM, Ryan Rawson<ryanobjc@gmail.com> wrote:
> WAL is a major issue, but another one that is coming up fast is the
> SPOF that is the namenode.
>
> Right now, namenode aside, I can rolling restart my entire cluster,
> including rebooting the machines if I needed to. But not so with the
> namenode, because if it goes AWOL, all sorts of bad can happen.
>
> I hope that HDFS 0.21 addresses both these issues.  Can we get
> positive confirmation that this is being worked on?
>
> -ryan
>
> On Thu, Aug 6, 2009 at 10:25 AM, Andrew Purtell<apurtell@apache.org> wrote:
>> I updated the roadmap up on the wiki:
>>
>>
>> * Data integrity
>>    * Ensure that proper append() support in HDFS actually closes the
>>      WAL last block write hole
>>    * HBase-FSCK (HBASE-7) -- Suggest making this a blocker for 0.21
>>
>> I have had several recent conversations on my travels with people in
>> Fortune 100 companies (based on this list:
>> http://www.wageproject.org/content/fortune/index.php).
>>
>> You and I know we can set up well engineered HBase 0.20 clusters that
>> will be operationally solid for a wide range of use cases, but given
>> those aforementioned discussions there are certain sectors which would
>> say HBASE-7 is #1 before HBase is "bank ready". Not until we can say:
>>
>>  - Yes, when the client sees data has been committed, it actually has
>> been written and replicated on spinning or solid state media in all
>> cases.
>>
>>  - Yes, we go to great lengths to recover data if ${deity} forbid you
>> crush some underprovisioned cluster with load or some bizarre bug or
>> system fault happens.
>>
>> HBASE-1295 is also required for business continuity reasons, but this
>> is already a priority item for some HBase committers.
>>
>> The question, I think, is whether the above aligns with project goals.
>> Making HBase-FSCK a blocker will probably knock something someone
>> wants for the 0.21 timeframe off the list.
>>
>>   - Andy
>>
>>
>>
>
