hadoop-user mailing list archives

From Harsh J <ha...@cloudera.com>
Subject Re: Hadoop hardware failure recovery
Date Mon, 13 Aug 2012 14:55:43 GMT
Aji,

The best place to ask would be Apache Accumulo's own user lists, which
you can subscribe to at http://accumulo.apache.org/mailing_list.html

That said, if Accumulo bases itself on HDFS, then its data safety
should be the same or nearly the same as what HDFS itself can offer.

Note that with 2.1.0 (upcoming) and above releases of HDFS, we offer a
working hsync() API that lets you write files with a guarantee that the
data has been written to disk (like the fsync() *nix call). You can
read more about this in an earlier thread:
http://search-hadoop.com/m/ATVOETSy4X1
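
To illustrate the fsync()-style guarantee mentioned above, here is a
minimal sketch in plain Java against the local filesystem, so it runs
without a cluster. It is only a stand-in: on HDFS the equivalent step
would be calling hsync() on the FSDataOutputStream you are writing to.
The class and method names (DurableWrite, writeDurably) are made up for
this example.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper for illustration; not part of Hadoop or Accumulo.
public class DurableWrite {

    /**
     * Appends data to a local file and returns only after the OS has been
     * asked to flush both content and metadata to the storage device.
     */
    public static void writeDurably(Path file, byte[] data) throws IOException {
        try (FileChannel ch = FileChannel.open(file,
                StandardOpenOption.CREATE,
                StandardOpenOption.WRITE,
                StandardOpenOption.APPEND)) {
            ByteBuffer buf = ByteBuffer.wrap(data);
            // A single write() call may be partial, so loop until drained.
            while (buf.hasRemaining()) {
                ch.write(buf);
            }
            // Local-filesystem analogue of fsync(); on HDFS this is where
            // FSDataOutputStream.hsync() would be called instead.
            ch.force(true);
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("durable", ".log");
        writeDurably(file, "record-1\n".getBytes(StandardCharsets.UTF_8));
        System.out.println("wrote " + Files.size(file) + " bytes to " + file);
    }
}
```

Without the force() step (or hsync() on HDFS), a write that has
returned may still be sitting in OS buffers, and a crash at that
moment can lose it.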

HTH, and do let us know what you find on the Accumulo side.

On Mon, Aug 13, 2012 at 7:27 PM, Aji Janis <aji1705@gmail.com> wrote:
> Thank you everyone for all the feedback and suggestions. It's good to know
> these details as I move forward.
>
> Piling on to the question, I am curious if any of you have experience with
> Accumulo (a requirement for me hence not optional). I was wondering if the
> data loss (physical crash of the hard drive) in this case would be resolved
> by Hadoop (HDFS I should say). Any suggestions and/or where I could find
> some specs on this would be really appreciated!
>
>
> Thank you again for all the pointers.
> -Aji
>
> On Sun, Aug 12, 2012 at 3:07 PM, Arun C Murthy <acm@hortonworks.com> wrote:
>>
>> Yep, hadoop-2 is alpha but is progressing nicely...
>>
>> However, if you have access to some 'enterprise HA' utilities (VMWare or
>> Linux HA) you can get *very decent* production-grade high-availability in
>> hadoop-1.x too (both NameNode for HDFS and JobTracker for MapReduce).
>>
>> Arun
>>
>> On Aug 10, 2012, at 12:12 PM, anil gupta wrote:
>>
>> Hi Aji,
>>
>> Adding on to what Mohammad Tariq said: if you use Hadoop 2.0.0-alpha,
>> the NameNode is no longer a single point of failure. However, Hadoop
>> 2.0.0 is not of production quality yet (it's in alpha).
>> The NameNode used to be a single point of failure in releases prior to
>> Hadoop 2.0.0.
>>
>> HTH,
>> Anil Gupta
>>
>> On Fri, Aug 10, 2012 at 11:55 AM, Ted Dunning <tdunning@maprtech.com>
>> wrote:
>>>
>>> Hadoop's file system was (mostly) copied from the concepts of Google's
>>> old file system.
>>>
>>> The original paper is probably the best way to learn about that.
>>>
>>> http://research.google.com/archive/gfs.html
>>>
>>>
>>>
>>> On Fri, Aug 10, 2012 at 11:38 AM, Aji Janis <aji1705@gmail.com> wrote:
>>>>
>>>> I am very new to Hadoop. I am considering setting up a Hadoop cluster
>>>> consisting of 5 nodes where each node has 3 internal hard drives. I
>>>> understand HDFS has a configurable redundancy feature but what happens if an
>>>> entire drive crashes (physically) for whatever reason? How does Hadoop
>>>> recover, if it can, from this situation? What else should I know before
>>>> setting up my cluster this way? Thanks in advance.
>>>>
>>>>
>>>
>>
>>
>>
>> --
>> Thanks & Regards,
>> Anil Gupta
>>
>>
>> --
>> Arun C. Murthy
>> Hortonworks Inc.
>> http://hortonworks.com/
>>
>>
>



-- 
Harsh J
