From: Aji Janis <aji1705@gmail.com>
To: user@hadoop.apache.org
Subject: Re: Hadoop hardware failure recovery
Date: Mon, 13 Aug 2012 09:57:32 -0400

Thank you everyone for all the feedback and suggestions. It's good to know these details as I move forward.

Piling on to the question, I am curious whether any of you have experience with Accumulo (a requirement for me, hence not optional). I was wondering whether data loss from a physical crash of a hard drive would, in that case, be handled by Hadoop (HDFS, I should say). Any suggestions, and/or pointers to where I could find some specs on this, would be really appreciated!

Thank you again for all the pointers.
-Aji

On Sun, Aug 12, 2012 at 3:07 PM, Arun C Murthy <acm@hortonworks.com> wrote:
> Yep, hadoop-2 is alpha but is progressing nicely...
>
> However, if you have access to some 'enterprise HA' utilities (VMware or
> Linux-HA) you can get *very decent* production-grade high availability in
> hadoop-1.x too (both the NameNode for HDFS and the JobTracker for MapReduce).
>
> Arun
>
> On Aug 10, 2012, at 12:12 PM, anil gupta wrote:
>
> Hi Aji,
>
> Adding onto what Mohammad Tariq said: if you use Hadoop 2.0.0-alpha, then
> the NameNode is not a single point of failure. However, Hadoop 2.0.0 is not
> of production quality yet (it's in alpha).
> The NameNode used to be a single point of failure in releases prior to
> Hadoop 2.0.0.
>
> HTH,
> Anil Gupta
>
> On Fri, Aug 10, 2012 at 11:55 AM, Ted Dunning <tdunning@maprtech.com> wrote:
>
>> Hadoop's file system was (mostly) copied from the concepts of Google's
>> old file system.
>>
>> The original paper is probably the best way to learn about that.
>>
>> http://research.google.com/archive/gfs.html
>>
>> On Fri, Aug 10, 2012 at 11:38 AM, Aji Janis <aji1705@gmail.com> wrote:
>>
>>> I am very new to Hadoop. I am considering setting up a Hadoop cluster
>>> consisting of 5 nodes where each node has 3 internal hard drives. I
>>> understand HDFS has a configurable redundancy feature, but what happens
>>> if an entire drive crashes (physically) for whatever reason? How does
>>> Hadoop recover, if it can, from this situation? What else should I know
>>> before setting up my cluster this way? Thanks in advance.
>
> --
> Thanks & Regards,
> Anil Gupta

--
Arun C. Murthy
Hortonworks Inc.
http://hortonworks.com/
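[Editor's note on the drive-crash question in the thread above: by default HDFS keeps three replicas of each block on different drives/nodes; when a drive dies, the NameNode learns of the lost replicas via block reports and schedules new copies from the surviving replicas. The following is a toy Python sketch of that bookkeeping only — it is illustrative, not Hadoop code; the 5-node/3-drive layout comes from the original question, and all names are made up.]

```python
# Toy sketch of HDFS-style re-replication after a drive failure.
# NOT Hadoop code: block placement, naming, and the replication factor
# of 3 mirror HDFS defaults, but everything else is simplified.
import random

REPLICATION = 3

def place_blocks(num_blocks, drives):
    """Assign each block to REPLICATION distinct drives."""
    return {b: set(random.sample(drives, REPLICATION)) for b in range(num_blocks)}

def fail_drive(placement, drive):
    """Drop one drive from every block's replica set; return the
    blocks that are now under-replicated."""
    under = []
    for block, locations in placement.items():
        locations.discard(drive)
        if len(locations) < REPLICATION:
            under.append(block)
    return under

def re_replicate(placement, drives, dead):
    """Copy each under-replicated block to healthy drives until the
    target replica count is restored (what the NameNode schedules)."""
    alive = [d for d in drives if d not in dead]
    for locations in placement.values():
        while len(locations) < REPLICATION:
            candidates = [d for d in alive if d not in locations]
            locations.add(random.choice(candidates))

# 5 nodes x 3 internal drives, as in the original question.
drives = [f"node{n}-disk{d}" for n in range(5) for d in range(3)]
placement = place_blocks(100, drives)
under = fail_drive(placement, "node0-disk0")
re_replicate(placement, drives, {"node0-disk0"})
# Every block is back at full replication; nothing was lost, because
# each block still had two live replicas after the single-drive crash.
assert all(len(locs) == REPLICATION for locs in placement.values())
```

The point of the sketch: a single drive failure costs at most one replica per block, so with 3x replication no data is lost, and redundancy is restored automatically from the remaining copies.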
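[Editor's note: for the 5-node/3-drive setup asked about above, two hdfs-site.xml settings are directly relevant. A hedged sketch follows — the property names below are the stock HDFS ones, but defaults and exact names vary by Hadoop version (e.g. `dfs.data.dir` in 1.x vs `dfs.datanode.data.dir` in 2.x), and the paths are hypothetical; verify against your distribution's hdfs-default.xml.]

```xml
<!-- hdfs-site.xml (sketch; verify names against your Hadoop version) -->
<configuration>
  <!-- Copies kept of each block. With 3, a single drive or node
       failure still leaves at least two live replicas to rebuild from. -->
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
  <!-- One data directory per physical drive (hypothetical paths),
       so each of the three internal drives is a separate HDFS volume. -->
  <property>
    <name>dfs.data.dir</name>
    <value>/disk1/dfs/data,/disk2/dfs/data,/disk3/dfs/data</value>
  </property>
  <!-- How many failed local volumes a DataNode tolerates before
       shutting itself down. With the default of 0, one dead drive
       takes the whole DataNode (and its two healthy drives) offline. -->
  <property>
    <name>dfs.datanode.failed.volumes.tolerated</name>
    <value>1</value>
  </property>
</configuration>
```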