Mailing-List: contact user-help@hadoop.apache.org; run by ezmlm
Precedence: bulk
Reply-To: user@hadoop.apache.org
Received-SPF: pass (athena.apache.org: domain of dechouxb@gmail.com designates
 209.85.215.45 as permitted sender)
MIME-Version: 1.0
In-Reply-To: 
 <CABGkNE91FtSMv6dvdi7n+SL8YK=76cyqPoiP0gT6htDv_RRcFg@mail.gmail.com>
References: 
 <CANx3uAiVPmA3S6LXkO77k11ibCWjfSy2fzwAeKn=PrF2RVxXbw@mail.gmail.com>
	<CABGkNE9VsEObAQS2ze32v8SiP660Zq_su2rteh3Tf4a89og_5Q@mail.gmail.com>
	<CANx3uAiap-PqVGRLa7QgTpEJH=LxTqC7pRkor0LuJ2bvbmZTJw@mail.gmail.com>
	<CABGkNE91FtSMv6dvdi7n+SL8YK=76cyqPoiP0gT6htDv_RRcFg@mail.gmail.com>
Date: Wed, 16 Oct 2013 07:16:28 +0200
Message-ID: 
 <CAO6W-2cXf7L-Y4Wn99C0gvPJM7eMki7j6myJa8sTkNaTpAAQHg@mail.gmail.com>
Subject: Re: high availability
From: Bertrand Dechoux <dechouxb@gmail.com>
To: "user@hadoop.apache.org" <user@hadoop.apache.org>
Content-Type: multipart/alternative; boundary=001a113456eab70a5e04e8d4cd06

--001a113456eab70a5e04e8d4cd06
Content-Type: text/plain; charset=ISO-8859-1

http://blog.cloudera.com/blog/2012/10/quorum-based-journaling-in-cdh4-1/

It covers an old version (CDH 4.1), but the principle is still the same.
Among the design goals it lists:

*No requirement for custom fencing configuration* - fencing methods such as
STONITH <http://en.wikipedia.org/wiki/STONITH> require custom hardware;
instead, we should rely only on software methods.

Bertrand

PS: That said, the only true validation is to test it yourself.
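
PPS: For reference, a minimal sketch of the fencing setting Jing suggests
below, assuming QJM-based HA. The property name comes from Koert's original
snippet; that it lives in hdfs-site.xml on both namenode hosts is my
assumption about the setup, and I have not tested this on your cluster:

  <property>
    <!-- assumption: placed in hdfs-site.xml on both namenode hosts -->
    <name>dfs.ha.fencing.methods</name>
    <!-- sshfence dropped; shell(/bin/true) always reports success, so the
         failover is not held up when the old active host is unreachable -->
    <value>shell(/bin/true)</value>
  </property>

This leans entirely on the single-writer semantics of QJM to keep the old
active from writing edits. As Koert mentions, the old namenode could still
be serving read requests.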

On Tue, Oct 15, 2013 at 10:59 PM, Jing Zhao <jing@hortonworks.com> wrote:

> I think real fencing is not required if you're using QJM-based HA.
> If you are using ZKFC, a graceful fencing is triggered first, in which
> ZKFC sends an RPC request to the original active NN (ANN) to make it
> standby. If the graceful fencing fails, the configured fencing method
> is used. In the worst case, where your original ANN cannot transition
> to standby state, QJM still has built-in single-writer semantics (see
> https://issues.apache.org/jira/browse/HDFS-3862,
> https://issues.apache.org/jira/browse/HDFS-4915). Thus you can set the
> fence method to shell(/bin/true) (the current code still requires a
> fence configuration).
>
> On Tue, Oct 15, 2013 at 12:11 PM, Koert Kuipers <koert@tresata.com> wrote:
> > Jing,
> > thanks for your answer.
> >
> > if hbase with high availability is the desired goal, is it recommended to
> > remove sshfence? we do not plan to use hdfs for anything else.
> >
> > i understood that the only downside of no fencing is that the old namenode
> > could still be serving read requests. could this negatively impact hbase
> > functionality, or worse, could it corrupt hbase somehow (not sure how that
> > would be...)?
> >
> > thanks! koert
> >
> >
> >
> > On Tue, Oct 15, 2013 at 12:38 AM, Jing Zhao <jing@hortonworks.com> wrote:
> >>
> >> "it is unclear to me if the transition in this case is also rapid but
> >> the fencing takes long while the new namenode is already active, or if
> >> in this period i am stuck without an active namenode."
> >>
> >> The standby->active transition will get stuck in this period, i.e.,
> >> the NN can only become active after fencing the old active NN. During
> >> this period the only NN is in standby state, which cannot handle the
> >> usual R/W operations and just throws StandbyException, so I guess an
> >> hbase region server may kill itself in some cases.
> >>
> >> I think you can remove sshfence from the configuration if you are
> >> using QJM-based HA.
> >>
> >> > On Fri, Oct 11, 2013 at 4:51 PM, Koert Kuipers <koert@tresata.com> wrote:
> >> > i have been playing with high availability using journalnodes and 2
> >> > masters, both running namenode and hbase master.
> >> >
> >> > when i kill the namenode and hbase-master processes on the active
> >> > master, the failover is perfect. hbase never stops and a running
> >> > map-reduce job keeps going. this is impressive!
> >> >
> >> > however when instead of killing the processes i kill the entire active
> >> > master machine, the transition is less smooth and can take a long time,
> >> > at least it seems this way in the logs. this is because ssh fencing
> >> > fails but keeps trying. my fencing is configured as:
> >> >
> >> >   <property>
> >> >     <name>dfs.ha.fencing.methods</name>
> >> >     <value>
> >> >       sshfence
> >> >       shell(/bin/true)
> >> >     </value>
> >> >     <final>true</final>
> >> >   </property>
> >> >
> >> > it is unclear to me if the transition in this case is also rapid but
> >> > the fencing takes long while the new namenode is already active, or if
> >> > in this period i am stuck without an active namenode. it is hard to
> >> > accurately test this in my setup.
> >> > is this supposed to take this long? is HDFS writable in this period?
> >> > and is hbase supposed to survive this long transition?
> >> >
> >> > thanks! koert
> >>
> >> --
> >> CONFIDENTIALITY NOTICE
> >> NOTICE: This message is intended for the use of the individual or entity
> >> to
> >> which it is addressed and may contain information that is confidential,
> >> privileged and exempt from disclosure under applicable law. If the
> reader
> >> of this message is not the intended recipient, you are hereby notified
> >> that
> >> any printing, copying, dissemination, distribution, disclosure or
> >> forwarding of this communication is strictly prohibited. If you have
> >> received this communication in error, please contact the sender
> >> immediately
> >> and delete it from your system. Thank You.
> >
> >
>


-- 
Bertrand Dechoux


--001a113456eab70a5e04e8d4cd06--