Subject: Re: Long GC
From: kishore g <g.kishore@gmail.com>
To: user@helix.incubator.apache.org
Date: Sat, 4 May 2013 09:25:28 -0700
In-Reply-To: <1A11C172-6519-493C-A4A9-A66194CC4B8E@mac.com>
References: <28CB11C1-1D3F-4EAE-BCEC-41EC6CA84604@mac.com> <1A11C172-6519-493C-A4A9-A66194CC4B8E@mac.com>

Hi Ming,

I don't see anything wrong with the design. What you need is the ability to validate a few things before reconnecting to the cluster. We do invoke a pre-connect callback before joining the cluster; you can validate for consistency and refuse to join the cluster. You can also disable the node if validation fails.

Will this work?
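[For concreteness, a minimal sketch of the pre-connect validation described above, assuming Helix's org.apache.helix.PreConnectCallback interface and HelixAdmin.enableInstance. The ZooKeeper address, cluster and instance names are hypothetical, the consistency check is application-specific, and throwing from the callback to abort the rejoin is an assumption rather than documented behavior.]

import org.apache.helix.HelixManager;
import org.apache.helix.HelixManagerFactory;
import org.apache.helix.InstanceType;
import org.apache.helix.PreConnectCallback;
import org.apache.helix.manager.zk.ZKHelixAdmin;

public class ValidatingParticipant {
  public static void main(String[] args) throws Exception {
    // Hypothetical names; substitute the real ZK address, cluster, and instance.
    final String zkAddr = "localhost:2181";
    final String cluster = "MY_CLUSTER";
    final String instance = "Node1";

    final ZKHelixAdmin admin = new ZKHelixAdmin(zkAddr);
    HelixManager manager = HelixManagerFactory.getZKHelixManager(
        cluster, instance, InstanceType.PARTICIPANT, zkAddr);

    // Runs before the participant (re)joins the cluster, e.g. after a
    // ZooKeeper session expiry caused by a long GC pause.
    manager.addPreConnectCallback(new PreConnectCallback() {
      @Override
      public void onPreConnect() {
        if (!localStateIsConsistent()) {
          // Disable this instance so the controller will not hand it
          // MASTER again until an operator re-enables it, then refuse
          // to rejoin (throwing here is one way to abort; an assumption).
          admin.enableInstance(cluster, instance, false);
          throw new IllegalStateException("local state inconsistent; not rejoining");
        }
      }
    });

    manager.connect();
  }

  // Application-specific check, e.g. compare replicated sequence numbers.
  private static boolean localStateIsConsistent() {
    return true;
  }
}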
On May 4, 2013 9:03 AM, "Ming Fang" <mingfang@mac.com> wrote:

> Kishore,
>
> I'm setting _sessionTimeout to 3 seconds.
> That's an aggressive number, but my application needs to detect failures quickly.
> I suppose taking the participant to OFFLINE is acceptable, but I can't have it flip back to MASTER.
>
> I didn't want to bore you with the details before, but I think I need to explain my system more now.
> We are using Helix to manage a MASTER/SLAVE cluster using AUTO mode.
> AUTO mode enables us to place the MASTER and the SLAVE on the correct hosts.
> We name the MASTER Node1 and the SLAVE Node2.
>
> The system processes a high rate of incoming events, thousands per second.
> Node1 consumes the events, generates internal state, and then replicates the events to Node2.
> Node2 consumes the events from Node1 and generates exactly the same internal state.
>
> When Node1 fails, we want Node2 to become the new MASTER and process incoming events.
> This means we cannot restart Node1, since Node2's state has moved beyond the failed MASTER's.
> We keep the failed Node1 down for the rest of the business day.
> Everything works as expected under ideal conditions.
>
> The problem we're experiencing with long GCs is that Node1 transitions to OFFLINE and then back to MASTER.
> This causes Node1 and Node2 to get out of sync.
>
> Ideally I can find a general solution such that whenever Node2 becomes MASTER, it modifies the ideal state so that Node1 can come back as SLAVE (a sketch of this appears after the thread below).
> This solution would address the Node1 failure issue, and I think it should fix the long GC issue too.
> Sorry for the long email.
>
> --ming
>
>
> On May 4, 2013, at 10:29 AM, kishore g <g.kishore@gmail.com> wrote:
>
> Hi Ming,
>
> I need some more details:
> 1. How long was the GC, and what is the session timeout in ZK?
>
> The behavior you are seeing is expected. Because of the GC the ZooKeeper session is lost, and we invoke the transitions so that the partition goes back to the OFFLINE state.
>
> What is the behavior you are looking for when there is a GC?
>
> a. You don't want to lose mastership? or
> b. It's OK to lose mastership, but you don't want to become master again?
>
> One question regarding your application: is it possible for it to recover after a long GC pause?
>
> I don't think this is related to HELIX-79; in that case there were consecutive GCs, and I think we have a patch for that issue.
>
> Thanks,
> Kishore G
>
>
> On Sat, May 4, 2013 at 6:32 AM, Ming Fang <mingfang@mac.com> wrote:
>
>> We're experiencing a potential showstopper issue with how Helix deals with very long GCs.
>> Our system is using the Master-Slave model.
>> A simple test runs just the Master under extreme load, causing GC pauses of several seconds.
>> Under long GC conditions the Master gets transitioned to Slave and then to Offline.
>> After the GC, it gets transitioned back to Slave and then to Master.
>>
>> I found a Jira that may be related: HELIX-79.
>> We're scheduled to go live with our system next week.
>> Are there any quick workarounds for this problem?
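[A rough sketch of the ideal-state modification Ming proposes, assuming the HelixAdmin API (getResourceIdealState/setResourceIdealState) and that in AUTO (semi-auto) mode the first live instance in a partition's preference list is assigned the MASTER replica. The ZooKeeper address, cluster, and resource names are hypothetical, and the instance names follow the thread's Node1/Node2 naming.]

import java.util.Arrays;
import org.apache.helix.HelixAdmin;
import org.apache.helix.manager.zk.ZKHelixAdmin;
import org.apache.helix.model.IdealState;

public class PromoteNode2 {
  public static void main(String[] args) {
    // Hypothetical names; substitute the real ZK address, cluster, and resource.
    String zkAddr = "localhost:2181";
    String cluster = "MY_CLUSTER";
    String resource = "MyResource";

    HelixAdmin admin = new ZKHelixAdmin(zkAddr);
    IdealState idealState = admin.getResourceIdealState(cluster, resource);

    // In AUTO (semi-auto) mode the preference list order decides placement:
    // the first live instance in the list gets the MASTER replica.
    // Listing Node2 first means a restarted Node1 comes back only as SLAVE.
    for (String partition : idealState.getPartitionSet()) {
      idealState.getRecord().setListField(partition, Arrays.asList("Node2", "Node1"));
    }

    admin.setResourceIdealState(cluster, resource, idealState);
  }
}

[Something like this could be triggered from Node2's SLAVE-to-MASTER transition handler, so the preference flip happens automatically on failover rather than requiring manual intervention.]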