From: Jason Jackson <jasonjckn@gmail.com>
Date: Wed, 9 Apr 2014 14:52:14 -0700
Subject: Re: Topology is stuck
To: user@storm.incubator.apache.org

FYI, we're using Summingbird in production, not Trident. Summingbird does not give you exactly-once semantics, but it does give you a higher level of abstraction than the Storm API.


On Wed, Apr 9, 2014 at 2:50 PM, Jason Jackson <jasonjckn@gmail.com> wrote:
I have one theory that because reads in ZooKeeper are eventually consistent, this is a necessary condition for the bug to manifest. So one way to test this hypothesis is to run a ZooKeeper ensemble with 1 node, or an ensemble configured for 5 nodes but with 2 of them taken offline, so that every write operation only succeeds if every remaining member of the ensemble sees the write. This should produce strongly consistent reads. If you run this test, let me know what the results are. (Clearly this isn't a good production setup, as you're trading availability for consistency, but the results could help narrow down the bug.)
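For illustration (not from this thread): a minimal standalone zoo.cfg for the single-node test described above; dataDir and clientPort are placeholders to adjust for the test host.

    # Minimal standalone ZooKeeper config (testing only). With no server.N
    # entries ZooKeeper runs as a single node, so there is no quorum and a
    # committed write is visible to every subsequent read.
    tickTime=2000
    # Placeholder path; point it at an empty directory on the test host.
    dataDir=/var/lib/zookeeper
    clientPort=2181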


On Wed, Apr 9, 2014 at 2:43 PM, Jason Jackson <jasonjckn@gmail.com> wrote:
Yeah, it's probably a bug in Trident. It would be amazing if someone figured out the fix for this. I spent about 6 hours looking into it, but couldn't figure out why it was occurring.

Beyond fixing this, one thing you could do to buy yourself time is to disable batch retries in Trident. There's no option for this in the API, but it's a one- or two-line change to the code. Obviously you lose exactly-once semantics, but at least you'd have a system that never falls behind real-time.



On Wed, Apr 9, 2014 at 1:10 AM, Danijel Schiavuzzi <danijel@schiavuzzi.com> wrote:
Thanks Jason. However, I don't think that was the case in my stuck topology, otherwise I'd have seen exceptions (thrown by my Trident functions) in the worker logs.


On Wed, Apr 9, 2014 at 3:02 AM, Jason Jackson <jasonjckn@gmail.com> wrote:
An example of "corrupted input" that causes a batch to fail would be if you expect the data you read off Kafka (or some other queue) to match a schema, and for whatever reason it doesn't, and the function you pass to stream.each throws an exception when that unexpected situation occurs. This would cause the batch to be retried, but it fails deterministically, so the batch will be retried forever.
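For illustration only (not code from this thread): a minimal sketch of the kind of stream.each() function described above, written so that a record which doesn't match the expected schema is dropped instead of throwing; the field position and the comma-separated format are assumptions.

    import backtype.storm.tuple.Values;
    import storm.trident.operation.BaseFunction;
    import storm.trident.operation.TridentCollector;
    import storm.trident.tuple.TridentTuple;

    // Hypothetical each() function: a record that does not match the expected
    // schema is dropped rather than rethrown, so one bad message cannot make
    // the batch fail deterministically and retry forever.
    public class ParseRecord extends BaseFunction {
        @Override
        public void execute(TridentTuple tuple, TridentCollector collector) {
            String raw = tuple.getString(0); // assumes a String scheme on the spout
            try {
                String[] parts = raw.split(",");
                if (parts.length != 2) {
                    return; // malformed record: skip it (optionally count or log it)
                }
                collector.emit(new Values(parts[0], Long.parseLong(parts[1])));
            } catch (RuntimeException e) {
                // e.g. NumberFormatException: drop the record instead of failing the batch
            }
        }
    }

Wired in with something like stream.each(new Fields("str"), new ParseRecord(), new Fields("key", "value")), a single malformed message can no longer force an endless retry loop.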


On Mon, Apr 7, 2014 at 10:37 AM, Danijel Schiavuzzi <danijel@schiavuzzi.com> wrote:
Hi Jason,

Could you be more specific -- what do you mean by "corrupted input"? Do you mean that there's a bug in Trident itself that causes the tuples in a batch to somehow become corrupted?

Thanks a lot!

Danijel


On Monday, April 7, 2014, Jason Jackson <jasonjckn@gmail.com> wrote:
This could happen if you have corrupted input that always causes a batch to fail and be retried.

I have seen this behaviour before and I didn't see corrupted input. It might be a bug in trident, I'm not sure. If you figure it out please update this thread and/or submit a patch.



On Mon, Mar 31, 2014 at 7:39 AM, Danijel Schiavuzzi <danijel@schiavuzzi.com> wrote:
To (partially) answer my own question -- I still have no idea about the cause of the stuck topology, but re-submitting the topology helps: after re-submitting, my topology is now running normally.


On Wed, Mar 26, 2014 at 6:04 PM, Danijel Schiavuzzi <danijel@schiavuzzi.com> wrote:
Also, I did have multiple cases of my IBackingMap workers dying (because of RuntimeExceptions) but successfully restarting afterwards (I throw RuntimeExceptions in the BackingMap implementation as my strategy in rare SQL database deadlock situations, to force a worker restart and to fail+retry the batch).
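For illustration only (not the code from this thread): a sketch of that pattern, an IBackingMap whose multiPut() rethrows a SQL failure as a RuntimeException so the worker dies and Trident fails and retries the batch. The DataSource, table name, and value type are assumptions.

    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.SQLException;
    import java.util.ArrayList;
    import java.util.Collections;
    import java.util.List;
    import javax.sql.DataSource;
    import storm.trident.state.map.IBackingMap;

    // Sketch: turn a SQL failure (e.g. a deadlock) into a RuntimeException so
    // the worker dies, Storm restarts it, and Trident re-emits the batch.
    public class SqlBackingMap implements IBackingMap<Long> {
        private final DataSource dataSource; // assumed to be provided by the state factory

        public SqlBackingMap(DataSource dataSource) {
            this.dataSource = dataSource;
        }

        @Override
        public List<Long> multiGet(List<List<Object>> keys) {
            // Reads omitted in this sketch; report "no existing value" for every key.
            return new ArrayList<Long>(Collections.<Long>nCopies(keys.size(), null));
        }

        @Override
        public void multiPut(List<List<Object>> keys, List<Long> vals) {
            try (Connection con = dataSource.getConnection();
                 PreparedStatement ps = con.prepareStatement(
                         "UPDATE counts SET value = ? WHERE id = ?")) { // hypothetical table
                for (int i = 0; i < keys.size(); i++) {
                    ps.setLong(1, vals.get(i));
                    ps.setObject(2, keys.get(i).get(0));
                    ps.addBatch();
                }
                ps.executeBatch();
            } catch (SQLException e) {
                // Deadlock or any other SQL failure: kill the worker so the batch
                // is failed and retried instead of committing partial state.
                throw new RuntimeException("multiPut failed, forcing worker restart", e);
            }
        }
    }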

From the logs, one such IBackingMap worker death (and subsequent restart) resulted in the Kafka spout re-emitting the pending batch:

    2014-03-22 16:26:43 s.k.t.TridentKafkaEmitter [INFO] re-emitting batch, attempt 29698959:736

This is of course the normal behavior of a transactional topology, but this is the first time I've encountered a case of a batch retrying indefinitely. This is especially suspicious since the topology has been running fine for 20 days straight, re-emitting batches and restarting IBackingMap workers quite a number of times.

I can see in my IBackingMap backing SQL database that the batch with the exact txid value 29698959 has been committed -- but I suspect that could come from another BackingMap instance, since there are two BackingMap instances running (parallelismHint 2).

However, I have no idea why the batch is being retried indefinitely now, nor why it hasn't been successfully acked by Trident.

Any suggestions on the area (topology component) to focus my research on?

Thanks,

On Wed, Mar 26, 2014 at 5:32 PM, Danijel Schiavuzzi <danijel@schiavuzzi.com> wrote:
Hello,

I'm having problems with my transactional Trident topology. It had been running fine for about 20 days, and suddenly it is stuck processing a single batch, with no tuples being emitted and no tuples being persisted by the TridentState (IBackingMap).

It's a simple topology which consumes messages off a Kafka queue. The spout is an instance of the storm-kafka-0.8-plus TransactionalTridentKafkaSpout, and I use the trident-mssql transactional TridentState implementation to persistentAggregate() data into a SQL database.
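For reference only (this is not the actual topology from this thread): a minimal sketch of how such a topology is usually wired together. The ZooKeeper address, topic name, field names, Count aggregator, and the generic StateFactory standing in for trident-mssql are all assumptions.

    import backtype.storm.generated.StormTopology;
    import backtype.storm.spout.SchemeAsMultiScheme;
    import backtype.storm.tuple.Fields;
    import storm.kafka.StringScheme;
    import storm.kafka.ZkHosts;
    import storm.kafka.trident.TransactionalTridentKafkaSpout;
    import storm.kafka.trident.TridentKafkaConfig;
    import storm.trident.TridentTopology;
    import storm.trident.operation.builtin.Count;
    import storm.trident.state.StateFactory;

    public class StuckTopologySketch {
        public static StormTopology build(StateFactory sqlStateFactory) {
            // Broker ZooKeeper address and topic name are placeholders.
            TridentKafkaConfig spoutConfig =
                    new TridentKafkaConfig(new ZkHosts("zk1:2181"), "events");
            spoutConfig.scheme = new SchemeAsMultiScheme(new StringScheme()); // emits field "str"
            TransactionalTridentKafkaSpout spout =
                    new TransactionalTridentKafkaSpout(spoutConfig);

            TridentTopology topology = new TridentTopology();
            topology.newStream("kafka-spout", spout)
                    // ParseRecord is the defensive each() function sketched earlier.
                    .each(new Fields("str"), new ParseRecord(), new Fields("key", "value"))
                    .groupBy(new Fields("key"))
                    // The grouped aggregate is written through the transactional state
                    // (trident-mssql in this thread; any StateFactory can stand in here).
                    .persistentAggregate(sqlStateFactory, new Count(), new Fields("count"));
            return topology.build();
        }
    }

Because a transactional spout is paired with a transactional state, Trident commits batches strictly in txid order -- which is also why a single batch that can never succeed blocks everything behind it.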

In Zookeeper I can see Storm is re-trying a batch, i.e.

    "/transactional/<myTopologyName>/coordinator/currattempts" is "{"29698959":6487}"

... and the attempt count keeps increasing. It seems the batch with txid 29698959 is stuck -- the batch isn't being acked by Trident and I have no idea why, especially since the topology had been running successfully for the last 20 days.

I did rebalance the topology on one occasion, after which it continued running normally. Other than that, no other modifications were done. Storm is at version 0.9.0.1.

Any hints on how to debug the stuck topology? Any other useful info I might provide?

Thanks,

--
Danijel Schiavuzzi

E: danijel@schiavuzzi.com
W: www.schiavuzzi.com
T: +385989035562
Skype: danijel.schiavuzzi


