From: Michal Singer
Date: Thu, 2 Jan 2014 13:54:56 +0200
Subject: RE: Guaranteeing message processing on strom fails
To: user@storm.incubator.apache.org

I added the code: Code for JMS Storm Spout example

Please note that, other than the log messages I added, it is the same as the example I found in the storm examples.

What I don't understand is why the AMQ does not resend messages which were not acked.

I killed a worker and as a result a message got failed, but the AMQ still did not resend it.

I also don't understand the reason for the RecoveryTask in the example.

When I worked with AMQ in a different constellation, it did resend messages.

Thanks, Michal

*From:* Abhishek Bhattacharjee [mailto:abhishek.bhattacharjee11@gmail.com]
*Sent:* Thursday, January 02, 2014 1:28 PM
*To:* user@storm.incubator.apache.org
*Subject:* Re: Guaranteeing message processing on strom fails

Can you give a link to your code (only the spout implementation)? You can use http://pastebin.com/.

On Thu, Jan 2, 2014 at 1:04 PM, Michal Singer <michal@leadspace.com> wrote:

Hi, thanks for this reply.

I have now read this chapter. I am trying the jms example, which does support the ack and fail methods.

I am still having issues with lost data. I am checking this to see if I find any problem.

Please let me know if anyone knows of issues with this example and guaranteeing messages.

Thanks, Michal

*From:* Abhishek Bhattacharjee [mailto:abhishek.bhattacharjee11@gmail.com]
*Sent:* Tuesday, December 31, 2013 10:35 AM
*To:* user@storm.incubator.apache.org
*Subject:* Re: Guaranteeing message processing on strom fails

Okay, so I think what you are doing is using the storm-starter to check whether message processing is guaranteed.

But I think you are not replaying the messages in the spout (I could be wrong). Are you changing the fail() method inside the spout?

The fail() method must be used by the programmer to replay the messages; it is not done automatically by Storm.

Have a look at this for better understanding.

https://github.com/storm-book/examples-ch04-spouts/blob/master/src/main/java/banktransactions/TransactionsSpouts.java

And read the 4th chapter from this book

http://shop.oreilly.com/product/0636920024835.do

And if you are doing it, then the uniqueness of the message could be a valid reason.

On Tue, Dec 31, 2013 at 12:29 PM, Michal Singer <michal@leadspace.com> wrote:

thanks, I am checking this out. This might be a problem.

But there is still this issue:

UI - some streams are missing in the UI as a result of the worker being killed or a rebalance

*From:* Nathan Leung [mailto:ncleung@gmail.com]
*Sent:* Monday, December 30, 2013 6:35 PM
*To:* user
*Subject:* Re: Guaranteeing message processing on strom fails

You are using the sentence as the message ID? The word count example repeats sentences, and your message IDs need to be unique.

On Mon, Dec 30, 2013 at 2:05 AM, Michal Singer <michal@leadspace.com> wrote:

In my test I am using the word counter that was in the code samples of storm starter.

The first spout is the word reader, which reads lines from files and sends them to the word normalizer, which sends the words in the lines to the word counter.

The spout emits: List<Object> tuple, Object messageId

The message id is the line.

The word normalizer emits: Collection<Tuple> anchors, List<Object> tuple and sends ack at the end of the execute method

The word counter sends ack on the input it receives.

If I kill a worker, this spout does not use any queue, so it won't be resent?

Isn't this the purpose of the anchors? Do know what to send in case of timeout?

Thanks, Michal

*From:* Michael Rose [mailto:michael@fullcontact.com]
*Sent:* Monday, December 30, 2013 8:55 AM
*To:* user@storm.incubator.apache.org
*Subject:* Re: Guaranteeing message processing on strom fails

What spout are you using? Guarantees must be enforced by the spout. For instance, the RabbitMQ spout doesn't ack an AMQP message until ack() is called for that message tag (or rejects it if fail() is called).

The spout must pass along a unique message identifier to enable this behavior.

If a tuple goes missing somewhere along the line, fail() will be called after a timeout. If you kill the acker that tuple was tracked with, it's then up to the message queue or other impl to be able to replay that message.

Michael Rose (@Xorlev)
Senior Platform Engineer, FullContact
michael@fullcontact.com

On Sun, Dec 29, 2013 at 11:39 PM, Michal Singer <michal@leadspace.com> wrote:

But what I see is that some of the tuples are not replayed. I print out all the tuples and some don't arrive when I kill a worker process.

Thanks, Michal

*From:* Michael Rose [mailto:michael@fullcontact.com]
*Sent:* Sunday, December 29, 2013 7:26 PM
*To:* user@storm.incubator.apache.org
*Subject:* Re: Guaranteeing message processing on strom fails

You are not guaranteed that tuples make it, only that if they go missing or processing fails they will be replayed from the spout: "at least once execution".

On Dec 29, 2013 4:47 AM, "Michal Singer" <michal@leadspace.com> wrote:

Hi, I am trying to test the guaranteeing of messages on storm:

I have two nodes:

1. Node-A contains: supervisor, ui, zookeeper, nimbus
2. Node-B contains: supervisor

I have 1 spout and 2 bolts (I am actually running the word counter test)

I send about 17 messages.

The first bolt is on Node-B

The second bolt is divided into 4 executors, which are divided between the nodes.

I kill the worker on node-B and I expect the worker on the other node to get the data which was supposed to be sent to Node-B.

But the worker is raised on node-B and the data is sent there, except for one tuple which is missing.

1. According to the UI - it is very difficult to see what is going on, because I guess there are messages resent and it is hard to know exactly what happened.
2. According to my output - one message is missing which was supposed to go to node-B, whose worker I killed.

I defined anchors and I sent acks.

So what am I missing? Why is there data loss?

Thanks, Michal

--
*Abhishek Bhattacharjee*
*Pune Institute of Computer Technology*
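The replies from Abhishek and Michael Rose above both come down to the spout doing its own bookkeeping: remember each emitted message under a unique ID, forget it on ack, and re-queue it on fail. The following is a minimal plain-Java sketch of that bookkeeping. It deliberately does not use the Storm API; the class `ReplayingSource` and its methods `next`, `ack`, `fail`, and `inFlight` are hypothetical names that only mirror the shape of a spout's emit, ack, and fail callbacks.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical model of a spout's replay bookkeeping; this is not the Storm API. */
final class ReplayingSource {
    /** A message paired with the unique ID it was emitted under. */
    static final class Emitted {
        final long id;
        final String payload;
        Emitted(long id, String payload) { this.id = id; this.payload = payload; }
    }

    private final Deque<String> fresh = new ArrayDeque<>();     // not yet emitted
    private final Deque<Emitted> retries = new ArrayDeque<>();  // failed, waiting for replay
    private final Map<Long, String> pending = new HashMap<>();  // emitted, not yet acked
    private long nextId = 0;

    ReplayingSource(List<String> lines) { fresh.addAll(lines); }

    /** Emits a waiting replay if there is one, otherwise the next fresh line; null when idle. */
    Emitted next() {
        Emitted out = retries.poll();
        if (out == null) {
            String line = fresh.poll();
            if (line == null) return null;
            out = new Emitted(nextId++, line);
        }
        pending.put(out.id, out.payload);
        return out;
    }

    /** Fully processed downstream: the message can be forgotten. */
    void ack(long id) { pending.remove(id); }

    /** Failed or timed out: queue the same message, under the same ID, for replay. */
    void fail(long id) {
        String payload = pending.remove(id);
        if (payload != null) retries.add(new Emitted(id, payload));
    }

    /** Number of messages emitted but neither acked nor failed yet. */
    int inFlight() { return pending.size(); }

    public static void main(String[] args) {
        ReplayingSource source = new ReplayingSource(Arrays.asList("the cow", "the dog"));
        Emitted first = source.next();
        source.fail(first.id);           // e.g. the worker processing it was killed
        Emitted replay = source.next();  // the failed line is emitted again before "the dog"
        System.out.println(replay.payload);
        source.ack(replay.id);
    }
}
```

Note that the pending map here lives in memory, so it only survives as long as the spout's own process. If that process dies, replay has to come from a durable source such as the message queue, which is the case Michael Rose describes.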

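Nathan Leung's point about unique message IDs can be illustrated without Storm. The sketch below assumes a spout that tracks its in-flight lines in a map keyed by message ID; that is an assumption about how the spout in question is written, not something the thread confirms. `MessageIdDemo` and its two methods are hypothetical names.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical demo: how the choice of message ID affects a spout's pending map. */
final class MessageIdDemo {
    /** Uses the line itself as the ID: a repeated line overwrites the earlier entry. */
    static int pendingKeyedByLine(List<String> lines) {
        Map<Object, String> pending = new HashMap<>();
        for (String line : lines) {
            pending.put(line, line);
        }
        return pending.size();
    }

    /** Uses a per-emit sequence number as the ID: every emit gets its own entry. */
    static int pendingKeyedBySequence(List<String> lines) {
        Map<Object, String> pending = new HashMap<>();
        long seq = 0;
        for (String line : lines) {
            pending.put(seq++, line);
        }
        return pending.size();
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("the cow jumped", "an apple a day", "the cow jumped");
        System.out.println(pendingKeyedByLine(lines));      // prints 2
        System.out.println(pendingKeyedBySequence(lines));  // prints 3
    }
}
```

With the line as the key, three emits leave only two pending entries. An ack for one copy of the repeated line then removes the only entry, so a later fail for the other copy finds nothing to replay.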