Subject: Re: Distribute Spout output among all bolts
From: Tomas Mazukna <tomas.mazukna@gmail.com>
To: user@storm.incubator.apache.org
Date: Wed, 16 Jul 2014 21:21:23 -0400

The Kafka client handles that; the offset is stored in ZooKeeper along
with the consumer group. I wrote a Kafka spout based on the Kafka
consumer-group API. Kafka allows only one consumer per partition per
group.

On Wed, Jul 16, 2014 at 8:41 PM, Andrew Xor <andreas.grammenos@gmail.com> wrote:

Ok, but at runtime how do you set which Kafka partition the spout
subscribes to?

Kindly yours,

Andrew Grammenos

-- PGP PKey --
https://www.dropbox.com/s/ei2nqsen641daei/pgpsig.txt

On Thu, Jul 17, 2014 at 3:30 AM, Tomas Mazukna <tomas.mazukna@gmail.com> wrote:

So you want to define only one instance of the spout that reads the file.
The number of bolts depends only on how fast you need to process the data.
I have a topology whose spout has a parallelism of 40, connected to the 40
partitions of a Kafka topic. It sends traffic to the first bolt, which has
parallelism 320. The whole topology is split across 4 workers: that makes
10 spout instances in each JVM, feeding 80 bolts. In my case I also use a
grouping so tuples get routed to different physical machines.

Tomas
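For reference, that wiring looks roughly like the minimal sketch below,
assuming Storm 0.9.x (backtype.storm packages). KafkaTopicSpout and
ProcessingBolt are placeholder names for illustration, not the actual
classes from this topology:

    import backtype.storm.Config;
    import backtype.storm.StormSubmitter;
    import backtype.storm.topology.TopologyBuilder;

    public class ExampleTopology {
        public static void main(String[] args) throws Exception {
            TopologyBuilder builder = new TopologyBuilder();

            // 40 spout executors, one per Kafka partition
            // (KafkaTopicSpout is a hypothetical stand-in).
            builder.setSpout("kafka-spout", new KafkaTopicSpout(), 40);

            // 320 bolt executors; shuffle grouping spreads the stream
            // evenly across them (ProcessingBolt is also a placeholder).
            builder.setBolt("first-bolt", new ProcessingBolt(), 320)
                   .shuffleGrouping("kafka-spout");

            Config conf = new Config();
            conf.setNumWorkers(4); // 4 JVMs

            StormSubmitter.submitTopology("kafka-example", conf,
                    builder.createTopology());
        }
    }

With 4 workers and 40 + 320 executors, Storm spreads the executors evenly
over the JVMs, which is where the 10 spouts and 80 bolts per JVM come from.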
On Wed, Jul 16, 2014 at 8:10 PM, Andrew Xor <andreas.grammenos@gmail.com> wrote:

Michael,

Thanks for the response, but I think another problem arises: I just cooked
up a small example, and increasing the number of workers only spawns
mirrors of the topology. This is a problem for me because my spout reads a
very big file, converts each line into a tuple, and feeds it into the
topology. What I wanted in the first place is to send each produced tuple
to a different subscribed bolt each time (using round robin or something
similar), so that each of them gets 1/n of the input stream (where n is
the number of bolts). If I spawn 2 workers, both will read the same file
and emit the same tuples, so both topology workers will produce the same
results.

I wanted to avoid creating a spout that takes a file offset as input and
wiring up a lot more than I have to, so I was trying to find an elegant
and scalable way to do this... so far I have found nil.

On Thu, Jul 17, 2014 at 2:57 AM, Michael Rose <michael@fullcontact.com> wrote:

It doesn't say so, but if you have 4 workers, the 4 executors will be
shared evenly over the 4 workers. Likewise, 16 executors will partition
4 to each. The only case where a worker will not get a specific executor
is when there are fewer executors than workers (e.g. 8 workers, 4
executors): 4 of the workers will receive an executor but the others
will not.

It sounds like for your case, shuffle grouping plus parallelism is more
than sufficient.

Michael Rose (@Xorlev)
Senior Platform Engineer, FullContact
michael@fullcontact.com

On Wed, Jul 16, 2014 at 5:53 PM, Andrew Xor <andreas.grammenos@gmail.com> wrote:

Hey Stephen, Michael,

Yeah, I feared as much... searching the docs and API did not surface any
reliable and elegant way of doing that unless you had a "RouterBolt". If
setting the parallelism of a component is enough to load-balance the
processes across the different machines of the Storm cluster, then that
would suffice for my use case. The documentation does say executors are
threads, though, and it does not explicitly say anywhere that those
threads are spawned across different nodes of the cluster... I want to
rule out the possibility of these threads spawning only locally rather
than distributed among the cluster nodes.

Andrew.

On Thu, Jul 17, 2014 at 2:46 AM, Michael Rose <michael@fullcontact.com> wrote:

Maybe we can help with your topology design if you let us know what you're
doing that requires you to shuffle half of the whole stream output to each
of two different types of bolts.

If bolt b1 and bolt b2 are both instances of ExampleBolt (and not two
different types), as above, there's no point in doing this. Setting the
parallelism will make sure that data is partitioned across machines (by
default, setting parallelism sets tasks = executors = parallelism).

Unfortunately, I don't know of any way to do this other than shuffling the
output to a new bolt, e.g. a bolt "b0" (a "RouterBolt"), then having b0
round-robin the received tuples between two streams, and having b1 and b2
shuffle over those streams instead.

Michael Rose (@Xorlev)
Senior Platform Engineer, FullContact
michael@fullcontact.com
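A minimal sketch of that RouterBolt idea, assuming tuples that carry a
single field named "line"; the class, stream, and field names here are
illustrative, not from the thread:

    import backtype.storm.topology.BasicOutputCollector;
    import backtype.storm.topology.OutputFieldsDeclarer;
    import backtype.storm.topology.base.BaseBasicBolt;
    import backtype.storm.tuple.Fields;
    import backtype.storm.tuple.Tuple;
    import backtype.storm.tuple.Values;

    public class RouterBolt extends BaseBasicBolt {
        private boolean toggle = false;

        @Override
        public void execute(Tuple tuple, BasicOutputCollector collector) {
            // Alternate between the two declared streams, round-robin style.
            String stream = toggle ? "stream-a" : "stream-b";
            toggle = !toggle;
            collector.emit(stream, new Values(tuple.getString(0)));
        }

        @Override
        public void declareOutputFields(OutputFieldsDeclarer declarer) {
            declarer.declareStream("stream-a", new Fields("line"));
            declarer.declareStream("stream-b", new Fields("line"));
        }
    }

b1 and b2 would then each subscribe to one of the two streams:

    builder.setBolt("b0", new RouterBolt(), 1).shuffleGrouping("spout1");
    builder.setBolt("b1", new ExampleBolt()).shuffleGrouping("b0", "stream-a");
    builder.setBolt("b2", new ExampleBolt()).shuffleGrouping("b0", "stream-b");

Note that b0 needs a parallelism of 1 for strict alternation; with more
instances the split is still roughly even, just not tuple-by-tuple.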
bolt "b0" a 'RouterBolt', t= hen >>>>>> having bolt b0 round-robin the received tuples between two streams, = then >>>>>> have b1 and b2 shuffle over those streams instead. >>>>>> >>>>>> >>>>>> >>>>>> Michael Rose (@Xorlev ) >>>>>> Senior Platform Engineer, FullContact >>>>>> michael@fullcontact.com >>>>>> >>>>>> >>>>>> On Wed, Jul 16, 2014 at 5:40 PM, Andrew Xor < >>>>>> andreas.grammenos@gmail.com> wrote: >>>>>> >>>>>>> =E2=80=8B >>>>>>> Hi Tomas, >>>>>>> >>>>>>> As I said in my previous mail the grouping is for a bolt *task* no= t >>>>>>> for the actual number of spawned bolts; for example let's say you h= ave two >>>>>>> bolts that have a parallelism hint of 3 and these two bolts are wir= ed to >>>>>>> the same spout. If you set the bolts as such: >>>>>>> >>>>>>> tb.setBolt("b1", new ExampleBolt(), 2 /* p-hint >>>>>>> */).shuffleGrouping("spout1"); >>>>>>> tb.setBolt("b2", new ExampleBolt(), 2 /* p-hint >>>>>>> */).shuffleGrouping("spout1"); >>>>>>> >>>>>>> Then each of the tasks will receive half of the spout tuples but >>>>>>> each actual spawned bolt will receive all of the tuples emitted fro= m the >>>>>>> spout. This is more evident if you set up a counter in the bolt cou= nting >>>>>>> how many tuples if has received and testing this with no parallelis= m hint >>>>>>> as such: >>>>>>> >>>>>>> tb.setBolt("b1", new ExampleBolt(),).shuffleGrouping("spout1"); >>>>>>> tb.setBolt("b2", new ExampleBolt()).shuffleGrouping("spout1"); >>>>>>> >>>>>>> Now you will see that both bolts will receive all tuples emitted by >>>>>>> spout1. >>>>>>> >>>>>>> Hope this helps. >>>>>>> >>>>>>> =E2=80=8B >>>>>>> =E2=80=8BAndrew.=E2=80=8B >>>>>>> >>>>>>> >>>>>>> On Thu, Jul 17, 2014 at 2:33 AM, Tomas Mazukna < >>>>>>> tomas.mazukna@gmail.com> wrote: >>>>>>> >>>>>>>> Andrew, >>>>>>>> >>>>>>>> when you connect your bolt to your spout you specify the grouping. >>>>>>>> If you use shuffle grouping then any free bolt gets the tuple - in= my >>>>>>>> experience even in lightly loaded topologies the distribution amon= gst bolts >>>>>>>> is pretty even. If you use all grouping then all bolts receive a c= opy of >>>>>>>> the tuple. >>>>>>>> Use shuffle grouping and each of your bolts will get about 1/3 of >>>>>>>> the workload. >>>>>>>> >>>>>>>> Tomas >>>>>>>> >>>>>>>> >>>>>>>> On Wed, Jul 16, 2014 at 7:05 PM, Andrew Xor < >>>>>>>> andreas.grammenos@gmail.com> wrote: >>>>>>>> >>>>>>>>> H >>>>>>>>> =E2=80=8Bi, >>>>>>>>> >>>>>>>>> I am trying to distribute the spout output to it's subscribed >>>>>>>>> bolts evenly; let's say that I have a spout that emits tuples and= three >>>>>>>>> bolts that are subscribed to it. I want each of the three bolts t= o receive >>>>>>>>> 1/3 rth of the output (or emit a tuple to each one of these bolts= in >>>>>>>>> turns). Unfortunately as far as I understand all bolts will recei= ve all of >>>>>>>>> the emitted tuples of that particular spout regardless of the gro= uping >>>>>>>>> defined (as grouping from my understanding is for bolt *tasks* no= t actual >>>>>>>>> bolts). >>>>>>>>> >>>>>>>>> I've searched a bit and I can't seem to find a way to accomplish >>>>>>>>> that...=E2=80=8B is there a way to do that or I am searching in v= ain? >>>>>>>>> >>>>>>>>> Thanks. 
On Thu, Jul 17, 2014 at 2:33 AM, Tomas Mazukna <tomas.mazukna@gmail.com> wrote:

Andrew,

When you connect your bolt to your spout, you specify the grouping. If you
use shuffle grouping, then any free bolt gets the tuple; in my experience,
even in lightly loaded topologies the distribution amongst bolts is pretty
even. If you use all grouping, then all bolts receive a copy of the tuple.
Use shuffle grouping and each of your bolts will get about 1/3 of the
workload.

Tomas

On Wed, Jul 16, 2014 at 7:05 PM, Andrew Xor <andreas.grammenos@gmail.com> wrote:

Hi,

I am trying to distribute the spout output evenly among its subscribed
bolts. Let's say I have a spout that emits tuples and three bolts
subscribed to it; I want each of the three bolts to receive 1/3 of the
output (or, equivalently, to emit a tuple to each of these bolts in turn).
Unfortunately, as far as I understand, all bolts will receive all of the
tuples emitted by that particular spout regardless of the grouping defined
(since grouping, from my understanding, applies to bolt *tasks*, not
actual bolts).

I've searched a bit and I can't seem to find a way to accomplish that...
is there a way to do it, or am I searching in vain?

Thanks.

--
Tomas Mazukna
678-557-3834