From: Nathan Leung <ncleung@gmail.com>
To: user@storm.incubator.apache.org
Date: Thu, 3 Apr 2014 13:52:42 -0400
Subject: Re: Basic storm question

tasks are run serially by the executor.
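
For intuition only, a minimal conceptual sketch in plain Java (this is not Storm's actual executor code; every name here is made up) of what "serially" means: one executor thread owns several task instances and feeds them one tuple at a time.

    import java.util.List;
    import java.util.concurrent.BlockingQueue;

    // Conceptual sketch: one executor thread draining its queue and dispatching
    // each tuple to one of the tasks it hosts, strictly one call at a time.
    class ExecutorSketch implements Runnable {
        interface TupleLike { int targetTaskIndex(); }
        interface BoltTask { void execute(TupleLike t); }

        private final List<BoltTask> tasks;            // several task instances of the same component
        private final BlockingQueue<TupleLike> queue;  // tuples routed to this executor

        ExecutorSketch(List<BoltTask> tasks, BlockingQueue<TupleLike> queue) {
            this.tasks = tasks;
            this.queue = queue;
        }

        public void run() {
            while (!Thread.currentThread().isInterrupted()) {
                try {
                    TupleLike t = queue.take();
                    // The grouping has already decided which task gets the tuple;
                    // the call runs on this thread, so tasks never run concurrently
                    // within one executor.
                    tasks.get(t.targetTaskIndex()).execute(t);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        }
    }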


On Thu, Apr 3, 2014 at 1:42 PM, Huiliang Zhang <zhlntu@gmail.com> wrote:
Thanks. But how are the multiple tasks executed inside a single executor thread? In sequential order, one by one, or does the executor thread spawn new threads for each task?


On Thu, Apr 3, 2014 at 10:34 AM, Nathan Leung <ncleung@gmail.com> wrote:
by default each task is executed by 1 executor, but if the number of tasks is greater than the number of executors, then each executor (thread) will execute more than one task. Note that when rebalancing a topology, you can change the number of executors and the number of workers, but not the number of tasks.
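
To make that concrete, a small hedged sketch (component names, counts, and the topology name are made up; written against the 0.9.x backtype.storm API) of declaring more tasks than executors:

    import backtype.storm.Config;
    import backtype.storm.StormSubmitter;
    import backtype.storm.testing.TestWordSpout;
    import backtype.storm.topology.BasicOutputCollector;
    import backtype.storm.topology.OutputFieldsDeclarer;
    import backtype.storm.topology.TopologyBuilder;
    import backtype.storm.topology.base.BaseBasicBolt;
    import backtype.storm.tuple.Tuple;

    public class ParallelismSketch {
        // trivial bolt; stands in for any real bolt
        public static class NoopBolt extends BaseBasicBolt {
            public void execute(Tuple tuple, BasicOutputCollector collector) { }
            public void declareOutputFields(OutputFieldsDeclarer declarer) { }
        }

        public static void main(String[] args) throws Exception {
            TopologyBuilder builder = new TopologyBuilder();
            builder.setSpout("word", new TestWordSpout(), 4);   // 4 executors, 4 tasks
            builder.setBolt("noop", new NoopBolt(), 2)          // 2 executors ...
                   .setNumTasks(6)                              // ... but 6 tasks: 3 tasks per executor thread
                   .shuffleGrouping("word");

            Config conf = new Config();
            conf.setNumWorkers(2);                               // worker processes (JVMs)
            StormSubmitter.submitTopology("parallelism-sketch", conf, builder.createTopology());
            // Executors and workers can later be changed, e.g.
            //   storm rebalance parallelism-sketch -n 4 -e noop=6
            // but the task count fixed at submission time cannot.
        }
    }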


On Thu, Apr 3, 2014 at 1:31 PM, Huiliang Zhang <zhlntu@gmail.com> wrote:
http://www.michael-noll.com/blog/2012/10/16/understanding-the-parallelism-of-a-storm-topology/ is a very good article about the running of a topology. I have another question:

Since an executor is in fact a thread in the worker process, what is a task inside an executor thread? We can see that there may be several tasks for the same component inside a single executor thread. How will multiple tasks be executed inside the executor thread?


On Wed, Apr 2, 2014 at 9:25 PM, padma priya chitturi <padmapriya30@gmail.com> wrote:
Hi,

This is how you should run nimbus/supervisor:

/bin$./storm nimbus
/bin$./storm supervisor


On Wed, Apr 2, 2014 at 11:42 PM, Leonardo Bohac <leonardo.bohac@gmail.com> wrote:
Hello, I've downloaded the latest version of storm at http://storm.incubator.apache.org/downloads.html and when I try to run the /bin/storm nimbus command I get the following message:

The storm client can only be run from within a release. You appear to be trying to run the client from a checkout of Storm's source code.

You can download a Storm release at http://storm-project.net/downloads.html



I don't know what's missing...


Thanks!
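
For anyone hitting the same message: the storm client has to be started from an unpacked binary release, not from a git checkout of the source. A rough sketch, with the archive name left as a placeholder:

    # download a release archive from the downloads page, then:
    unzip apache-storm-<version>.zip        # or: tar -xzf apache-storm-<version>.tar.gz
    cd apache-storm-<version>
    bin/storm nimbus
    bin/storm supervisor
    bin/storm ui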


2014-04-02 15:05 GMT-03:00 Nathan Leung <ncleung@gmail.com>:

No, it creates an extra executor to deal with processing the ack messages that are sent by the bolts after processing tuples. See the following for details on how acking works in storm: https://github.com/nathanmarz/storm/wiki/Guaranteeing-message-processing. By default storm will create 1 acker per worker you have in your topology.
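
As a hedged sketch of what this looks like in code (the bolt name is made up; written against the 0.9.x backtype.storm API): even a terminal bolt calls ack(), and those acks are what the acker executor processes. The acker count can also be overridden in the topology configuration.

    import backtype.storm.Config;
    import backtype.storm.task.OutputCollector;
    import backtype.storm.task.TopologyContext;
    import backtype.storm.topology.OutputFieldsDeclarer;
    import backtype.storm.topology.base.BaseRichBolt;
    import backtype.storm.tuple.Tuple;
    import java.util.Map;

    // A terminal bolt: consumes tuples, emits nothing further, but still acks.
    public class TerminalBolt extends BaseRichBolt {
        private OutputCollector collector;

        public void prepare(Map conf, TopologyContext context, OutputCollector collector) {
            this.collector = collector;
        }

        public void execute(Tuple input) {
            // ... do the final work here ...
            collector.ack(input);   // tells the acker the tuple tree is complete at this bolt
        }

        public void declareOutputFields(OutputFieldsDeclarer declarer) {
            // no output stream
        }
    }

    // At submission time (defaults to one acker executor per worker):
    //   Config conf = new Config();
    //   conf.put(Config.TOPOLOGY_ACKER_EXECUTORS, 1);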


On Wed, Apr 2, 2014 at 2:01 PM, Huiliang Zhang <zhlntu@gmail.com> wrote:
Hi Nathan,

The last bolt just emits the tuples, and no further bolt in the topology will consume and ack the tuples. Do you mean that storm automatically creates an extra executor to deal with the tuples?

Thanks,
Huiliang


On Wed, Apr 2, 2014 at 8:31 AM, Nathan Leung <ncleung@gmail.com> wrote:
the extra task/executor is the acker thread.


On Tue, Apr 1, 2014 at 9:23 PM, Huiliang Zhang <zhlntu@gmail.com> wrote:
I just submitted ExclamationTopology for testing.

    builder.setSpout("word", new TestWordSpout(), 10);

    builder.setBolt("exclaim1", new ExclamationBolt(), 3).shuffleGrouping("word");

    builder.setBolt("exclaim2", new ExclamationBolt(), 2).shuffleGrouping("exclaim1");

I am supposed to see 15 executors. However, I see 16 executors and 16 tasks in the topology summary on the Storm UI. The numbers of executors are correct for the specific spout and bolts and aggregate to 15. Is that a bug in displaying the topology summary?

My cluster consists of 2 supervisors and each has 4 workers defined.


Thanks.
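
For reference, a minimal hedged completion of that snippet (the topology name is illustrative) showing where the 16th executor comes from:

    Config conf = new Config();
    conf.setNumWorkers(1);   // topology.workers defaults to 1
    // 10 (spout) + 3 + 2 (bolts) = 15 component executors, each running 1 task;
    // Storm adds 1 acker executor per worker, so the UI reports 16 executors and 16 tasks.
    StormSubmitter.submitTopology("exclamation", conf, builder.createTopology());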



On Tue, Apr 1, 2014 at 1:43 PM, Nathan Leung <ncleung@gmail.com> wrote:
By default supervisor nodes can run up to 4 workers. This is configurable in storm.yaml (for example see supervisor.slots.ports here: https://github.com/nathanmarz/storm/blob/master/conf/defaults.yaml). Memory should be split between the workers. It's a typical Java heap, so anything running on that worker process shares the heap.
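
As a hedged illustration (the ports shown are the shipped defaults; the heap size is only a placeholder, not a recommendation), the relevant storm.yaml entries on a supervisor node look roughly like this:

    supervisor.slots.ports:        # one worker slot per port; 4 ports = up to 4 workers on this node
        - 6700
        - 6701
        - 6702
        - 6703
    worker.childopts: "-Xmx1024m"  # JVM options for every worker process; that heap is shared
                                   # by all executors/tasks the worker hosts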


On Tue, Apr 1, 2014 at 4:10 PM, David Crossland <david@elastacloud.com> wrote:
On said subject, how does memory allocation work in these cases? Assuming 1 worker per node, would you just dump all the memory available into worker.childopts? I guess the memory pool would be shared between the spawned threads as appropriate to their needs?

I'm assuming the equivalent options for supervisor/nimbus are fine left at defaults. Given that the workers/spouts/bolts are the working parts of the topology, these would be where I should target available memory?

D

From: Huiliang Zhang
Sent: Tuesday, 1 April 2014 19:47
To: user@storm.incubator.apache.org

Thanks. It would be good if there were some example figures explaining the relationship between tasks, workers, and threads.
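
As a rough worked example using the ExclamationTopology numbers from earlier in the thread: 10 spout executors + 3 + 2 bolt executors = 15 executors, each running exactly one task because no setNumTasks() was given. Storm adds one acker executor per worker, and with topology.workers left at its default of 1, that single worker JVM hosts all 16 executor threads. Submitted with 3 workers instead, there would be 15 component executors plus 3 ackers spread across the 3 worker processes, roughly 6 threads each.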


On Sat, Mar 29, 2014 at 6:34 AM, Susheel Kumar Gadalay <skgadalay@gmail.com> wrote:
No, a single worker is dedicated to a single topology no matter how
many threads it spawns for different bolts/spouts.
A single worker cannot be shared across multiple topologies.

On 3/29/14, Nathan Leung <ncleung@gmail.com> wrote:
> From what I have seen, the second topology is run with 1 worker until you
> kill the first topology or add more worker slots to your cluster.
>
>
> On Sat, Mar 29, 2014 at 2:57 AM, Huiliang Zhang <zhlntu@gmail.com> wrote:
>
>> Thanks. I am still not clear.
>>
>> Do you mean that in a single worker process, there will be multiple
>> threads and each thread will handle part of a topology? If so, what does
>> the number of workers mean when submitting topology?
>>
>>
>> On Fri, Mar 28, 2014 at 11:18 PM, padma priya chitturi <
>> padmapriya30@gmail.com> wrote:
>>
>>> Hi,
>>>
>>> No, it's not the case. No matter how many topologies you submit, the
>>> workers will be shared among the topologies.
>>>
>>> Thanks,
>>> Padma Ch
>>>
>>>
>>> On Sat, Mar 29, 2014 at 5:11 AM, Huiliang Zhang <zhlntu@gmail.com>
>>> wrote:
>>>
>>>> Hi,
>>>>
>>>> I have a simple question about storm.
>>>>
>>>> My cluster has just 1 supervisor and 4 ports are defined to run 4
>>>> workers. I first submit a topology which needs 3 workers. Then I submit
>>>> another topology which needs 2 workers. Does this mean that the 2nd
>>>> topology will never be run?
>>>>
>>>> Thanks,
>>>> Huiliang
>>>>
>>>
>>>
>>
>











