storm-user mailing list archives

From Nathan Leung <ncle...@gmail.com>
Subject Re: Basic storm question
Date Thu, 03 Apr 2014 18:34:34 GMT
It allows you to rebalance to more executors if you increase the cluster size
in the future.  The number of tasks cannot be changed because certain features
(e.g. fieldsGrouping) would be extremely difficult to implement correctly if
the number of tasks could change.
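
A minimal sketch of what that looks like in topology code, assuming the
backtype.storm API of this era.  TestWordSpout and TestWordCounter are the
stock classes from backtype.storm.testing; the component ids, counts, and
topology name are illustrative placeholders, not from this thread.  The
parallelism hint sets the initial executor count, while setNumTasks fixes
the task count for the topology's lifetime:

    import backtype.storm.Config;
    import backtype.storm.StormSubmitter;
    import backtype.storm.testing.TestWordCounter;
    import backtype.storm.testing.TestWordSpout;
    import backtype.storm.topology.TopologyBuilder;
    import backtype.storm.tuple.Fields;

    public class RebalanceSketch {
        public static void main(String[] args) throws Exception {
            TopologyBuilder builder = new TopologyBuilder();
            // 2 executors (threads) to start, but 8 tasks: the task count is
            // fixed at submission, leaving headroom to scale executors later.
            builder.setSpout("words", new TestWordSpout(), 2).setNumTasks(8);
            builder.setBolt("counter", new TestWordCounter(), 2)
                   .setNumTasks(8)
                   .fieldsGrouping("words", new Fields("word"));

            Config conf = new Config();
            conf.setNumWorkers(2);
            StormSubmitter.submitTopology("word-count", conf,
                                          builder.createTopology());
        }
    }

    // After adding nodes, raise executors (never tasks) without resubmitting:
    //   storm rebalance word-count -n 4 -e counter=8

Because the task count is frozen, each task's share of the key space under
fieldsGrouping never moves, which is exactly why rebalance may change the
number of executors and workers but not the number of tasks.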


On Thu, Apr 3, 2014 at 2:31 PM, Huiliang Zhang <zhlntu@gmail.com> wrote:

> Thanks. Since the multiple tasks inside an executor all belong to the same
> spout or bolt, is this feature of multiple tasks only useful for some
> special cases?
>
>
> On Thu, Apr 3, 2014 at 10:52 AM, Nathan Leung <ncleung@gmail.com> wrote:
>
>> tasks are run serially by the executor.
>>
>>
>> On Thu, Apr 3, 2014 at 1:42 PM, Huiliang Zhang <zhlntu@gmail.com> wrote:
>>
>>> Thanks. But how are the multiple tasks executed inside a single executor
>>> thread? In sequential order, one by one, or does the executor thread
>>> spawn a new thread for each task?
>>>
>>>
>>> On Thu, Apr 3, 2014 at 10:34 AM, Nathan Leung <ncleung@gmail.com> wrote:
>>>
>>>> by default each task is executed by 1 executor, but if the number of
>>>> tasks is greater than the number of executors, then each executor (thread)
>>>> will execute more than one task.  Note that when rebalancing a topology,
>>>> you can change the number of executors and the number of workers, but not
>>>> the number of tasks.
>>>>
>>>>
>>>> On Thu, Apr 3, 2014 at 1:31 PM, Huiliang Zhang <zhlntu@gmail.com> wrote:
>>>>
>>>>>
>>>>> http://www.michael-noll.com/blog/2012/10/16/understanding-the-parallelism-of-a-storm-topology/
>>>>> is a very good article about how a topology runs. I have another question:
>>>>>
>>>>> Since an executor is in fact a thread in the worker process, what is a
>>>>> task inside an executor thread? We can see that there may be several
>>>>> tasks for the same component inside a single executor thread. How will
>>>>> multiple tasks be executed inside the executor thread?
>>>>>
>>>>>
>>>>> On Wed, Apr 2, 2014 at 9:25 PM, padma priya chitturi <
>>>>> padmapriya30@gmail.com> wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> This is how you should run nimbus/supervisor:
>>>>>>
>>>>>> /bin$./storm nimbus
>>>>>> /bin$./storm supervisor
>>>>>>
>>>>>>
>>>>>> On Wed, Apr 2, 2014 at 11:42 PM, Leonardo Bohac <
>>>>>> leonardo.bohac@gmail.com> wrote:
>>>>>>
>>>>>>> Hello, I've downloaded the latest version of storm at
>>>>>>> http://storm.incubator.apache.org/downloads.html and when I try to
>>>>>>> do the */bin/storm nimbus* command I get the following message:
>>>>>>>
>>>>>>> *The storm client can only be run from within a release. You appear
>>>>>>> to be trying to run the client from a checkout of Storm's source code.*
>>>>>>>
>>>>>>> *You can download a Storm release at
>>>>>>> http://storm-project.net/downloads.html*
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> I don't know what's missing...
>>>>>>>
>>>>>>>
>>>>>>> Thanks!
>>>>>>>
>>>>>>>
>>>>>>> 2014-04-02 15:05 GMT-03:00 Nathan Leung <ncleung@gmail.com>:
>>>>>>>
>>>>>>>> No, it creates an extra executor to deal with processing the ack
>>>>>>>> messages that are sent by the bolts after processing tuples. See the
>>>>>>>> following for details on how acking works in storm:
>>>>>>>> https://github.com/nathanmarz/storm/wiki/Guaranteeing-message-processing.
>>>>>>>> By default storm will create 1 acker per worker you have in your
>>>>>>>> topology. (A hedged config sketch of this acker accounting follows
>>>>>>>> the quoted thread below.)
>>>>>>>>
>>>>>>>>
>>>>>>>> On Wed, Apr 2, 2014 at 2:01 PM, Huiliang Zhang <zhlntu@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Hi Nathan,
>>>>>>>>>
>>>>>>>>> The last bolt just emits the tuples, and no other bolt in the
>>>>>>>>> topology will consume and ack them. Do you mean that storm
>>>>>>>>> automatically creates an extra executor to deal with those tuples?
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Huiliang
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Wed, Apr 2, 2014 at 8:31 AM, Nathan Leung <ncleung@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> the extra task/executor is the acker thread.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Tue, Apr 1, 2014 at 9:23 PM, Huiliang Zhang <zhlntu@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> I just submitted ExclamationTopology for testing.
>>>>>>>>>>>
>>>>>>>>>>>     builder.setSpout("word", new TestWordSpout(), 10);
>>>>>>>>>>>
>>>>>>>>>>>     builder.setBolt("exclaim1", new ExclamationBolt(), 3).shuffleGrouping("word");
>>>>>>>>>>>
>>>>>>>>>>>     builder.setBolt("exclaim2", new ExclamationBolt(), 2).shuffleGrouping("exclaim1");
>>>>>>>>>>>
>>>>>>>>>>> I expected to see 15 executors. However, I see 16 executors and
>>>>>>>>>>> 16 tasks in the topology summary on the Storm UI. The executor
>>>>>>>>>>> counts for the specific spout and bolts are correct and add up to
>>>>>>>>>>> 15. Is that a bug in the topology summary display?
>>>>>>>>>>>
>>>>>>>>>>> My cluster consists of 2 supervisors, each with 4 workers defined.
>>>>>>>>>>>
>>>>>>>>>>> Thanks.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Tue, Apr 1, 2014 at 1:43 PM, Nathan Leung <ncleung@gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> By default supervisor nodes can run up to 4 workers. This is
>>>>>>>>>>>> configurable in storm.yaml (for example, see supervisor.slots.ports
>>>>>>>>>>>> here: https://github.com/nathanmarz/storm/blob/master/conf/defaults.yaml).
>>>>>>>>>>>> Memory should be split between the workers. It's a typical Java
>>>>>>>>>>>> heap, so anything running on that worker process shares the heap.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Tue, Apr 1, 2014 at 4:10 PM, David Crossland
>>>>>>>>>>>> <david@elastacloud.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> On said subject, how does memory allocation work in these cases?
>>>>>>>>>>>>> Assuming 1 worker per node, would you just dump all the available
>>>>>>>>>>>>> memory into worker.childopts? I guess the memory pool would be
>>>>>>>>>>>>> shared between the spawned threads as appropriate to their needs?
>>>>>>>>>>>>>
>>>>>>>>>>>>> I'm assuming the equivalent options for supervisor/nimbus are
>>>>>>>>>>>>> fine left at defaults. Given that the workers/spouts/bolts are
>>>>>>>>>>>>> the working parts of the topology, these would be where I should
>>>>>>>>>>>>> target available memory?
>>>>>>>>>>>>>
>>>>>>>>>>>>> D
>>>>>>>>>>>>>
>>>>>>>>>>>>>   *From:* Huiliang Zhang <zhlntu@gmail.com>
>>>>>>>>>>>>> *Sent:* Tuesday, 1 April 2014 19:47
>>>>>>>>>>>>> *To:* user@storm.incubator.apache.org
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks. It would be good if there were some example figures
>>>>>>>>>>>>> explaining the relationship between tasks, workers, and threads.
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Sat, Mar 29, 2014 at 6:34 AM, Susheel Kumar Gadalay
>>>>>>>>>>>>> <skgadalay@gmail.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> No, a single worker is dedicated to a single topology no matter
>>>>>>>>>>>>>> how many threads it spawns for different bolts/spouts. A single
>>>>>>>>>>>>>> worker cannot be shared across multiple topologies.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 3/29/14, Nathan Leung <ncleung@gmail.com> wrote:
>>>>>>>>>>>>>> > From what I have seen, the second topology is run with 1
>>>>>>>>>>>>>> > worker until you kill the first topology or add more worker
>>>>>>>>>>>>>> > slots to your cluster.
>>>>>>>>>>>>>> >
>>>>>>>>>>>>>> >
>>>>>>>>>>>>>> > On Sat, Mar 29, 2014 at 2:57 AM, Huiliang Zhang
>>>>>>>>>>>>>> > <zhlntu@gmail.com> wrote:
>>>>>>>>>>>>>> >
>>>>>>>>>>>>>> >> Thanks. I am still not clear.
>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>> >> Do you mean that in a single worker process, there will be
>>>>>>>>>>>>>> >> multiple threads and each thread will handle part of a
>>>>>>>>>>>>>> >> topology? If so, what does the number of workers mean when
>>>>>>>>>>>>>> >> submitting a topology?
>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>> >> On Fri, Mar 28, 2014 at 11:18 PM, padma priya chitturi
>>>>>>>>>>>>>> >> <padmapriya30@gmail.com> wrote:
>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>> >>> Hi,
>>>>>>>>>>>>>> >>>
>>>>>>>>>>>>>> >>> No, it's not the case. No matter how many topologies you
>>>>>>>>>>>>>> >>> submit, the workers will be shared among the topologies.
>>>>>>>>>>>>>> >>>
>>>>>>>>>>>>>> >>> Thanks,
>>>>>>>>>>>>>> >>> Padma Ch
>>>>>>>>>>>>>> >>>
>>>>>>>>>>>>>> >>>
>>>>>>>>>>>>>> >>> On Sat, Mar 29, 2014 at 5:11 AM, Huiliang Zhang
>>>>>>>>>>>>>> >>> <zhlntu@gmail.com> wrote:
>>>>>>>>>>>>>> >>>
>>>>>>>>>>>>>> >>>> Hi,
>>>>>>>>>>>>>> >>>>
>>>>>>>>>>>>>> >>>> I have a simple question about storm.
>>>>>>>>>>>>>> >>>>
>>>>>>>>>>>>>> >>>> My cluster has just 1 supervisor and 4 ports are defined
>>>>>>>>>>>>>> >>>> to run 4 workers. I first submit a topology which needs 3
>>>>>>>>>>>>>> >>>> workers. Then I submit another topology which needs 2
>>>>>>>>>>>>>> >>>> workers. Does this mean that the 2nd topology will never
>>>>>>>>>>>>>> >>>> be run?
>>>>>>>>>>>>>> >>>>
>>>>>>>>>>>>>> >>>> Thanks,
>>>>>>>>>>>>>> >>>> Huiliang
>>>>>>>>>>>>>> >>>>
>>>>>>>>>>>>>> >>>
>>>>>>>>>>>>>> >>>
>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
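
A hedged sketch of the acker accounting discussed in the quoted thread,
assuming the backtype.storm API: the ExclamationTopology above declares
10 + 3 + 2 = 15 component executors, and Storm adds acker executors on top
of that, which is why the UI reports 16. ExclamationBolt is assumed to be
the one from storm-starter's ExclamationTopology, on the classpath:

    import backtype.storm.Config;
    import backtype.storm.LocalCluster;
    import backtype.storm.testing.TestWordSpout;
    import backtype.storm.topology.TopologyBuilder;
    import storm.starter.ExclamationTopology.ExclamationBolt;  // assumed from storm-starter

    public class AckerCountSketch {
        public static void main(String[] args) throws Exception {
            TopologyBuilder builder = new TopologyBuilder();
            builder.setSpout("word", new TestWordSpout(), 10);
            builder.setBolt("exclaim1", new ExclamationBolt(), 3).shuffleGrouping("word");
            builder.setBolt("exclaim2", new ExclamationBolt(), 2).shuffleGrouping("exclaim1");

            Config conf = new Config();
            // 15 component executors + 1 acker (default: one per worker, and
            // this topology uses the default single worker) = 16 in the UI.
            // Setting the acker count to 0 would make the UI show 15, but it
            // also disables tuple tracking, so spout tuples are treated as
            // acked the moment they are emitted.
            conf.setNumAckers(1);

            new LocalCluster().submitTopology("exclamation", conf,
                                              builder.createTopology());
        }
    }

If you rely on guaranteed message processing, leave the acker count at its
default rather than setting it to 0.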
