storm-user mailing list archives

From jeff saremi <>
Subject RE: Choosing where your tasks run in Storm
Date Fri, 04 Jul 2014 23:12:37 GMT
Michael and Andrew, thanks so much.
Date: Fri, 4 Jul 2014 14:41:17 -0600
Subject: Re: Choosing where your tasks run in Storm

You can make it happen with a custom scheduler, see this article (sorry for mangling, getting
this link through SpamAssassin on the group was a nightmare):

<http> xumingming <dot> sinaapp <dotcom> <slash> 885/twitter-storm-how-to-develop-a-pluggable-scheduler/

But it's nothing I've seriously attempted before, and the existing schedulers are in Clojure.
It's not impossible to do for sure, but like Andrew said it might well just be easier to have
separate clusters that share ZK clusters.
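
As a rough illustration of the placement decision a custom scheduler would have to make, here is a minimal sketch in plain Java. It is not Storm code: the class `ClassAwarePlacement`, the method `place`, and the idea of tagging each host with a machine class ("small", "large", ...) are all assumptions made up for this example. In a real pluggable scheduler this logic would sit behind Storm's scheduler interface as described in the linked article, and that wiring is not shown.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical model only: none of these types or names are Storm classes.
final class ClassAwarePlacement {

    /**
     * Assigns each component to a host whose machine class matches the class
     * the component asks for, cycling round-robin over the matching hosts.
     *
     * @param componentClass component name -> required machine class (assumed tagging)
     * @param hostClass      host name -> machine class the host advertises (assumed tagging)
     * @return component name -> chosen host
     */
    static Map<String, String> place(Map<String, String> componentClass,
                                     Map<String, String> hostClass) {
        // Group hosts by machine class; TreeMap gives a stable, sorted host order.
        Map<String, List<String>> hostsByClass = new HashMap<>();
        for (Map.Entry<String, String> e : new TreeMap<>(hostClass).entrySet()) {
            hostsByClass.computeIfAbsent(e.getValue(), k -> new ArrayList<>())
                        .add(e.getKey());
        }

        Map<String, Integer> assignedPerClass = new HashMap<>();
        Map<String, String> placement = new LinkedHashMap<>();

        // Walk components in sorted order so the result is deterministic.
        for (Map.Entry<String, String> e : new TreeMap<>(componentClass).entrySet()) {
            String component = e.getKey();
            String wanted = e.getValue();
            List<String> candidates = hostsByClass.get(wanted);
            if (candidates == null) {
                throw new IllegalStateException(
                    "No host of class '" + wanted + "' for component " + component);
            }
            // merge() returns the updated count, so subtract 1 for a zero-based index.
            int index = assignedPerClass.merge(wanted, 1, Integer::sum) - 1;
            placement.put(component, candidates.get(index % candidates.size()));
        }
        return placement;
    }
}
```

The sketch only decides "which host for which component"; slot availability, worker counts, and reassignment after failures are left out, and those are a large part of why a real scheduler is more work than it looks.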

Michael Rose (@Xorlev)
Senior Platform Engineer, FullContact

On Fri, Jul 4, 2014 at 10:28 AM, Andrew Montalenti <> wrote:

I don't think this is possible right now, though I have thought about the same thing before.
It *might* be true that Storm's support for YARN could eventually lead to this kind of thing,
but I don't know much about it. For now, you're best off having separate Storm clusters for
different classes of machines. You could consider putting Kafka queues between them to ensure
cross-topology message reliability guarantees. (E.g., have your I/O-bound topology read from
Kafka and write to Kafka, and have your CPU-bound topology read from the Kafka topic produced
by the first topology.)
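
The shape of that two-cluster layout can be sketched in plain Java. This is only an analogy under stated assumptions: the in-memory `BlockingQueue` stands in for the Kafka topic, the two methods stand in for the two topologies, and every name here (`QueueDecoupledStages`, `ioStage`, `cpuStage`, the `END` sentinel) is invented for the example. No Kafka or Storm API is used, and an in-memory queue gives none of Kafka's durability.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical stand-in: the queue plays the role of the Kafka topic that
// sits between the I/O-bound topology and the CPU-bound topology.
final class QueueDecoupledStages {

    private static final String END = "__END__"; // sentinel marking end of input

    // Stage 1 (think: I/O-bound topology): produces records onto the "topic".
    static void ioStage(List<String> inputs, BlockingQueue<String> topic)
            throws InterruptedException {
        for (String input : inputs) {
            topic.put("fetched:" + input);
        }
        topic.put(END);
    }

    // Stage 2 (think: CPU-bound topology): consumes records until the sentinel.
    static List<String> cpuStage(BlockingQueue<String> topic)
            throws InterruptedException {
        List<String> results = new ArrayList<>();
        for (String msg = topic.take(); !msg.equals(END); msg = topic.take()) {
            results.add(msg.toUpperCase(Locale.ROOT));
        }
        return results;
    }

    // Runs the stages on separate threads; they share nothing but the queue.
    static List<String> run(List<String> inputs) {
        BlockingQueue<String> topic = new LinkedBlockingQueue<>();
        Thread producer = new Thread(() -> {
            try {
                ioStage(inputs, topic);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.start();
        try {
            List<String> results = cpuStage(topic);
            producer.join();
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while consuming", e);
        }
    }
}
```

Because the stages only meet at the queue, each one could run on its own class of machines, which is the point of the separate-clusters suggestion.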

--- Andrew Montalenti
Co-Founder & CTO

On Fri, Jul 4, 2014 at 7:59 AM, jeff saremi <> wrote:

I'm wondering if this concept applies to Storm and if there's a way to do this.

I'd like to limit the machines that certain spouts or bolts run on. There are many reasons
for this. But for one let's assume that I have a bolt that is just a proxy for some legacy
service. I want to monitor that service by way of the bolt and use it in my topology.

Another way of looking at it is that I want to have a topology that spans different "classes"
of machines.
Let's say I have 3 classes of machines: small, medium, and large. Some topologies are limited
to only one class of machines; however, some other topologies need to span two or more classes
of machines.

How can I do this in Storm?
