storm-user mailing list archives

From Rui Abreu <rui.ab...@gmail.com>
Subject Re: Nodes underutilized
Date Mon, 14 Sep 2020 14:28:31 GMT
Hi Thomas,

Which version of Storm are you using?

On Sun, 13 Sep 2020 at 20:23, Thomas L. Redman <tomredman@mchsi.com> wrote:

> Sorry, I had previously sent this from a different email address and was
> not sure how well that would work with this service, hence this re-send.
>
> I’m running Storm on a 3-node cluster with 32 physical cores in each node.
> I have a complex topology with one spout, which is a singleton, connected
> to several bolts, almost all of which do natural language processing.
> Most of these are pretty heavyweight. The input spout is easily capable of
> outpacing the downstream bolts. I get good performance, but on only one
> node, even though I specify 3 worker nodes for my topology. Storm UI
> indicates for any given component that the executors for that component on
> the idle machines have emitted very few tuples, and have transferred none!
>
> When I look at the machine usage with htop, I see that indeed only one of
> the nodes is really getting any usage at all. My heaviest computation nodes
> have a very high capacity value. But the machine which hosts the spout is
> pegged with significant load. I have used shuffle grouping almost
> exclusively (I prefer localOrShuffleGrouping), but that doesn’t help. I
> will have machines that are simply receiving few tuples to operate on, and
> those few tuples are not transferred (and I admit I don’t know quite what
> that means).
>
> So, I have two questions:
> 1) Why would a component on a node remote from the spout have a lower
> Emitted count, and have a Transferred count always at zero?
>
> 2) What might cause my high capacity (typically over 1) to not be
> offloaded to a more idle machine?
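
For reference on question 2, here is a minimal sketch of how the capacity
figure in Storm UI is usually described: the number of tuples a bolt executed
in the window, times its average execute latency, divided by the window
length. The class and method names below are hypothetical and not part of the
Storm API, the input numbers are made up for illustration, and the 10-minute
window is an assumption based on the default UI column.

```java
// Hypothetical helper for illustration only; not part of the Storm API.
final class CapacitySketch {

    // Fraction of the window a bolt spent executing tuples.
    // A value near or above 1.0 suggests the bolt is saturated.
    static double capacity(long executedCount, double avgExecuteLatencyMs, long windowMs) {
        if (windowMs <= 0) {
            throw new IllegalArgumentException("windowMs must be positive");
        }
        return executedCount * avgExecuteLatencyMs / windowMs;
    }

    public static void main(String[] args) {
        // Assumed 10-minute measurement window, in milliseconds.
        long windowMs = 10 * 60 * 1000L;

        // Made-up numbers: a bolt instance that receives most of the tuples
        // versus one on an idle machine that receives very few.
        System.out.println("busy bolt capacity: " + capacity(1200, 600.0, windowMs));
        System.out.println("idle bolt capacity: " + capacity(20, 600.0, windowMs));
    }
}
```

Read this way, a capacity over 1 says that the executors doing the work are
saturated; it does not by itself move tuples to executors on other machines,
which depends on how the groupings route them.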
