storm-user mailing list archives

From "Thomas L. Redman"
Subject Re: Nodes underutilized
Date Tue, 15 Sep 2020 17:39:08 GMT
2.2.0. I upgraded not long ago.

> On Sep 14, 2020, at 9:28 AM, Rui Abreu wrote:
> Hi Thomas,
> Which version of Storm are you using?
> On Sun, 13 Sep 2020 at 20:23, Thomas L. Redman wrote:
> Sorry, I had previously sent this from a different email address and was not sure how well
> that would work with this service, hence this re-send.
> I’m running Storm on a 3-node cluster with 32 physical cores in each node. I have a complex
> topology with one spout, which is a singleton, connected to several bolts, almost all of
> which do natural language processing. Most of these are pretty heavyweight. The input
> spout is easily capable of outpacing the downstream bolts. I get good performance, but on
> only one node, even though I specify 3 worker nodes for my topology. Storm UI indicates, for
> any given component, that the executors for that component on the idle machines have emitted
> very few tuples and have transferred none!
> When I look at the machine usage with htop, I see that indeed only one of the nodes is really
> getting any usage at all. My heaviest computation nodes have a very high capacity value. But
> the machine which hosts the spout is pegged with significant load. I have used shuffle
> grouping almost exclusively (I prefer localOrShuffleGrouping), but that doesn’t help. I will
> have machines that are simply receiving few tuples to operate on, and those few tuples are not
> transferred (and I admit I don’t know quite what that means).
> So, I have two questions:
> 1) Why would a component on a node remote from the spout have a lower Emitted count
> and a Transferred count that is always zero?
> 2) What might keep my high-capacity components (typically over 1) from being offloaded
> to a more idle machine?
