storm-user mailing list archives

From Kristopher Kane <kkane.l...@gmail.com>
Subject Increasing worker parallelism decreases throughput and increases tuple timeout
Date Tue, 06 Sep 2016 12:40:29 GMT
Hi everyone.

I have a simple topology that uses the Avro serializer (
https://github.com/apache/storm/blob/master/external/storm-hdfs/src/main/java/org/apache/storm/hdfs/avro/ConfluentAvroSerializer.java)
and writes to Elasticsearch.

The topology is like this:

Kafka (raw scheme) -> Avro deserializer -> Elasticsearch

This topology runs well with one worker. However, once I add a second
worker (two in total) and change nothing else, the topology's throughput
drops and tuples start timing out.

I've attached VisualVM/jstatd to the workers while running in multi-worker
mode, and added some JMX settings to the worker opts, but I am unable to see
anything glaring.

I've never seen Storm behave this way, but I have also never worked with a
custom serializer, so I assume it is the culprit, though I cannot explain why.

Any pointers would be appreciated.

Kris
