storm-user mailing list archives

From Stephen Powis <spo...@salesforce.com>
Subject Re: Regarding storm & Kafka Configuration.
Date Tue, 21 Nov 2017 04:47:11 GMT
1. Parallelism - Set this to at most 3, one spout instance per partition in
your topic.  Instances beyond the partition count would have no partition to
read from.  Matching the partition count is typically the fastest way to get
messages out of Kafka and into your topology, but run your own
tests/benchmarks to know for sure.
2. How many workers - This depends on what kind of work your topology is
doing.  Is it IO bound? Memory bound? CPU bound?
3. Max pending - Are you using timeouts/tracking tuples through your
topology?  Typically you want this high enough that your bolts are not
starved for work, but not so high that tuples sit queued and time out before
they can be processed.  The key point is that your "total tuples in flight"
equals (Number Of Spout Instances * Your Configured Max Spout Pending).  For
example, if you set max pending to 1000 and have 3 spout instances, you can
have ~3000 tuples in flight.
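
To make the arithmetic above concrete, here is a rough back-of-the-envelope
sketch in Python.  It is not Storm API code; the function names and the
throughput figure (500 tuples/sec) are made up for illustration, and the 30
second timeout is assumed to be Storm's default topology.message.timeout.secs.
Measure your own topology before relying on any of these numbers.

```python
def useful_spout_parallelism(parallelism_hint: int, partitions: int) -> int:
    """Spout instances that can actually be assigned a partition."""
    return min(parallelism_hint, partitions)


def max_tuples_in_flight(spout_instances: int, max_spout_pending: int) -> int:
    """Upper bound on un-acked tuples: instances * max spout pending."""
    return spout_instances * max_spout_pending


def drain_seconds(in_flight: int, tuples_per_second: float) -> float:
    """Rough time to work through a full in-flight backlog."""
    return in_flight / tuples_per_second


partitions = 3               # iot_gateway has 3 partitions
max_spout_pending = 1000     # example value from this thread
assumed_throughput = 500.0   # hypothetical: tuples/sec the bolts can ack
message_timeout_secs = 30    # assumed default message timeout

instances = useful_spout_parallelism(parallelism_hint=3, partitions=partitions)
in_flight = max_tuples_in_flight(instances, max_spout_pending)
backlog_secs = drain_seconds(in_flight, assumed_throughput)

print("spout instances:", instances)
print("max tuples in flight:", in_flight)
print("backlog drain time (s):", backlog_secs)
print("fits within timeout:", backlog_secs < message_timeout_secs)
```

If the drain time gets close to the message timeout, tuples at the back of the
queue are likely to time out before a bolt gets to them, which is a hint that
max pending is set too high for what the topology can actually process.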

On Tue, Nov 21, 2017 at 12:55 PM, Mahabaleshwar <
mahabaleshwar.n@trinitymobility.com> wrote:

> Hi,
>
>
>
> I am using a 3-node Kafka cluster and I have created one topic called
> iot_gateway with 3 partitions and a replication factor of 3. My question is
> about the Storm Kafka spout configuration:
>
> 1.       What parallelism hint should I give?
>
> 2.       How many workers should I use?
>
> 3.       What max pending messages value should I configure?
>
> 4.       How should I maintain the task-to-partition relation?
>
>
>
> I need your help friends.
>
>
>
> Thanks,
>
> Mahabaleshwar
>
>
>
