storm-user mailing list archives

From "Mahabaleshwar" <mahabaleshwa...@trinitymobility.com>
Subject RE: Regarding storm & Kafka Configuration.
Date Wed, 22 Nov 2017 05:56:36 GMT
Thanks for the great explanation, Stephen.

 

From: Stephen Powis [mailto:spowis@salesforce.com] 
Sent: 21 November 2017 10:17
To: user@storm.apache.org
Subject: Re: Regarding storm & Kafka Configuration.

 

 

 

1. Parallelism - Set this to at most 3, one spout instance for each partition in your topic.
Any spout instances beyond the partition count have no partition to read from and sit idle.
Matching the partition count is typically the fastest way to get messages out of Kafka and
into your topology, but run your own tests/benchmarks to know for sure.

2. How many workers - This probably depends on what kind of work your topology is doing.
Is it I/O bound, memory bound, or CPU bound?

3. Max pending - Are you using timeouts/tracking tuples through your topology? The setting
only takes effect when tuples are tracked. Typically you want this high enough that your
bolts are not starved for things to work on, but not so high that tuples queue up waiting
to be processed and time out before they can be worked on. The key point here is that your
"total tuples in flight" is equal to (Number Of Spout Instances * Your Configured Max Spout
Pending). For example, if you set max pending to 1000 and have 3 spout instances, you can
have ~3000 tuples in flight.
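
The arithmetic in points 1 and 3 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are made up for this example and are not part of Storm or Kafka, and the round-robin assignment is one simple way to picture how partitions could map to spout tasks, not necessarily what the Kafka spout does internally.

```python
# Illustrative sizing helpers. All names here are hypothetical and are
# not part of the Storm or Kafka APIs.

def effective_spout_instances(parallelism_hint, partitions):
    """Spout instances that can actually read data: at most one per partition."""
    return min(parallelism_hint, partitions)


def max_tuples_in_flight(spout_instances, max_spout_pending):
    """Upper bound on un-acked tuples: spout instances * max spout pending."""
    return spout_instances * max_spout_pending


def assign_partitions(partitions, spout_instances):
    """Round-robin picture of partition-to-task mapping (assumption only;
    the real spout's assignment strategy may differ)."""
    assignment = {task: [] for task in range(spout_instances)}
    for partition in range(partitions):
        assignment[partition % spout_instances].append(partition)
    return assignment


if __name__ == "__main__":
    partitions = 3        # the iot_gateway topic has 3 partitions
    max_pending = 1000    # example value from the message above

    for hint in (1, 3, 5):
        active = effective_spout_instances(hint, partitions)
        print("hint:", hint,
              "active instances:", active,
              "max in flight:", max_tuples_in_flight(active, max_pending),
              "assignment:", assign_partitions(partitions, hint))
```

With a hint of 5, the last two tasks in the assignment end up with an empty partition list, which is why going above the partition count buys nothing.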

 

On Tue, Nov 21, 2017 at 12:55 PM, Mahabaleshwar <mahabaleshwar.n@trinitymobility.com>
wrote:

Hi,

 

I am using a 3-node Kafka cluster and I have created one topic called iot_gateway with 3 partitions
and a replication factor of 3. My question is about the Storm Kafka spout configuration:

 

1.       What parallelism hint should I set?

2.       How many workers should I use?

3.       What max pending value should I configure?

4.       How should I maintain the relation between tasks and partitions?

 

I need your help, friends.

 

Thanks,

Mahabaleshwar

 

 

