spark-user mailing list archives

From "Chaudhary, Umesh" <Umesh.Chaudh...@searshc.com>
Subject RE: Optimizing Streaming from Websphere MQ
Date Mon, 15 Jun 2015 09:10:49 GMT
Hi Akhil,
Thanks for your response.
I have 10 cores in total across my 3 machines, and I am running 5-10 receivers.
I measured the number of records processed per second while varying the number of receivers.
With 10 receivers (i.e. one receiver per core), I see no performance benefit.
Could this be related to a bottleneck in MQ or in the Reliable Receiver?
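
One thing worth ruling out before blaming MQ: as far as I understand from the Spark Streaming guide, each receiver occupies one core for as long as it runs, so the application needs more cores than receivers. Below is a minimal sketch of that arithmetic for the numbers above. The class and method names are hypothetical, and the one-core-per-receiver rule is the assumption it rests on.

```java
// Sketch only: cores remaining for batch processing once receivers are running.
// Assumption: every receiver permanently occupies exactly one core.
// CoreBudget and coresLeftForProcessing are hypothetical names, not Spark APIs.
final class CoreBudget {

    static int coresLeftForProcessing(int totalCores, int receivers) {
        if (totalCores < 0 || receivers < 0) {
            throw new IllegalArgumentException("counts must be non-negative");
        }
        // Never report a negative budget; zero already means "nothing left".
        return Math.max(0, totalCores - receivers);
    }

    public static void main(String[] args) {
        int totalCores = 10; // sum over the 3 machines
        for (int receivers : new int[] {5, 10}) {
            System.out.println(receivers + " receivers -> "
                    + coresLeftForProcessing(totalCores, receivers)
                    + " cores left for processing");
        }
    }
}
```

If that assumption holds, 10 receivers on 10 cores leave nothing to process the received blocks, which could explain why adding receivers shows no benefit.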

From: Akhil Das [mailto:akhil@sigmoidanalytics.com]
Sent: Saturday, June 13, 2015 1:10 AM
To: Chaudhary, Umesh
Cc: user@spark.apache.org
Subject: Re: Optimizing Streaming from Websphere MQ

How many cores are you allocating for your job? And how many receivers do you have? It
would be good if you could post your custom receiver code; it would help people understand
the problem and shed some light on it.

Thanks
Best Regards

On Fri, Jun 12, 2015 at 12:58 PM, Chaudhary, Umesh <Umesh.Chaudhary@searshc.com> wrote:
Hi,
I have created a custom receiver in Java that receives data from Websphere MQ, and I only
write the received records to HDFS.

I have consulted several resources on optimizing the speed of a Spark Streaming application.
Here are a few:

•         Spark Official <http://spark.apache.org/docs/latest/streaming-programming-guide.html#performance-tuning>

•         Virdata <http://www.virdata.com/tuning-spark/>

•         TD’s Slides (a bit old but useful) <http://www.slideshare.net/spark-project/deep-divewithsparkstreaming-tathagatadassparkmeetup20130617>

I took away three main points that apply to my case:

•         Setting the batch interval to 1 sec

•         Setting “spark.streaming.blockInterval” to 200 ms

•         Calling inputStream.repartition(3)
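
To make the effect of the first two settings concrete, here is a small sketch of how the batch interval and block interval determine the number of blocks (and hence tasks) per batch. It assumes, as I read the Spark tuning guide, that each receiver emits roughly one block per block interval. The class and method names are hypothetical.

```java
// Sketch only: estimates how many blocks each batch contains.
// Assumption: a receiver emits one block per block interval.
// BlockMath and its methods are hypothetical names, not Spark APIs.
final class BlockMath {

    // Blocks generated by a single receiver during one batch.
    static long blocksPerReceiver(long batchIntervalMs, long blockIntervalMs) {
        if (batchIntervalMs <= 0 || blockIntervalMs <= 0) {
            throw new IllegalArgumentException("intervals must be positive");
        }
        return batchIntervalMs / blockIntervalMs;
    }

    // Blocks per batch across all receivers, if their streams are unioned.
    static long blocksPerBatch(long batchIntervalMs, long blockIntervalMs, int receivers) {
        if (receivers < 0) {
            throw new IllegalArgumentException("receivers must be non-negative");
        }
        return blocksPerReceiver(batchIntervalMs, blockIntervalMs) * receivers;
    }

    public static void main(String[] args) {
        long batchMs = 1000; // batch interval of 1 sec
        long blockMs = 200;  // spark.streaming.blockInterval
        System.out.println("blocks per receiver per batch: "
                + blocksPerReceiver(batchMs, blockMs));
        System.out.println("blocks per batch with 5 receivers: "
                + blocksPerBatch(batchMs, blockMs, 5));
    }
}
```

With a 1 sec batch and a 200 ms block interval this gives 5 blocks per receiver per batch. Note that these settings control how already-received data is sliced for processing, so they may not change the rate at which the receiver itself pulls records from MQ.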

But none of this improved the actual throughput of my receiver, which peaks at 5-10
records/sec. This is far below my expectation.
Am I missing something?

Regards,
Umesh Chaudhary
This message, including any attachments, is the property of Sears Holdings Corporation and/or
one of its subsidiaries. It is confidential and may contain proprietary or legally privileged
information. If you are not the intended recipient, please delete it without reading the contents.
Thank you.

