incubator-s4-user mailing list archives

From Matthieu Morel <mmo...@apache.org>
Subject Re: About PE setParallelism
Date Thu, 20 Jun 2013 07:54:44 GMT

On Jun 20, 2013, at 08:03 , Sky Zhao <sky.zhao@ericsson.com> wrote:

> Hi,
> I set the thread count for the 2 PEs (the logic is very simple, it just compares values) to
> 1, 200, 300 and 400; the execution times are 400s, 161s, 163s and 171s. Why does the time
> get slower when I increase the threads?
>  
> How can I improve the speed?


Stream.setParallelism() increases parallelism by adding more threads to process events
concurrently. That can be beneficial, for I/O-bound PEs in particular, but every extra thread
also adds context-switching overhead. With hundreds of threads competing for a few cores, that
overhead can have a dramatic impact on performance.
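
To observe this effect outside S4, here is a minimal sketch in plain Java. It is not S4 code:
the class and method names are hypothetical, and a fixed thread pool is only a stand-in for the
threads that setParallelism adds. It runs the same CPU-bound comparison on pools of different
sizes and prints the elapsed time for each.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical stand-alone benchmark, not part of S4.
final class PoolSizeBenchmark {

    // Counts the values in [0, n) that are greater than threshold:
    // a trivial CPU-bound comparison, similar in spirit to a simple PE.
    static long countAbove(int n, int threshold) {
        long count = 0;
        for (int i = 0; i < n; i++) {
            if (i > threshold) {
                count++;
            }
        }
        return count;
    }

    // Submits `tasks` copies of the comparison to a pool of `threads` threads
    // and returns the sum of all results.
    static long runWithPool(int threads, int tasks, int n, int threshold) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            Callable<Long> task = () -> countAbove(n, threshold);
            List<Future<Long>> futures = new ArrayList<>();
            for (int t = 0; t < tasks; t++) {
                futures.add(pool.submit(task));
            }
            long total = 0;
            for (Future<Long> future : futures) {
                total += future.get();
            }
            return total;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("benchmark task failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        int cores = Runtime.getRuntime().availableProcessors();
        int[] poolSizes = {1, cores, 200, 400};
        for (int threads : poolSizes) {
            long start = System.nanoTime();
            long total = runWithPool(threads, 2000, 1_000_000, 500_000);
            long millis = (System.nanoTime() - start) / 1_000_000;
            System.out.println(threads + " threads: " + millis + " ms (result " + total + ")");
        }
    }
}
```

The timings depend on the machine and on JIT warm-up, so treat them as a rough indication only.
For CPU-bound work like this I would not expect a pool much larger than the number of cores to
be faster than a pool of roughly that size.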

I would recommend a much lower parallelism value, just high enough that your CPU resources
are fully used without generating too much overhead (you can check kernel activity and context
switching with the vmstat tool on Linux).
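
As a starting point for choosing that lower value, here is a small sketch of a commonly cited
sizing heuristic: threads = cores * (1 + wait time / compute time). The heuristic is not
S4-specific, and the helper class below is hypothetical, not part of the S4 API. For a purely
CPU-bound PE the ratio is close to 0, so the suggestion is simply the number of cores.

```java
// Hypothetical helper, not part of the S4 API.
final class ParallelismHint {

    // Returns cores * (1 + waitToComputeRatio), rounded to the nearest integer
    // and never less than 1. waitToComputeRatio is the time a thread spends
    // waiting (e.g. on I/O) divided by the time it spends computing.
    static int suggest(int cores, double waitToComputeRatio) {
        if (cores < 1) {
            throw new IllegalArgumentException("cores must be >= 1");
        }
        if (waitToComputeRatio < 0.0) {
            throw new IllegalArgumentException("waitToComputeRatio must be >= 0");
        }
        long threads = Math.round(cores * (1.0 + waitToComputeRatio));
        return (int) Math.max(1L, Math.min(threads, (long) Integer.MAX_VALUE));
    }

    public static void main(String[] args) {
        int cores = Runtime.getRuntime().availableProcessors();
        System.out.println("CPU-bound PE:  " + suggest(cores, 0.0) + " threads");
        System.out.println("Half I/O wait: " + suggest(cores, 1.0) + " threads");
    }
}
```

The result is only a first guess to try with setParallelism; the value that works best for a
given application still has to be found by measuring, for instance with vmstat as mentioned
above.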

In addition, you should be able to improve overall throughput by scaling out, i.e. by deploying
S4 nodes on more machines (provided your app has a decent computation-to-communication ratio).

Note that setParallelism is one of many knobs that you can use to tune performance, and I
encourage you to explore more parameters, taking into consideration the application design,
the workload and the infrastructure.

Hope this helps,

Matthieu

