flink-issues mailing list archives

From "Greg Hogan (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-8414) Gelly performance seriously decreases when using the suggested parallelism configuration
Date Mon, 15 Jan 2018 14:33:00 GMT

    [ https://issues.apache.org/jira/browse/FLINK-8414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16326285#comment-16326285 ]

Greg Hogan commented on FLINK-8414:

You certainly can measure scalability, but as you have discovered, performance will not
increase monotonically with parallelism. Redistributing operators require a channel between each pair
of tasks, so with a parallelism of 2^7 you will have 2^14 channels between each task for each
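
The channel arithmetic above can be checked with a short sketch. It assumes a
simplified model in which one redistributing (all-to-all) exchange needs one
channel per (sender task, receiver task) pair, so the count grows with the
square of the parallelism. The function name is hypothetical and not part of
Flink; the values 16 and 128 are the two parallelism settings from the issue
description.

```python
def redistribution_channels(parallelism: int) -> int:
    """Channels for one all-to-all exchange under the simplified model:
    one channel per (sender task, receiver task) pair."""
    if parallelism < 1:
        raise ValueError("parallelism must be a positive integer")
    return parallelism * parallelism


if __name__ == "__main__":
    # Assumption: both sides of the exchange run at the same parallelism.
    for p in (16, 128):
        print(f"parallelism={p}: {redistribution_channels(p)} channels")
```

Under this model, going from a parallelism of 16 to 128 multiplies the channel
count per exchange by 64, which is one plausible contributor to the slowdown
reported in the issue.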

There are many reasons to use Flink and Gelly, but for some use cases and algorithms you
may even get better performance with a single-threaded implementation. See "Scalability!
But at what COST?". ConnectedComponents and PageRank require, respectively, no and very little
intermediate data, whereas the similarity measures JaccardIndex and AdamicAdar, as well as
triangle metrics such as ClusteringCoefficient, process super-linear intermediate data and
benefit much more from Flink's scalability. When comparing against non-distributed implementations,
it is important to note that all Gelly algorithms process generic data, whereas many "optimized"
algorithms assume compact integer representations.

> Gelly performance seriously decreases when using the suggested parallelism configuration
> ----------------------------------------------------------------------------------------
>                 Key: FLINK-8414
>                 URL: https://issues.apache.org/jira/browse/FLINK-8414
>             Project: Flink
>          Issue Type: Bug
>          Components: Configuration, Documentation, Gelly
>            Reporter: flora karniav
>            Priority: Minor
> I am running Gelly examples with different datasets in a cluster of 5 machines (1 JobManager
> and 4 TaskManagers) of 32 cores each.
> The number of slots is set to 32 (as suggested) and the parallelism to 128
> (32 cores * 4 TaskManagers).
> I observe a vast performance degradation with these suggested settings compared with setting
> parallelism.default to 16, for example, where the same job completes in ~60 seconds vs. ~140 in
> the 128 parallelism case.
> Is there something wrong in my configuration? Should I decrease parallelism and, if so,
> will this inevitably decrease CPU utilization?
> Another matter that may be related to this is the number of partitions of the data. Is
> this somehow related to parallelism? How many partitions are created in the case of parallelism.default=128?

This message was sent by Atlassian JIRA