spark-issues mailing list archives

From "chenliang (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (SPARK-26543) Support the coordinator to determine post-shuffle partitions more reasonably
Date Sat, 05 Jan 2019 05:02:00 GMT

     [ https://issues.apache.org/jira/browse/SPARK-26543?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

chenliang updated SPARK-26543:
------------------------------
    Description: 
For Spark SQL, when adaptive execution (AE) is enabled with 'set spark.sql.adaptive.enabled=true',
an ExchangeCoordinator is introduced to determine the number of post-shuffle partitions.
Under some conditions, however, the coordinator does not perform well: some tasks are always
retained even though their Shuffle Read Size / Records is 0.0B/0. Increasing
spark.sql.adaptive.shuffle.targetPostShuffleInputSize can work around this, but that is not a
reasonable fix, because targetPostShuffleInputSize should not be set too large. For example:

 !screenshot-1.png! !15_24_38__12_27_2018.jpg!

The ExchangeCoordinator could filter out these useless (0B) partitions automatically.
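
To make the problem concrete, here is a minimal sketch in Python of a greedy, size-based
coalescing step followed by the proposed filtering of empty partitions. It is a simplified
model for illustration only, not the ExchangeCoordinator source: the function names
coalesce and drop_empty, the list-of-ranges representation, and the sample sizes are all
assumptions made for this example.

```python
from typing import List, Tuple

Range = Tuple[int, int]  # [start, end) over pre-shuffle partition indices


def coalesce(sizes: List[int], target: int) -> List[Range]:
    """Greedily pack contiguous pre-shuffle partitions into ranges.

    A new range is started whenever adding the next partition would push
    the running total above `target` (hypothetical simplified model).
    """
    if not sizes:
        return []
    ranges: List[Range] = []
    start, running = 0, 0
    for i, size in enumerate(sizes):
        if i > 0 and running + size > target:
            ranges.append((start, i))  # close the current range
            start, running = i, size   # partition i opens the next one
        else:
            running += size
    ranges.append((start, len(sizes)))  # close the final range
    return ranges


def drop_empty(sizes: List[int], ranges: List[Range]) -> List[Range]:
    """Keep only the ranges that actually contain shuffle data."""
    return [(s, e) for s, e in ranges if sum(sizes[s:e]) > 0]


if __name__ == "__main__":
    sizes = [200, 0, 200]   # bytes per pre-shuffle partition (made-up values)
    ranges = coalesce(sizes, target=100)
    print(ranges)                     # [(0, 1), (1, 2), (2, 3)]
    print(drop_empty(sizes, ranges))  # [(0, 1), (2, 3)]
```

In this model the empty partition sitting between two oversized ones becomes a range of its
own, which corresponds to a task with Shuffle Read Size / Records of 0.0B/0. Raising the
target to 400 would merge everything into one range, but only by making every task larger.
Filtering removes the empty range without touching the target. Note that the filtered result
is no longer necessarily a contiguous run of start indices, so a real change would need a
representation that allows gaps.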

  was:


For Spark SQL, when adaptive execution (AE) is enabled with 'set spark.sql.adaptive.enabled=true',
an ExchangeCoordinator is introduced to determine the number of post-shuffle partitions.
Under some conditions, however, the coordinator does not perform well: some tasks are always
retained even though their Shuffle Read Size / Records is 0.0B/0. Increasing
spark.sql.adaptive.shuffle.targetPostShuffleInputSize can work around this, but that is not a
reasonable fix, because targetPostShuffleInputSize should not be set too large. For example:

!15_24_38__12_27_2018.jpg!

The ExchangeCoordinator could filter out these useless (0B) partitions automatically.


> Support the coordinator to determine post-shuffle partitions more reasonably
> ----------------------------------------------------------------------------
>
>                 Key: SPARK-26543
>                 URL: https://issues.apache.org/jira/browse/SPARK-26543
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 2.2.0, 2.2.1, 2.2.2, 2.3.0, 2.3.1, 2.3.2
>            Reporter: chenliang
>            Priority: Major
>             Fix For: 2.3.0
>
>         Attachments: screenshot-1.png
>
>
> For Spark SQL, when adaptive execution (AE) is enabled with 'set spark.sql.adaptive.enabled=true',
> an ExchangeCoordinator is introduced to determine the number of post-shuffle partitions.
> Under some conditions, however, the coordinator does not perform well: some tasks are always
> retained even though their Shuffle Read Size / Records is 0.0B/0. Increasing
> spark.sql.adaptive.shuffle.targetPostShuffleInputSize can work around this, but that is not a
> reasonable fix, because targetPostShuffleInputSize should not be set too large. For example:
>  !screenshot-1.png! !15_24_38__12_27_2018.jpg!
> The ExchangeCoordinator could filter out these useless (0B) partitions automatically.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


