spark-reviews mailing list archives

From cloud-fan <...@git.apache.org>
Subject [GitHub] spark issue #21859: [SPARK-24900][SQL]Speed up sort when the dataset is smal...
Date Tue, 21 Aug 2018 01:23:09 GMT
Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/21859
  
    I don't think this optimization should be done at the SQL layer. The `ShuffleWriter` should
treat `RangePartitioner` specially and consume the sampled data in `RangePartitioner` instead
of the input iterator.
    
    By doing that, the SQL layer (as well as all other components) can benefit from it.
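
    A rough illustration of the idea, as a toy in plain Python rather than Spark code. Every name here (`ToyRangePartitioner`, `sample_is_complete`, `toy_shuffle_write`) is hypothetical and none of it is Spark's API. It also assumes that for a small dataset the partitioner's sample already contains every record, so the writer can read from the sample and skip the input iterator:

```python
import bisect
import random
from typing import Iterable, List


class ToyRangePartitioner:
    """Hypothetical stand-in for a range partitioner; not Spark's API."""

    def __init__(self, records: Iterable[int], num_partitions: int,
                 sample_size: int, seed: int = 0) -> None:
        data = list(records)
        self.num_partitions = num_partitions
        # Assumption: if the input fits in the sample budget, the "sample"
        # is simply the whole input.
        self.sample_is_complete = len(data) <= sample_size
        if self.sample_is_complete:
            self.sample: List[int] = data
        else:
            self.sample = random.Random(seed).sample(data, sample_size)
        ordered = sorted(self.sample)
        # Pick num_partitions - 1 upper bounds at evenly spaced sample positions.
        self.bounds: List[int] = [
            ordered[(i * len(ordered)) // num_partitions]
            for i in range(1, num_partitions)
        ] if ordered else []

    def get_partition(self, key: int) -> int:
        # Keys less than or equal to bounds[i] land in partition i or lower.
        return bisect.bisect_left(self.bounds, key)


def toy_shuffle_write(partitioner: ToyRangePartitioner,
                      input_records: Iterable[int]) -> List[List[int]]:
    """Hypothetical writer that special-cases the range partitioner."""
    # Reuse the sample when it already holds every record; otherwise fall
    # back to reading the input iterator.
    if partitioner.sample_is_complete:
        source: Iterable[int] = partitioner.sample
    else:
        source = input_records
    buckets: List[List[int]] = [[] for _ in range(partitioner.num_partitions)]
    for record in source:
        buckets[partitioner.get_partition(record)].append(record)
    return buckets


data = [7, 3, 9, 1, 5, 8, 2, 6, 0, 4]
small = ToyRangePartitioner(data, num_partitions=2, sample_size=100)
# The input iterator is deliberately empty: the writer never touches it.
small_buckets = toy_shuffle_write(small, iter(()))
```

    Because the special case lives in the writer, any caller that shuffles with a range partitioner would take the same shortcut, not only SQL sort.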


---
