spark-issues mailing list archives

From "Apache Spark (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SPARK-21317) Avoid unnecessary sort in FileFormatWriter if data is already bucketed
Date Wed, 05 Jul 2017 14:08:00 GMT

    [ https://issues.apache.org/jira/browse/SPARK-21317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16074813#comment-16074813 ]

Apache Spark commented on SPARK-21317:
--------------------------------------

User 'pwoody' has created a pull request for this issue:
https://github.com/apache/spark/pull/18542

> Avoid unnecessary sort in FileFormatWriter if data is already bucketed
> ----------------------------------------------------------------------
>
>                 Key: SPARK-21317
>                 URL: https://issues.apache.org/jira/browse/SPARK-21317
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.1.1
>            Reporter: Patrick Woody
>
> When bucketing in FileFormatWriter, each partition is always sorted on bucketIdExpression,
> the partition id produced by the hash bucketing. If the data is already bucketed in that format,
> this expression is constant within each partition, so there is no need to sort.
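
The reasoning in the description can be sketched in plain Python. This is only an illustration under stated assumptions, not Spark's FileFormatWriter code: `bucket_id`, `needs_bucket_sort`, and `prepare_partition` are hypothetical names, the CRC32-based hash is a stand-in for whatever hash the bucketing actually uses, and scanning the rows to detect a constant bucket id is done here purely to make the point visible.

```python
import zlib


def bucket_id(key: str, num_buckets: int) -> int:
    # Stand-in hash (an assumption for this sketch, not Spark's hash function).
    return zlib.crc32(key.encode("utf-8")) % num_buckets


def needs_bucket_sort(rows, num_buckets: int) -> bool:
    # A sort on the bucket id only matters if the partition mixes bucket ids.
    ids = {bucket_id(key, num_buckets) for key, _ in rows}
    return len(ids) > 1


def prepare_partition(rows, num_buckets: int):
    # If every row maps to the same bucket id, the sort key is constant
    # and sorting would leave the rows unchanged, so skip it.
    if not needs_bucket_sort(rows, num_buckets):
        return list(rows)
    return sorted(rows, key=lambda row: bucket_id(row[0], num_buckets))


if __name__ == "__main__":
    all_rows = [(f"k{i}", i) for i in range(100)]
    # Simulate a partition that was already bucketed: one bucket id only.
    bucketed = [row for row in all_rows if bucket_id(row[0], 4) == 0]
    print("sort needed for pre-bucketed partition:", needs_bucket_sort(bucketed, 4))
    print("sort needed for mixed partition:", needs_bucket_sort(all_rows, 4))
```

For a partition whose rows all share one bucket id, `prepare_partition` returns the rows in their original order without sorting, which is the saving the issue asks for.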



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


