spark-issues mailing list archives

From "Brian Zhang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SPARK-15616) Metastore relation should fallback to HDFS size of partitions that are involved in Query if statistics are not available.
Date Tue, 16 May 2017 18:00:06 GMT

    [ https://issues.apache.org/jira/browse/SPARK-15616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16012818#comment-16012818 ]

Brian Zhang commented on SPARK-15616:
-------------------------------------

Hello,
Just wondering what's the current status of this issue? I think this fix would be really helpful.

Thanks!

> Metastore relation should fallback to HDFS size of partitions that are involved in Query if statistics are not available.
> -------------------------------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-15616
>                 URL: https://issues.apache.org/jira/browse/SPARK-15616
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>            Reporter: Lianhui Wang
>
> Currently, if some partitions of a partitioned table are used in a join operation, we rely on the table size returned by the Metastore to decide whether the operation can be converted to a broadcast join.
> If a Filter can prune some partitions, Hive prunes them before deciding whether to use a broadcast join, based on the HDFS size of the partitions that are involved in the query. Spark SQL needs the same behavior, which can improve join performance for partitioned tables.
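
To illustrate the behavior the quoted description asks for, here is a minimal Python sketch of the size estimate. It is not Spark's implementation: `Partition`, `estimated_size`, and `can_broadcast` are hypothetical names made up for this example, and the partition sizes are passed in rather than read from HDFS. The only value taken from Spark is the 10 MB default of `spark.sql.autoBroadcastJoinThreshold`.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, Optional

# Default of spark.sql.autoBroadcastJoinThreshold (10 MB), in bytes.
BROADCAST_THRESHOLD_BYTES = 10 * 1024 * 1024


@dataclass(frozen=True)
class Partition:
    """Hypothetical stand-in for one partition of a metastore table."""
    spec: Dict[str, str]   # partition column values, e.g. {"dt": "2017-05-16"}
    hdfs_size_bytes: int   # assumed to have been looked up on HDFS already


def estimated_size(
    partitions: Iterable[Partition],
    predicate: Callable[[Partition], bool],
    metastore_stats_bytes: Optional[int] = None,
) -> int:
    """Return the size used for the broadcast decision.

    If the metastore has statistics, use them. Otherwise fall back to the
    summed HDFS size of only those partitions that survive pruning.
    """
    if metastore_stats_bytes is not None:
        return metastore_stats_bytes
    return sum(p.hdfs_size_bytes for p in partitions if predicate(p))


def can_broadcast(size_bytes: int,
                  threshold_bytes: int = BROADCAST_THRESHOLD_BYTES) -> bool:
    """A relation qualifies for broadcast if it fits under the threshold."""
    return 0 <= size_bytes <= threshold_bytes


# Illustrative data only: three daily partitions of a made-up table.
MB = 1024 * 1024
example_partitions = [
    Partition({"dt": "2017-05-14"}, 200 * MB),
    Partition({"dt": "2017-05-15"}, 6 * MB),
    Partition({"dt": "2017-05-16"}, 8 * MB),
]

# Size with the filter dt = '2017-05-16' applied before the estimate.
pruned_size = estimated_size(
    example_partitions, lambda p: p.spec["dt"] == "2017-05-16"
)

# Size when no partition is pruned (every partition is counted).
full_size = estimated_size(example_partitions, lambda p: True)
```

With this made-up data, the pruned estimate falls under the threshold while the whole-table estimate does not, which is the case where pruning before the size check changes the join strategy.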



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
