spark-issues mailing list archives

From "Apache Spark (JIRA)" <j...@apache.org>
Subject [jira] [Assigned] (SPARK-25774) Eliminate query anomalies with empty partitions - TRUNCATE, SELECT DISTINCT, etc.
Date Mon, 26 Nov 2018 04:15:00 GMT

     [ https://issues.apache.org/jira/browse/SPARK-25774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apache Spark reassigned SPARK-25774:
------------------------------------

    Assignee: Apache Spark

> Eliminate query anomalies with empty partitions - TRUNCATE, SELECT DISTINCT, etc.
> ---------------------------------------------------------------------------------
>
>                 Key: SPARK-25774
>                 URL: https://issues.apache.org/jira/browse/SPARK-25774
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 2.2.0
>         Environment: Right now, I'm using Cloudera with Spark 2.2.0, but I understand it's a widespread thing.
>            Reporter: Steven Cardella
>            Assignee: Apache Spark
>            Priority: Major
>
> If you run a Spark SQL TRUNCATE TABLE command on a managed table in Hive, it deletes the files in HDFS but leaves the partitions and the partition folder structure in place. If you then run SELECT DISTINCT on the partition columns, it returns all the empty partition values. As a result, SELECT DISTINCT can return rows while SELECT * on the same table returns 0 rows.
> Coming from SQL Server and similar databases, I expect SELECT DISTINCT to always reflect the rows; Impala works that way as well.
> I'd like SELECT DISTINCT to reflect rows, not partitions; TRUNCATE TABLE to have an option to drop partitions; and MSCK REPAIR TABLE to have an option to drop empty partitions.
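
The anomaly described above can be illustrated with a small toy model. This is only a sketch in plain Python, not Spark or Hive code: the directory layout (`dt=<value>` folders), the file name `part-00000.txt`, and every function name below are assumptions made up for illustration. It models a truncate that deletes data files but keeps partition directories, then compares a "distinct" answered from partition metadata with one answered from the rows themselves.

```python
import os
import tempfile


def write_rows(table_dir, partition_value, rows):
    """Store rows in a data file under a Hive-style partition directory (hypothetical layout)."""
    part_dir = os.path.join(table_dir, f"dt={partition_value}")
    os.makedirs(part_dir, exist_ok=True)
    with open(os.path.join(part_dir, "part-00000.txt"), "w") as f:
        for row in rows:
            f.write(row + "\n")


def truncate_keep_partitions(table_dir):
    """Delete the data files but leave the partition directories, as the report describes."""
    for entry in os.listdir(table_dir):
        part_dir = os.path.join(table_dir, entry)
        for name in os.listdir(part_dir):
            os.remove(os.path.join(part_dir, name))


def select_all(table_dir):
    """Row-based read: one (partition_value, row) pair per stored row."""
    result = []
    for entry in sorted(os.listdir(table_dir)):
        part_dir = os.path.join(table_dir, entry)
        value = entry.split("=", 1)[1]
        for name in sorted(os.listdir(part_dir)):
            with open(os.path.join(part_dir, name)) as f:
                result.extend((value, line.rstrip("\n")) for line in f)
    return result


def distinct_from_partitions(table_dir):
    """Metadata-based answer: distinct values taken from the partition directory names."""
    return sorted(entry.split("=", 1)[1] for entry in os.listdir(table_dir))


def distinct_from_rows(table_dir):
    """Row-based answer: distinct partition values among rows that actually exist."""
    return sorted({value for value, _ in select_all(table_dir)})


def demo():
    """Build a two-partition table, truncate it, and collect the three views of it."""
    with tempfile.TemporaryDirectory() as table_dir:
        write_rows(table_dir, "2018-11-24", ["a", "b"])
        write_rows(table_dir, "2018-11-25", ["c"])
        truncate_keep_partitions(table_dir)
        return {
            "select_all": select_all(table_dir),
            "distinct_from_partitions": distinct_from_partitions(table_dir),
            "distinct_from_rows": distinct_from_rows(table_dir),
        }


if __name__ == "__main__":
    for name, value in demo().items():
        print(name, value)
```

In this model the metadata-based distinct still lists both partition values after the truncate, while the row-based read and the row-based distinct are empty. That mismatch is the inconsistency the report asks to eliminate.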



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org

