spark-issues mailing list archives

From "Sean Owen (JIRA)" <>
Subject [jira] [Updated] (SPARK-6242) Support replace (drop) column for parquet table
Date Tue, 10 Mar 2015 14:57:38 GMT


Sean Owen updated SPARK-6242:
    Component/s: SQL

> Support replace (drop) column for parquet table
> -----------------------------------------------
>                 Key: SPARK-6242
>                 URL:
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 1.3.0
>            Reporter: chirag aggarwal
> SPARK-5528 provides an easy way to support adding columns to parquet tables. It does this
> by using parquet's native capability to merge the schemas from all the part-files and _common_metadata.
> However, dropping a column from a parquet table still does not work.
> The merged schema still shows the dropped column, but the column
> is no longer present in the metastore. The schemas obtained from the two sources therefore do not match,
> and any subsequent query on this table fails.
> Instead of checking for an exact match between the two schemas, Spark should only check
> whether the schema obtained from the metastore is a subset of the merged parquet schema. If this check
> passes, the columns present in the metastore should be allowed to be referenced in queries.
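
The subset check proposed in the description can be illustrated with a minimal sketch. This is not Spark's implementation: schemas are modelled here as plain dicts mapping column name to a type string, and the helper names (merge_schemas, is_subset, queryable_columns) are hypothetical, chosen only for this example.

```python
from typing import Dict, List

# Assumption: a schema is simplified to {column name: type name}.
Schema = Dict[str, str]


def merge_schemas(part_schemas: List[Schema]) -> Schema:
    """Union the columns of all part-file schemas (hypothetical helper)."""
    merged: Schema = {}
    for schema in part_schemas:
        for name, dtype in schema.items():
            if name in merged and merged[name] != dtype:
                raise ValueError(f"conflicting types for column {name!r}")
            merged[name] = dtype
    return merged


def is_subset(metastore: Schema, merged: Schema) -> bool:
    """True if every metastore column exists in the merged schema with the same type."""
    return all(merged.get(name) == dtype for name, dtype in metastore.items())


def queryable_columns(metastore: Schema, merged: Schema) -> List[str]:
    """Return the columns a query may reference, using the subset check."""
    if not is_subset(metastore, merged):
        raise ValueError("metastore schema is not a subset of the merged parquet schema")
    return list(metastore)


# Older part-files still contain the dropped column "age".
old_part = {"id": "int", "name": "string", "age": "int"}
new_part = {"id": "int", "name": "string"}
merged = merge_schemas([old_part, new_part])

# The metastore no longer lists "age" after the drop.
metastore = {"id": "int", "name": "string"}

print(metastore == merged)                   # exact-match check
print(is_subset(metastore, merged))          # proposed subset check
print(queryable_columns(metastore, merged))  # columns exposed to queries
```

In this sketch the exact-match comparison fails because the merged schema still carries the dropped column, while the subset check passes and only the metastore columns are exposed.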
