spark-issues mailing list archives

From "Xiao Li (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (SPARK-23525) ALTER TABLE CHANGE COLUMN doesn't work for external hive table
Date Thu, 08 Mar 2018 05:57:00 GMT

     [ https://issues.apache.org/jira/browse/SPARK-23525?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xiao Li updated SPARK-23525:
----------------------------
    Fix Version/s: 2.2.2

> ALTER TABLE CHANGE COLUMN doesn't work for external hive table
> --------------------------------------------------------------
>
>                 Key: SPARK-23525
>                 URL: https://issues.apache.org/jira/browse/SPARK-23525
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.2.0, 2.3.0
>            Reporter: Pavlo Skliar
>            Assignee: Jiang Xingbo
>            Priority: Major
>             Fix For: 2.2.2, 2.3.1, 2.4.0
>
>
> {code:python}
> print(spark.sql("""
> SHOW CREATE TABLE test.trends
> """).collect()[0].createtab_stmt)
> // OUTPUT
> CREATE EXTERNAL TABLE `test`.`trends`(`id` string COMMENT '', `metric` string COMMENT '', `amount` bigint COMMENT '')
> COMMENT ''
> PARTITIONED BY (`date` string COMMENT '')
> ROW FORMAT SERDE 'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe'
> WITH SERDEPROPERTIES (
>   'serialization.format' = '1'
> )
> STORED AS
>   INPUTFORMAT 'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat'
>   OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat'
> LOCATION 's3://xxxxx/xxxxx/xxxx'
> TBLPROPERTIES (
>   'transient_lastDdlTime' = '1519729384',
>   'last_modified_time' = '1519645652',
>   'last_modified_by' = 'pavlo',
>   'last_castor_run_ts' = '1513561658.0'
> )
> spark.sql("""
> DESCRIBE test.trends
> """).collect()
> // OUTPUT
> [Row(col_name='id', data_type='string', comment=''),
>  Row(col_name='metric', data_type='string', comment=''),
>  Row(col_name='amount', data_type='bigint', comment=''),
>  Row(col_name='date', data_type='string', comment=''),
>  Row(col_name='# Partition Information', data_type='', comment=''),
>  Row(col_name='# col_name', data_type='data_type', comment='comment'),
>  Row(col_name='date', data_type='string', comment='')]
> spark.sql("""alter table test.trends change column id id string comment 'unique identifier'""")
> spark.sql("""
> DESCRIBE test.trends
> """).collect()
> // OUTPUT
> [Row(col_name='id', data_type='string', comment=''),
>  Row(col_name='metric', data_type='string', comment=''),
>  Row(col_name='amount', data_type='bigint', comment=''),
>  Row(col_name='date', data_type='string', comment=''),
>  Row(col_name='# Partition Information', data_type='', comment=''),
>  Row(col_name='# col_name', data_type='data_type', comment='comment'),
>  Row(col_name='date', data_type='string', comment='')]
> {code}
> The strange thing is that I've successfully assigned a comment to the id field from Hive, and
> it's visible in the Hue UI, but it's still not visible from Spark, and Spark requests
> don't have any effect on the comments.
>  
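
To confirm the symptom described above without reading the DESCRIBE output by eye, a small check along the following lines may help. This is only a sketch. The helper names `column_comments` and `comment_applied` are hypothetical and not part of Spark. The `Row` namedtuple is a stand-in for `pyspark.sql.Row` so the example runs without a Spark session. The sample rows are copied from the DESCRIBE output reported after the ALTER TABLE statement.

```python
from collections import namedtuple

# Stand-in for pyspark.sql.Row (assumption: only attribute access is needed),
# so this sketch runs without a Spark session.
Row = namedtuple("Row", ["col_name", "data_type", "comment"])


def column_comments(describe_rows):
    """Map column name -> comment from DESCRIBE rows, ignoring the partition section."""
    comments = {}
    for row in describe_rows:
        # DESCRIBE repeats partition columns after a '# Partition Information'
        # marker row, so stop at the first row whose name starts with '#'.
        if row.col_name.startswith("#"):
            break
        comments[row.col_name] = row.comment
    return comments


def comment_applied(describe_rows, column, expected):
    """Return True if `column` carries `expected` as its comment."""
    return column_comments(describe_rows).get(column) == expected


# Rows copied from the DESCRIBE output reported after the ALTER TABLE statement.
after_alter = [
    Row("id", "string", ""),
    Row("metric", "string", ""),
    Row("amount", "bigint", ""),
    Row("date", "string", ""),
    Row("# Partition Information", "", ""),
    Row("# col_name", "data_type", "comment"),
    Row("date", "string", ""),
]

# Prints False: the comment set by ALTER TABLE ... CHANGE COLUMN is missing.
print(comment_applied(after_alter, "id", "unique identifier"))
```

Against a live session, the same check could presumably be run as `comment_applied(spark.sql("DESCRIBE test.trends").collect(), "id", "unique identifier")`, since the rows returned by `collect()` expose the same three fields as attributes.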



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org

