hive-issues mailing list archives

From "Sergey Shelukhin (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-14086) org.apache.hadoop.hive.metastore.api.Table does not return columns from Avro schema file
Date Tue, 31 Jan 2017 23:57:51 GMT

    [ https://issues.apache.org/jira/browse/HIVE-14086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15847746#comment-15847746 ]

Sergey Shelukhin commented on HIVE-14086:
-----------------------------------------

Note that since 2.0 (HIVE-11985), the column names for SerDes with external schemas are generally
no longer stored in the metastore.

> org.apache.hadoop.hive.metastore.api.Table does not return columns from Avro schema file
> ----------------------------------------------------------------------------------------
>
>                 Key: HIVE-14086
>                 URL: https://issues.apache.org/jira/browse/HIVE-14086
>             Project: Hive
>          Issue Type: Bug
>          Components: API
>            Reporter: Lars Volker
>         Attachments: avro.json, avroremoved.json, avro.sql
>
>
> Consider this table, using an external Avro schema file:
> {noformat}
> CREATE TABLE avro_table
>   PARTITIONED BY (str_part STRING)
>   ROW FORMAT SERDE
>   'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
>   STORED AS INPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
>   OUTPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
>   TBLPROPERTIES (
>     'avro.schema.url'='hdfs://localhost:20500/tmp/avro.json'
>   );
> {noformat}
> This will populate the "COLUMNS_V2" metastore table with the correct column information
> (as per HIVE-6308). The columns of this table can then be queried via the Hive API, for example
> by calling {{.getSd().getCols()}} on an {{org.apache.hadoop.hive.metastore.api.Table}} object.
> Changes to the avro.schema.url file - either changing where it points or changing
> its contents - will be reflected in the output of {{describe formatted avro_table}} *but not*
> in the result of the {{.getSd().getCols()}} API call. Instead, it looks like Hive only reads
> the Avro schema file internally, but does not expose the information therein via its API.
> Is there a way to obtain the effective Table information via Hive? Would it make sense
> to fix table retrieval so that calls to {{get_table}} return the correct set of columns?
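
One possible client-side check, sketched here under stated assumptions rather than taken from Hive itself: read the contents of the file that avro.schema.url points to and compare its top-level field names with the column names returned by {{.getSd().getCols()}}. The helper names, the example schema, and the sample column list below are all hypothetical. Fetching the file from HDFS and calling the metastore are deliberately left out. The comparison is case-insensitive on the assumption that the metastore holds lowercased column names.

```python
import json


def avro_field_names(schema_text):
    """Return the top-level field names of an Avro record schema given as JSON text."""
    schema = json.loads(schema_text)
    if not isinstance(schema, dict) or schema.get("type") != "record":
        raise ValueError("expected a top-level Avro record schema")
    return [field["name"] for field in schema.get("fields", [])]


def diff_columns(metastore_cols, schema_cols):
    """Compare column names case-insensitively.

    Returns (missing, stale): names present only in the schema file,
    and names present only in the metastore column list.
    """
    stored = [c.lower() for c in metastore_cols]
    effective = [c.lower() for c in schema_cols]
    missing = [c for c in effective if c not in stored]
    stale = [c for c in stored if c not in effective]
    return missing, stale


# Hypothetical schema contents; a real check would read the avro.schema.url file.
EXAMPLE_SCHEMA = """
{
  "type": "record",
  "name": "example_record",
  "fields": [
    {"name": "id", "type": "long"},
    {"name": "name", "type": "string"},
    {"name": "email", "type": ["null", "string"], "default": null}
  ]
}
"""

if __name__ == "__main__":
    schema_cols = avro_field_names(EXAMPLE_SCHEMA)
    # Hypothetical column names as they might come back from getCols().
    missing, stale = diff_columns(["id", "name", "old_col"], schema_cols)
    print("missing from metastore:", missing)
    print("stale in metastore:", stale)
```

A non-empty result on either side would indicate that the metastore columns and the schema file have drifted apart. Partition columns such as str_part appear in neither list, so they do not affect the comparison.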



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
