hive-dev mailing list archives

From "Mohammad Kamrul Islam (JIRA)" <j...@apache.org>
Subject [jira] [Assigned] (HIVE-5803) Support CTAS from a non-avro table to an avro table
Date Tue, 12 Nov 2013 23:04:17 GMT

     [ https://issues.apache.org/jira/browse/HIVE-5803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mohammad Kamrul Islam reassigned HIVE-5803:
-------------------------------------------

    Assignee: Carl Steinbach

> Support CTAS from a non-avro table to an avro table
> ---------------------------------------------------
>
>                 Key: HIVE-5803
>                 URL: https://issues.apache.org/jira/browse/HIVE-5803
>             Project: Hive
>          Issue Type: Task
>            Reporter: Mohammad Kamrul Islam
>            Assignee: Carl Steinbach
>
> Hive currently does not work correctly with HQL like:
> CREATE TABLE <AVRO-BASED-TABLE> as SELECT * from <NON_AVRO_TABLE>;
> Actually, the statement itself runs successfully, but a subsequent "SELECT * from <AVRO-BASED-TABLE> .." fails.
> This JIRA depends on HIVE-3159, which translates TypeInfo to an Avro schema.
> Findings so far: CTAS uses internal column names (instead of the column names provided in the select) when creating the Avro data file. In other words, the Avro data file has column names of the form _col0, _col1, whereas the table column names are different.
> I tested with the following test case and it failed:
> - verify 1) can create a table using create table as select from a non-avro table 2) LOAD avro data into the new table and read data from the new table
> CREATE TABLE simple_kv_txt (key STRING, value STRING) STORED AS TEXTFILE;
> DESCRIBE simple_kv_txt;
> LOAD DATA LOCAL INPATH '../data/files/kv1.txt' INTO TABLE simple_kv_txt;
> SELECT * FROM simple_kv_txt ORDER BY KEY;
> CREATE TABLE copy_doctors
>   ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
>   STORED AS INPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
>   OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
>   as SELECT key as key, value as value FROM simple_kv_txt;
> DESCRIBE copy_doctors;
> SELECT * FROM copy_doctors;
>  
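
The name mismatch described in the findings above can be illustrated with a small sketch. Avro resolves a reader schema against a writer schema by field name, so a data file written with _col0, _col1 has nothing to offer a table schema that asks for key, value. The code below is a simplified stand-in for that resolution step, not Hive or Avro library code: resolve_record, SchemaMismatch, NO_DEFAULT and the sample row are all hypothetical names and values chosen for illustration, and the actual failure path inside Hive may differ.

```python
# Simplified illustration of name-based schema resolution.
# Hypothetical sketch: none of these names come from Hive or the Avro library.

NO_DEFAULT = object()  # sentinel meaning "this reader field has no default value"


class SchemaMismatch(Exception):
    """Raised when a reader field cannot be matched against the writer schema."""


def resolve_record(writer_names, reader_fields, values):
    """Project one written row onto the reader's fields, matching by name.

    writer_names:  field names stored in the data file, in write order
    reader_fields: list of (name, default) pairs expected by the table
    values:        the row's values, in the same order as writer_names
    """
    written = dict(zip(writer_names, values))
    resolved = {}
    for name, default in reader_fields:
        if name in written:
            resolved[name] = written[name]
        elif default is not NO_DEFAULT:
            resolved[name] = default
        else:
            raise SchemaMismatch(
                f"reader field {name!r} not found in writer schema {writer_names}"
            )
    return resolved


# Table columns as declared by the CTAS select list (assumed to have no defaults).
table_columns = [("key", NO_DEFAULT), ("value", NO_DEFAULT)]
row = ["k1", "v1"]  # made-up sample row

# File written with the names from the SELECT: resolution succeeds.
ok = resolve_record(["key", "value"], table_columns, row)

# File written with CTAS internal names: resolution fails.
try:
    resolve_record(["_col0", "_col1"], table_columns, row)
    failed = False
    error_message = ""
except SchemaMismatch as exc:
    failed = True
    error_message = str(exc)
```

In this sketch the first call returns the row keyed by the table's column names, while the second raises because neither key nor value appears among the written field names. That is consistent with the reported symptom that the CTAS itself succeeds and only the later read fails.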



