hive-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Jakob Homan (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-3442) AvroSerDe WITH SERDEPROPERTIES 'schema.url' is not working when creating external table
Date Fri, 07 Sep 2012 00:12:07 GMT

    [ https://issues.apache.org/jira/browse/HIVE-3442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13450198#comment-13450198
] 

Jakob Homan commented on HIVE-3442:
-----------------------------------

The docs are out of date (my fault).  schema.url and schema.literal got changed to avro.schema.url
and avro.schema.literal during the move to Apache, to be more specific to Avro.  Try with
those.  I'll update the wiki.
                
> AvroSerDe WITH SERDEPROPERTIES 'schema.url' is not working when creating external table
> ---------------------------------------------------------------------------------------
>
>                 Key: HIVE-3442
>                 URL: https://issues.apache.org/jira/browse/HIVE-3442
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 0.10.0
>            Reporter: Zhenxiao Luo
>            Assignee: Zhenxiao Luo
>             Fix For: 0.10.0
>
>
> After creating a table and load data into it, I could check that the table is created
successfully, and data is inside:
> DROP TABLE IF EXISTS ml_items;
> CREATE TABLE ml_items(id INT,
>                       title STRING,
>                       release_date STRING,
>                       video_release_date STRING,
>                       imdb_url STRING,
>                       unknown_genre TINYINT,
>                       action TINYINT,
>                       adventure TINYINT,
>                       animation TINYINT,
>                       children TINYINT,
>                       comedy TINYINT,
>                       crime TINYINT,
>                       documentary TINYINT,
>                       drama TINYINT,
>                       fantasy TINYINT,
>                       film_noir TINYINT,
>                       horror TINYINT,
>                       musical TINYINT,
>                       mystery TINYINT,
>                       romance TINYINT,
>                       sci_fi TINYINT,
>                       thriller TINYINT,
>                       war TINYINT,
>                       western TINYINT)
>   ROW FORMAT DELIMITED FIELDS TERMINATED BY '|' LINES TERMINATED BY '\n'
>   STORED AS TEXTFILE;
> LOAD DATA LOCAL INPATH '../data/files/avro_items' INTO TABLE ml_items;
> select * from ml_items ORDER BY id ASC;
> While, the following create external table with AvroSerDe is not working:
> DROP TABLE IF EXISTS ml_items_as_avro;
> CREATE EXTERNAL TABLE ml_items_as_avro
>   ROW FORMAT SERDE
>   'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
>   WITH SERDEPROPERTIES (
>     'schema.url'='${system:test.src.data.dir}/files/avro_items_schema.avsc')
>   STORED as INPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
>   OUTPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
>   LOCATION 'file:${system:test.tmp.dir}/hive-ml-items';
> describe ml_items_as_avro;
> INSERT OVERWRITE TABLE ml_items_as_avro
>   SELECT id, title,
>     imdb_url, unknown_genre, action, adventure, animation, children, comedy, crime,
>     documentary, drama, fantasy, film_noir, horror, musical, mystery, romance,
>     sci_fi, thriller, war, western
>   FROM ml_items;
> ml_items_as_avro is not created with expected schema, as shown in the "describe ml_items_as_avro"
output. The output is below:
> PREHOOK: query: DROP TABLE IF EXISTS ml_items_as_avro
> PREHOOK: type: DROPTABLE
> POSTHOOK: query: DROP TABLE IF EXISTS ml_items_as_avro
> POSTHOOK: type: DROPTABLE
> PREHOOK: query: CREATE EXTERNAL TABLE ml_items_as_avro
>   ROW FORMAT SERDE
>   'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
>   WITH SERDEPROPERTIES (
>     'schema.url'='/home/cloudera/Code/hive/data/files/avro_items_schema.avsc')
>   STORED as INPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
>   OUTPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
>   LOCATION 'file:/home/cloudera/Code/hive/build/ql/tmp/hive-ml-items'
> PREHOOK: type: CREATETABLE
> POSTHOOK: query: CREATE EXTERNAL TABLE ml_items_as_avro
>   ROW FORMAT SERDE
>   'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
>   WITH SERDEPROPERTIES (
>     'schema.url'='/home/cloudera/Code/hive/data/files/avro_items_schema.avsc')
>   STORED as INPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
>   OUTPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
>   LOCATION 'file:/home/cloudera/Code/hive/build/ql/tmp/hive-ml-items'
> POSTHOOK: type: CREATETABLE
> POSTHOOK: Output: default@ml_items_as_avro
> PREHOOK: query: describe ml_items_as_avro
> PREHOOK: type: DESCTABLE
> POSTHOOK: query: describe ml_items_as_avro
> POSTHOOK: type: DESCTABLE
> error_error_error_error_error_error_error   string  from deserializer
> cannot_determine_schema string  from deserializer
> check   string  from deserializer
> schema  string  from deserializer
> url string  from deserializer
> and string  from deserializer
> literal string  from deserializer
> FAILED: SemanticException [Error 10044]: Line 3:23 Cannot insert into target table because
column number/types are different 'ml_items_as_avro': Table insclause-0 has 7 columns, but
query has 22 columns.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

Mime
View raw message