hive-dev mailing list archives

From "Thulasi Ram Naidu P (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HIVE-2443) Wrong delimiter is getting picked up for structs inside an array.
Date Tue, 13 Sep 2011 19:45:09 GMT
Wrong delimiter is getting picked up for structs inside an array.
-----------------------------------------------------------------

                 Key: HIVE-2443
                 URL: https://issues.apache.org/jira/browse/HIVE-2443
             Project: Hive
          Issue Type: Bug
          Components: Serializers/Deserializers
            Reporter: Thulasi Ram Naidu P
            Priority: Minor


I am trying to create a table with multiple levels of delimiters, but the default LazySimpleSerDe
does not pick up the second delimiter, which I specified using COLLECTION ITEMS TERMINATED BY,
when deserializing a struct inside an array.
My table looks like this:

create external table if not exists mytable(col1 bigint, col2 string,
col3 string, col4 double, col5 double, col6 double, col7 double, col8
array<struct<id1:string, id2:string, id3:string, id4:string,
id5:int>>)
       ROW FORMAT DELIMITED
       FIELDS TERMINATED BY '\t'
       COLLECTION ITEMS TERMINATED BY ',:'
Location '<FILEPATH>';

Input data:

123456  XYZ1    RANDOM  1       1       1       1       x1:y1:z1:w1:5,x2:y2:z2:w1:5

When I run "select * from mytable", I expect the output to be:
123456  XYZ1    RANDOM  1.0     1.0     1.0     1.0     [{"id1":"x1","id2":"y1","id3":"z1","id4":"w1","id5":5},{"id1":"x2","id2":"y2","id3":"z2","id4":"w1","id5":5}]

However, it returns:
123456  XYZ1    RANDOM  1.0     1.0     1.0     1.0     [{"id1":"x1:y1:z1:w1:5","id2":null,"id3":null,"id4":null,"id5":null},{"id1":"x2:y2:z2:w1:5","id2":null,"id3":null,"id4":null,"id5":null}]

But when I changed the table definition to:
create external table if not exists mytable(col1 bigint, col2 string,
col3 string, col4 double, col5 double, col6 double, col7 double, col8
array<struct<id1:string, id2:string, id3:string, id4:string,
id5:int>>)
       ROW FORMAT DELIMITED
       FIELDS TERMINATED BY '\t'
       COLLECTION ITEMS TERMINATED BY ','
       MAP KEYS TERMINATED BY ':'

With this definition, the select query returns the values correctly.
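
The two results above are consistent with one separator being applied per nesting level. The following Python sketch is only a model of that behavior, not Hive's implementation. It assumes that only the first character of a multi-character delimiter such as ',:' is used, that the struct inside the array is split on the separator set by MAP KEYS TERMINATED BY, and that this separator defaults to the control character \003 when not specified. The function name parse_col8 and the STRUCT_FIELDS list are hypothetical names introduced for illustration.

```python
import json

# Hypothetical model of level-based delimiters; this is not Hive's actual code.
# Field names and types are taken from the col8 struct in the table definition.
STRUCT_FIELDS = [("id1", str), ("id2", str), ("id3", str), ("id4", str), ("id5", int)]


def parse_col8(raw, collection_delim, map_key_delim="\x03"):
    """Split an array<struct<...>> value using one separator per nesting level."""
    # Assumption: only the first character of each delimiter string is used.
    array_sep = collection_delim[0]
    # Assumption: the struct nested in the array uses the next-level separator,
    # i.e. the one configured with MAP KEYS TERMINATED BY.
    struct_sep = map_key_delim[0]

    rows = []
    for element in raw.split(array_sep):
        parts = element.split(struct_sep)
        row = {}
        for i, (name, cast) in enumerate(STRUCT_FIELDS):
            # Missing trailing fields become None, shown as null in the output.
            row[name] = cast(parts[i]) if i < len(parts) else None
        rows.append(row)
    return rows


raw = "x1:y1:z1:w1:5,x2:y2:z2:w1:5"

# First table definition: COLLECTION ITEMS TERMINATED BY ',:'
reported = parse_col8(raw, ",:")
# Second table definition: COLLECTION ITEMS ',' plus MAP KEYS ':'
fixed = parse_col8(raw, ",", ":")

print(json.dumps(reported))
print(json.dumps(fixed))
```

Under these assumptions, the first call reproduces the reported output (the whole element lands in id1 and the remaining fields are null), and the second call reproduces the expected output.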

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

       
