hive-user mailing list archives

From Oleg Ruchovets <oruchov...@gmail.com>
Subject Re: HIVE ORC table returns NULLs ( EMR 5.9 Hive 2.3.0 )
Date Wed, 25 Oct 2017 07:04:35 GMT
Yes, that is exactly my point. Since the file has the data (the ORC file is
valid), why does Hive return NULLs?
I tested it on S3 and HDFS, through both hive and beeline; the behavior is
the same:

    select count (*) returns 10.
    select * returns NULLs ...

How can I debug this problem? Is there any configuration or logging I should
enable? I am using the EMR defaults.

Please advise.
Thanks, Oleg.
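
One check that might narrow this down (a guess, not something confirmed in
this thread): the dump quoted below shows the file's fields as "Id" and
"Name", while the query result header shows the columns as "id" and "name".
If the reader maps columns by name and the comparison is case sensitive, that
would give a correct row count with NULL values. Whether Hive 2.3.0 on EMR 5.9
actually behaves this way is an assumption. The sketch below only compares the
two sets of names; the helper names are made up for illustration.

```python
import json

# Rows as printed by the orc-tools dump quoted below (first two shown).
dump_lines = [
    '{"Id":1,"Name":"Singapore"}',
    '{"Id":2,"Name":"Malaysia"}',
]

# Column names as they appear in the query result header (Table1.id, Table1.name).
hive_columns = ["id", "name"]


def file_field_names(lines):
    """Collect the field names found in the dumped rows, in first-seen order."""
    names = []
    for line in lines:
        for key in json.loads(line):
            if key not in names:
                names.append(key)
    return names


def compare_names(hive_cols, file_fields):
    """Classify each Hive column: exact match, case-only mismatch, or missing."""
    lowered = {f.lower(): f for f in file_fields}
    report = {}
    for col in hive_cols:
        if col in file_fields:
            report[col] = "exact"
        elif col.lower() in lowered:
            report[col] = "case-only mismatch with " + lowered[col.lower()]
        else:
            report[col] = "missing"
    return report


report = compare_names(hive_columns, file_field_names(dump_lines))
for col, status in report.items():
    print(col, "->", status)
```

If the only difference is case, rewriting the file with lowercase field names
and querying it again would be a cheap way to confirm or rule out this guess.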

On Wed, Oct 25, 2017 at 2:30 PM, Owen O'Malley <owen.omalley@gmail.com>
wrote:

> The file has the data. I'm not sure what Hive is doing wrong.
>
> owen@laptop> java -jar ../tools/target/orc-tools-1.5.0-SNAPSHOT-uber.jar
>> data ~/Downloads/Country.orc
>> Processing data file /Users/owen/Downloads/Country.orc [length: 392]
>> {"Id":1,"Name":"Singapore"}
>> {"Id":2,"Name":"Malaysia"}
>> {"Id":3,"Name":"India"}
>> {"Id":4,"Name":"Hong Kong"}
>> {"Id":5,"Name":"Macau"}
>> {"Id":6,"Name":"Thailand"}
>> {"Id":7,"Name":"Indonesia"}
>> {"Id":8,"Name":"Philippines"}
>> {"Id":9,"Name":"Dubai"}
>> {"Id":10,"Name":"Vietnam"}
>> ____________________________________________________________
>> ____________________________________________________________
>
>
>  .. Owen
>
> On Tue, Oct 24, 2017 at 11:11 PM, Oleg Ruchovets <oruchovets@gmail.com>
> wrote:
>
>> I am creating a Hive external table stored as ORC (the ORC file is located on S3).
>>
>> *Command*
>>
>> CREATE EXTERNAL TABLE Table1 (Id INT, Name STRING) STORED AS ORC LOCATION 's3://bucket_name'
>>
>> *After running the query*:
>>
>> Select * from Table1;
>>
>> *Result is*:
>>
>> +------------+--------------+
>> | Table1.id  | Table1.name  |
>> +------------+--------------+
>> | NULL       | NULL         |
>> | NULL       | NULL         |
>> | NULL       | NULL         |
>> | NULL       | NULL         |
>> | NULL       | NULL         |
>> | NULL       | NULL         |
>> | NULL       | NULL         |
>> | NULL       | NULL         |
>> | NULL       | NULL         |
>> | NULL       | NULL         |
>> +------------+--------------+
>>
>> Interestingly, the number of returned records (10) is correct, but all
>> values are NULL. What is wrong? Why does the query return only NULLs? I am
>> using EMR instances on AWS. Is there something I should configure or check
>> to support the ORC format in Hive?
>>
>> ORC file attached
>>
>
>
