hive-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Tor Ivry <tork...@gmail.com>
Subject Re: Hive queries returning all NULL values.
Date Tue, 26 Aug 2014 08:05:57 GMT
Raymond - you were the closest.
Parquet field names contained '::' ex. bag1::user_name

Hope it will help anyone in the future

Thanks for all your help

Tor



On Sun, Aug 17, 2014 at 7:50 PM, Raymond Lau <raymond.lau.ca@gmail.com>
wrote:

> Do your field names in your parquet files contain upper case letters by
> any chance ex. userName?  Hive will not read the data of external tables if
> they are not completely lower case field names, it doesn't convert them
> properly in the case of external tables.
> On Aug 17, 2014 8:00 AM, "hadoop hive" <hadoophive@gmail.com> wrote:
>
>> Take a small set of data like 2-5 line and insert it...
>>
>> After that you can try insert first 10 column and then next 10 till you
>> fund your problematic column
>> On Aug 17, 2014 8:37 PM, "Tor Ivry" <torkale@gmail.com> wrote:
>>
>>> Is there any way to debug this?
>>>
>>> We are talking about many fields here.
>>> How can I see which field has the mismatch?
>>>
>>>
>>>
>>> On Sun, Aug 17, 2014 at 4:30 PM, hadoop hive <hadoophive@gmail.com>
>>> wrote:
>>>
>>>> Hi,
>>>>
>>>> You check the data type you have provided while creating external
>>>> table, it should match with data in files.
>>>>
>>>> Thanks
>>>> Vikas Srivastava
>>>> On Aug 17, 2014 7:07 PM, "Tor Ivry" <torkale@gmail.com> wrote:
>>>>
>>>>>  Hi
>>>>>
>>>>>
>>>>>
>>>>> I have a hive (0.11) table with the following create syntax:
>>>>>
>>>>>
>>>>>
>>>>> CREATE EXTERNAL TABLE events(
>>>>>
>>>>> …
>>>>>
>>>>> )
>>>>>
>>>>> PARTITIONED BY(dt string)
>>>>>
>>>>>   ROW FORMAT SERDE 'parquet.hive.serde.ParquetHiveSerDe'
>>>>>
>>>>>   STORED AS
>>>>>
>>>>>     INPUTFORMAT "parquet.hive.DeprecatedParquetInputFormat"
>>>>>
>>>>>     OUTPUTFORMAT "parquet.hive.DeprecatedParquetOutputFormat"
>>>>>
>>>>>     LOCATION '/data-events/success’;
>>>>>
>>>>>
>>>>>
>>>>> Query runs fine.
>>>>>
>>>>>
>>>>> I add hdfs partitions (containing snappy.parquet files).
>>>>>
>>>>>
>>>>>
>>>>> When I run
>>>>>
>>>>> hive
>>>>>
>>>>> > select count(*) from events where dt=“20140815”
>>>>>
>>>>> I get the correct result
>>>>>
>>>>>
>>>>>
>>>>> *Problem:*
>>>>>
>>>>> When I run
>>>>>
>>>>> hive
>>>>>
>>>>> > select * from events where dt=“20140815” limit 1;
>>>>>
>>>>> I get
>>>>>
>>>>> OK
>>>>>
>>>>> NULL NULL NULL NULL NULL NULL NULL 20140815
>>>>>
>>>>>
>>>>>
>>>>> *The same query in Impala returns the correct values.*
>>>>>
>>>>>
>>>>>
>>>>> Any idea what could be the issue?
>>>>>
>>>>>
>>>>>
>>>>> Thanks
>>>>>
>>>>> Tor
>>>>>
>>>>
>>>

Mime
View raw message