hive-user mailing list archives

From Ch Wan <xmu.wc.2...@gmail.com>
Subject Re: orc ppd bug report
Date Thu, 22 Jan 2015 15:52:30 GMT
Hi. We currently use Hive 0.13.1 and have encountered this bug.

I tried the test case on trunk and got the correct result. But with some
more complex SQL, the result is still incorrect.

I found that the patch for HIVE-8707
<https://issues.apache.org/jira/browse/HIVE-8707> makes the simple test
case work, but it does not fix this bug.
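
To make the prefix-match problem from the original report (quoted below)
concrete, here is a minimal Java sketch. It is not Hive's code: the class
name, the method names, the warehouse paths, and the file name 000000_0 are
all assumptions made for illustration. It only shows how a plain string
prefix test associates a file of test_orc_src2 with the alias of
test_orc_src, and how a match that respects directory boundaries (one
possible alternative, not necessarily what Hive should do) avoids that.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class PrefixMatchSketch {

    // Hypothetical table-path -> alias mapping for the two tables in the test case.
    // The warehouse layout is an assumption, not taken from the report.
    static Map<String, List<String>> pathToAliases() {
        Map<String, List<String>> m = new LinkedHashMap<>();
        m.put("/warehouse/test.db/test_orc_src", Arrays.asList("tb"));
        m.put("/warehouse/test.db/test_orc_src2", Arrays.asList("tm"));
        return m;
    }

    // Plain string prefix match: any table path that is a string prefix of
    // the split path contributes its aliases.
    static List<String> aliasesByPrefix(Map<String, List<String>> m, String splitPath) {
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, List<String>> e : m.entrySet()) {
            if (splitPath.startsWith(e.getKey())) {
                out.addAll(e.getValue());
            }
        }
        return out;
    }

    // Boundary-aware match: the split path must equal the table path or lie
    // underneath it as a directory (table path followed by '/').
    static List<String> aliasesByDirectory(Map<String, List<String>> m, String splitPath) {
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, List<String>> e : m.entrySet()) {
            String dir = e.getKey();
            if (splitPath.equals(dir) || splitPath.startsWith(dir + "/")) {
                out.addAll(e.getValue());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, List<String>> m = pathToAliases();
        String split = "/warehouse/test.db/test_orc_src2/000000_0";
        System.out.println(aliasesByPrefix(m, split));    // [tb, tm]
        System.out.println(aliasesByDirectory(m, split)); // [tm]
    }
}
```

With the plain prefix test, a file belonging to test_orc_src2 also picks up
alias tb, so the column list built for that file can include columns that
belong to the other table.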

2015-01-07 13:27 GMT+08:00 wzc <wzc1989@gmail.com>:

> I tried on Hive trunk last time and it didn't work. I'll try again.
> Thank you for your help.
>
> On Wed, Jan 7, 2015 at 02:29 Prasanth Jayachandran <
> pjayachandran@hortonworks.com> wrote:
>
>> Hi
>>
>> Which version of Hive are you using? I tried your test case on Hive trunk
>> and it seems to work fine. With PPD both enabled and disabled, I get 3 as
>> the result.
>>
>> - Prasanth
>>
>>
>> On Sun, Jan 4, 2015 at 3:04 PM, wzc <wzc1989@gmail.com> wrote:
>>
>>> We recently found a bug with ORC PPD. Here is the test case:
>>>
>>>  use test;
>>> create table if not exists test_orc_src (a int, b int, c int)
>>> stored as orc;
>>> create table if not exists test_orc_src2 (a int, b int, d int)
>>> stored as orc;
>>> insert overwrite table test_orc_src select 1,2,3 from dim.city
>>> limit 1;
>>> insert overwrite table test_orc_src2 select 1,2,4 from dim.city
>>> limit 1;
>>> set hive.auto.convert.join = false;
>>> select
>>>   tb.c
>>> from test.test_orc_src tb
>>> join test.test_orc_src2 tm
>>> on tb.a = tm.a
>>> where tb.b = 2
>>>
>>> The correct answer for the above query is 3, but it returns an empty
>>> result. We found that ORC PPD uses the READ_COLUMN_NAMES_CONF_STR
>>> property to get the required column list, and that list is not
>>> constructed correctly when one table's storage path is a prefix of
>>> another table's path. This bug is related to HIVE-1903
>>> <https://issues.apache.org/jira/browse/HIVE-1903>. In
>>> HiveInputFormat#pushProjectionsAndFilters, a prefix match is used to get
>>> all aliases associated with the given path, which I think is not
>>> suitable. I don't know why a prefix match is used here instead of an
>>> exact match.
>>> Any help is appreciated.
>>
>
>
