hive-user mailing list archives

From Abhay Bansal <abhaybansal.1...@gmail.com>
Subject Re: Predicate pushdown optimisation not working for ORC
Date Thu, 03 Apr 2014 06:00:08 GMT
I was able to resolve the issue by setting "hive.optimize.index.filter" to
true.
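
For anyone finding this thread later, here is a minimal sketch of how that could look at the session level. It assumes the same "test" table and the same query as in my original mail; I believe the property can also be set in the Hive configuration file in the same <property> form as hive.optimize.ppd, but I only tried it per session.

```sql
-- Assumption: set per session from the Hive CLI before running the query.
-- hive.optimize.index.filter lets Hive hand the WHERE filter to the ORC reader.
SET hive.optimize.index.filter=true;

-- Same query as in the original mail (table name and literal are from that mail).
SELECT sourceipv4address, sessionid, url
FROM test
WHERE sourceipv4address = "dummy";
```

After this, the task syslog should show an "ORC pushdown predicate" line instead of "No ORC pushdown predicate".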

In the Hadoop logs:
syslog:2014-04-03 05:44:51,204 INFO
org.apache.hadoop.hive.ql.io.orc.OrcInputFormat: included column ids =
3,8,13
syslog:2014-04-03 05:44:51,204 INFO
org.apache.hadoop.hive.ql.io.orc.OrcInputFormat: included columns names =
sourceipv4address,sessionid,url
syslog:2014-04-03 05:44:51,216 INFO
org.apache.hadoop.hive.ql.io.orc.OrcInputFormat: ORC pushdown predicate:
leaf-0 = (EQUALS sourceipv4address 1809657989)

I can now see the ORC pushdown predicate.

Thanks,
-Abhay


On Thu, Apr 3, 2014 at 11:14 AM, Stephen Boesch <javadba@gmail.com> wrote:

> Hi Abhay,
>   What is the DDL for your "test" table?
>
>
> 2014-04-02 22:36 GMT-07:00 Abhay Bansal <abhaybansal.1988@gmail.com>:
>
>> I am new to Hive, so apologies for asking such a basic question.
>>
>> The following exercise was done with Hive 0.12 and Hadoop 0.20.203.
>>
>> I created an ORC file from Java and pushed it into a table with the same
>> schema. I checked the conf property
>> <property><name>hive.optimize.ppd</name><value>true</value></property>
>> which should ideally enable the PPD optimisation.
>>
>> I ran a query "select sourceipv4address,sessionid,url from test where
>> sourceipv4address="dummy";"
>>
>> Just to see if the PPD optimisation was working, I checked the Hadoop logs,
>> where I found:
>>
>> ./userlogs/job_201404010833_0036/attempt_201404010833_0036_m_000000_0/syslog:2014-04-03
>> 05:01:39,913 INFO org.apache.hadoop.hive.ql.io.orc.OrcInputFormat: included
>> column ids = 3,8,13
>> ./userlogs/job_201404010833_0036/attempt_201404010833_0036_m_000000_0/syslog:2014-04-03
>> 05:01:39,914 INFO org.apache.hadoop.hive.ql.io.orc.OrcInputFormat: included
>> columns names = sourceipv4address,sessionid,url
>> ./userlogs/job_201404010833_0036/attempt_201404010833_0036_m_000000_0/syslog:2014-04-03
>> 05:01:39,914 INFO org.apache.hadoop.hive.ql.io.orc.OrcInputFormat: *No
>> ORC pushdown predicate*
>>
>> I am not sure which part I missed. Any help would be appreciated.
>>
>> Thanks,
>> -Abhay
>>
>
>
