hive-user mailing list archives

From Nitin Pawar <>
Subject Re: nested UDFs on Partition column
Date Thu, 19 Apr 2012 10:50:35 GMT
As per my understanding:

In this case Hive has to look through all the partitions, because it does
not know the value beforehand when it checks the partitions. Note that the
UDFs are executed in the MapReduce job and not on the Hive client side.

I would suggest you write the Hive query in a file and replace the partition
value with a variable, something like:

for partition in $values; do
          hive $HIVEPARAMS -hiveconf partition=$partition -f hivequery.hql
done

and then in hivequery.hql you can refer to the variable with

where column_name = '${hiveconf:partition}'

I may be wrong in interpreting the execution pattern of the Hive query, but
this approach solved my problem.
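
For the "3 days ago" case from the original question, the date can be computed in the shell before Hive is started, so the query only ever sees a constant. This is a rough sketch, not a tested production script: it assumes GNU date (the -d option), the function names partition_date and run_for_partition are made up for illustration, and HIVE_BIN is a hypothetical override that is only there so the command can be checked without a Hive install.

```shell
#!/bin/sh
# Sketch: compute the partition value on the client side and hand it to
# Hive as a hiveconf variable, so the WHERE clause compares against a constant.

# Print the date N days before today as YYYY-MM-DD.
# Assumes GNU date; BSD/macOS date uses different flags.
partition_date() {
    date -d "$1 days ago" +%Y-%m-%d
}

# Run the query file for one partition value.
# hivequery.hql is expected to contain something like:
#   ... WHERE column_name = '${hiveconf:partition}'
# HIVE_BIN is a hypothetical override; it defaults to the real hive CLI.
run_for_partition() {
    "${HIVE_BIN:-hive}" -hiveconf partition="$1" -f hivequery.hql
}

# Example call (commented out so that sourcing this file starts no job):
# run_for_partition "$(partition_date 3)"
```

Setting HIVE_BIN=echo prints the arguments instead of launching Hive, which is a cheap way to check the substituted value first.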

On Thu, Apr 19, 2012 at 3:27 PM, Ramkumar <> wrote:

> Hi,
> I have a table partitioned by local_date.  When I write a query with
> WHERE local_date = date_add('2011-12-07',3) ,
> hive executes the UDF ahead and looks only into the specific partitions.
> But when the udf becomes more complex like
> WHERE local_date = date_sub(to_date(from_unixtime(unix_timestamp())),3),
> hive looks through all the partitions even though the above function  can
> very well be computed ahead of time and optimize the query.  Is this
> behaviour intentional ? And is there a workaround other than hardcoding the
> date or using a param?
> Thanks,
> Ramkumar

Nitin Pawar
