hive-user mailing list archives

From "pandees waran" <pande...@gmail.com>
Subject Re: UDAF vs built in function
Date Sun, 11 Aug 2013 12:46:00 GMT
Thanks Nitin.
I am using the maxrow UDAF to obtain the latest record within a group
(https://github.com/scribd/hive-udaf-maxrow).
This can easily be achieved with the row_number function available in 0.11.
Though the functionality is the same, will there be any difference in the execution layer (resource
usage) if I use row_number instead of the UDAF?
In general, when a UDAF and a built-in function offer the same functionality, which one should
we prefer?
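
To make the row_number approach concrete, here is a minimal sketch of picking the latest record per group. The table name events and the columns user_id, ts, and payload are made up for illustration, and SQLite (3.25 or later, through Python's sqlite3 module) stands in for Hive so the query can be run locally. The SELECT uses standard windowing syntax that should carry over to Hive 0.11, but that is an assumption rather than something verified on a cluster, and the sketch says nothing about relative resource usage.

```python
import sqlite3

# Hypothetical table and column names; SQLite stands in for Hive here.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (user_id TEXT, ts INTEGER, payload TEXT)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?, ?)",
    [
        ("a", 1, "a-old"),
        ("a", 3, "a-new"),
        ("b", 2, "b-only"),
        ("a", 2, "a-mid"),
    ],
)

# Number the rows of each group from newest to oldest, then keep row 1.
LATEST_PER_GROUP = """
SELECT user_id, ts, payload
FROM (
    SELECT user_id, ts, payload,
           ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY ts DESC) AS rn
    FROM events
) ranked
WHERE rn = 1
ORDER BY user_id
"""

latest = conn.execute(LATEST_PER_GROUP).fetchall()
print(latest)  # [('a', 3, 'a-new'), ('b', 2, 'b-only')]
```

The inner query only assigns a rank within each group; the outer filter on rn = 1 is what reduces each group to its latest row, which is the same result the maxrow UDAF is used for here.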


On Sun, Aug 11, 2013 at 6:09 pm, Nitin Pawar <nitinpawar432@gmail.com> wrote:
Can you explain what exactly you want to achieve?

Are you trying to replace the UDFs you have written with the ones shipped with hive-0.11?

Windowing functions are a great feature when you want to lead, lag, or rank rows based
on values within a few rows grouped together.

If you can explain your use case in detail along with your approach, people may be able to suggest
some improvements.



On Sun, Aug 11, 2013 at 4:53 PM, pandees waran <pandeesh@gmail.com> wrote:
Hi,


Before the advent of windowing and analytic functions in 0.11, we depended on UDAFs to achieve
the desired analytical functionality in 0.8.

So, I would like to replace the UDAFs with the windowing and analytic functions available as
part of the 0.11 upgrade.
Are there any best practices for choosing between a UDAF and a built-in function? Will there
be any performance gain from this?

Please let me know your views.


Thanks,
Pandeesh

-- 
Nitin Pawar