hive-user mailing list archives

From Aurora Skarra-Gallagher <aur...@yahoo-inc.com>
Subject Re: UDAF documentation
Date Sat, 12 Mar 2011 00:44:19 GMT
Hi,

Did you actually call those functions directly from your unit tests? I'm looking for examples
of that working, but all I can find are tests that check whether a query produces the
expected output (rather than testing the UDAF directly).

-Aurora

On Mar 11, 2011, at 3:44 PM, Christopher, Pat wrote:

> Awesome, awesome.  That's what I had pieced together from Steve and Ed's emails.  Glad to get confirmation on it.
> 
> It's also what I did for my unit testing.  I also called everything with null arguments to make sure those got handled gracefully.
> 
> Pat
> 
> -----Original Message-----
> From: Aurora Skarra-Gallagher [mailto:aurora@yahoo-inc.com] 
> Sent: Friday, March 11, 2011 3:40 PM
> To: user@hive.apache.org
> Cc: Steven Wong
> Subject: Re: UDAF documentation
> 
> Hadoop: The Definitive Guide has a good section on this. Chapter 12: Hive: User Defined Functions. It has a diagram that shows how things are called and when. The example I'm looking at shows this sequence:
> 
> (first instance)
> init()
> iterate(1)
> iterate(2)
> iterate(3)
> terminatePartial()
> 
> (second instance)
> init()
> iterate(4)
> iterate(2)
> terminatePartial()
> 
> (then)
> init()
> merge(3)
> merge(4)
> terminate()
> 
> The UDAF being described is a max integer function, which is why merge() is called with the highest integer from each instance.
> 
> -Aurora
> 
> On Mar 11, 2011, at 9:54 AM, Christopher, Pat wrote:
> 
>> Ahh, perfect.  The docs don't agree terribly well but the case study is great.  The context for when merge() gets called was not clear to me.
>> 
>> Thanks guys!
>> 
>> Pat
>> 
>> -----Original Message-----
>> From: Steven Wong [mailto:swong@netflix.com] 
>> Sent: Thursday, March 10, 2011 6:24 PM
>> To: user@hive.apache.org
>> Cc: Christopher, Pat
>> Subject: RE: UDAF documentation
>> 
>> Take a look at http://wiki.apache.org/hadoop/Hive/GenericUDAFCaseStudy, in case you haven't found it already.
>> 
>> 
>> -----Original Message-----
>> From: Edward Capriolo [mailto:edlinuxguru@gmail.com] 
>> Sent: Thursday, March 10, 2011 6:18 PM
>> To: user@hive.apache.org
>> Cc: Christopher, Pat
>> Subject: Re: UDAF documentation
>> 
>> On Thu, Mar 10, 2011 at 8:27 PM, Christopher, Pat
>> <patrick.christopher@hp.com> wrote:
>>> Hi Guys,
>>> 
>>> I'm writing a UDAF to run against hive 0.5 or hive 0.7.  The documentation I
>>> can find says to implement UDAFEvaluator and ensure that you implement
>>> init() , aggregate() and evaluate().  However, all of the examples I can
>>> find implement init(), iterate(), merge(), terminatePartial() and
>>> terminate().
>>> 
>>> 
>>> 
>>> What's the difference, and where can I find the documentation on how to write
>>> a UDAF?
>>> 
>>> 
>>> 
>>> Thanks,
>>> 
>>> Pat
>> 
>> At times the documentation may lag behind the code. I would check out
>> the Hive source code for the version you are working with and base
>> your work on existing UDAFs that are similar.
>> 
>> Edward
>> 
> 
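The call sequence quoted above (init, iterate, terminatePartial on each instance, then init, merge, terminate) can be exercised directly, without running a query. The following is only a minimal sketch: MaxIntEvaluator is a hypothetical plain-Java stand-in that mirrors the method names from the thread. It does not extend or implement any Hive class and uses Integer instead of Hadoop writable types, so a real evaluator would differ in those respects.

```java
public class MaxIntEvaluatorSketch {

    /** Hypothetical stand-in for a max-integer evaluator; not a Hive class. */
    static class MaxIntEvaluator {
        private Integer result;

        /** Resets the aggregation state. */
        public void init() {
            result = null;
        }

        /** Consumes one input value; null inputs are ignored. */
        public boolean iterate(Integer value) {
            if (value == null) {
                return true;
            }
            if (result == null || value > result) {
                result = value;
            }
            return true;
        }

        /** Returns the partial aggregate computed by this instance. */
        public Integer terminatePartial() {
            return result;
        }

        /** Folds in a partial aggregate produced by another instance. */
        public boolean merge(Integer other) {
            return iterate(other);
        }

        /** Returns the final aggregate, or null if no non-null input was seen. */
        public Integer terminate() {
            return result;
        }
    }

    public static void main(String[] args) {
        // First instance: init, iterate(1), iterate(2), iterate(3), terminatePartial
        MaxIntEvaluator first = new MaxIntEvaluator();
        first.init();
        first.iterate(1);
        first.iterate(2);
        first.iterate(3);
        Integer partialOne = first.terminatePartial();

        // Second instance: init, iterate(4), iterate(2), terminatePartial
        MaxIntEvaluator second = new MaxIntEvaluator();
        second.init();
        second.iterate(4);
        second.iterate(2);
        Integer partialTwo = second.terminatePartial();

        // Final stage: init, merge each partial result, terminate
        MaxIntEvaluator combiner = new MaxIntEvaluator();
        combiner.init();
        combiner.merge(partialOne);
        combiner.merge(partialTwo);
        System.out.println("max = " + combiner.terminate()); // prints "max = 4"
    }
}
```

Calling the methods by hand like this also makes it easy to cover the null-argument cases mentioned earlier in the thread, for example iterate(null) or merge(null) on a freshly initialized instance.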

