hive-user mailing list archives

From Aurora Skarra-Gallagher <aur...@yahoo-inc.com>
Subject Re: UDAF documentation
Date Sat, 12 Mar 2011 03:00:12 GMT
I'll just keep responding to myself. ;)

I ended up figuring out how to do it. I just used junit and called init, iterate, terminatePartial,
etc. directly from inside the unit test. Once you know the typical flow of function calls (as I
mentioned below), the other main gotcha is making sure to create a new UDAF object for each instance.
For example, the sequence I described below needs three separate UDAF instances.
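
For anyone following along, here is a minimal sketch of that kind of test flow. It is plain Java with no Hive dependency: MaxIntEvaluator and runFlow are hypothetical stand-ins that only mirror the method names from the sequence quoted below. A real test would instantiate your own evaluator class with its actual argument types and use junit assertions rather than returning a value.

```java
public class MaxIntUdafFlowSketch {

    /** Hypothetical stand-in for a max-integer UDAF evaluator (not Hive's API). */
    static class MaxIntEvaluator {
        private Integer max;

        void init() {
            max = null;
        }

        // Called once per input row; null inputs are ignored.
        boolean iterate(Integer value) {
            if (value != null) {
                max = (max == null) ? value : Math.max(max, value);
            }
            return true;
        }

        // Partial result handed to the merging instance.
        Integer terminatePartial() {
            return max;
        }

        // Folds another instance's partial result into this one.
        boolean merge(Integer partial) {
            return iterate(partial);
        }

        Integer terminate() {
            return max;
        }
    }

    /** Replays the three-instance flow: two partial aggregations, then a merge. */
    static Integer runFlow() {
        MaxIntEvaluator first = new MaxIntEvaluator();
        first.init();
        first.iterate(1);
        first.iterate(2);
        first.iterate(3);
        Integer firstPartial = first.terminatePartial();

        MaxIntEvaluator second = new MaxIntEvaluator();
        second.init();
        second.iterate(4);
        second.iterate(2);
        Integer secondPartial = second.terminatePartial();

        // A third, fresh instance does the merging.
        MaxIntEvaluator merger = new MaxIntEvaluator();
        merger.init();
        merger.merge(firstPartial);
        merger.merge(secondPartial);
        return merger.terminate();
    }

    public static void main(String[] args) {
        System.out.println(runFlow());
    }
}
```

The same pattern extends to the null checks Pat mentions: call iterate(null) or merge(null) on a fresh instance and check that terminate() still returns something sensible.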

-Aurora

On Mar 11, 2011, at 5:02 PM, Aurora Skarra-Gallagher wrote:

> I'm looking for something like this, but for a UDAF instead of a UDF:
> http://svn.apache.org/repos/asf/hive/branches/branch-0.7/ql/src/test/org/apache/hadoop/hive/ql/udf/TestUDFDateDiff.java
> 
> -Aurora
> 
> On Mar 11, 2011, at 4:44 PM, Aurora Skarra-Gallagher wrote:
> 
>> Hi,
>> 
>> Did you actually call those functions directly from your unit tests? I'm looking
>> for examples of that working, but all I see reference to are tests to make sure the query
>> produces the expected output (rather than directly testing the UDAF).
>> 
>> -Aurora
>> 
>> On Mar 11, 2011, at 3:44 PM, Christopher, Pat wrote:
>> 
>>> Awesome, awesome.  That's what I had pieced together from Steve and Ed's emails.
>>> Glad to get confirmation on it.
>>> 
>>> It's also what I did for my unit testing.  I also called everything with null
>>> arguments to make sure those got handled gracefully.
>>> 
>>> Pat
>>> 
>>> -----Original Message-----
>>> From: Aurora Skarra-Gallagher [mailto:aurora@yahoo-inc.com] 
>>> Sent: Friday, March 11, 2011 3:40 PM
>>> To: user@hive.apache.org
>>> Cc: Steven Wong
>>> Subject: Re: UDAF documentation
>>> 
>>> Hadoop: The Definitive Guide has a good section on this. Chapter 12: Hive: User
>>> Defined Functions. It has a diagram that shows how things are called and when. The example
>>> I'm looking at shows this sequence:
>>> 
>>> (first instance)
>>> init()
>>> iterate(1)
>>> iterate(2)
>>> iterate(3)
>>> terminatePartial()
>>> 
>>> (second instance)
>>> init()
>>> iterate(4)
>>> iterate(2)
>>> terminatePartial()
>>> 
>>> (then)
>>> init()
>>> merge(3)
>>> merge(4)
>>> terminate()
>>> 
>>> The UDAF being described is a max integer function, hence the merge ending up
>>> with the highest integer from each instance.
>>> 
>>> -Aurora
>>> 
>>> On Mar 11, 2011, at 9:54 AM, Christopher, Pat wrote:
>>> 
>>>> Ahh, perfect.  The docs don't agree terribly well but the case study is great.
>>>> The context for when merge() gets called was not clear to me.
>>>> 
>>>> Thanks guys!
>>>> 
>>>> Pat
>>>> 
>>>> -----Original Message-----
>>>> From: Steven Wong [mailto:swong@netflix.com] 
>>>> Sent: Thursday, March 10, 2011 6:24 PM
>>>> To: user@hive.apache.org
>>>> Cc: Christopher, Pat
>>>> Subject: RE: UDAF documentation
>>>> 
>>>> Take a look at http://wiki.apache.org/hadoop/Hive/GenericUDAFCaseStudy, in
>>>> case you haven't found it already.
>>>> 
>>>> 
>>>> -----Original Message-----
>>>> From: Edward Capriolo [mailto:edlinuxguru@gmail.com] 
>>>> Sent: Thursday, March 10, 2011 6:18 PM
>>>> To: user@hive.apache.org
>>>> Cc: Christopher, Pat
>>>> Subject: Re: UDAF documentation
>>>> 
>>>> On Thu, Mar 10, 2011 at 8:27 PM, Christopher, Pat
>>>> <patrick.christopher@hp.com> wrote:
>>>>> Hi Guys,
>>>>> 
>>>>> I'm writing a UDAF to run against hive 0.5 or hive 0.7.  The documentation I
>>>>> can find says to implement UDAFEvaluator and ensure that you implement
>>>>> init(), aggregate() and evaluate().  However, all of the examples I can
>>>>> find implement init(), iterate(), merge(), terminatePartial() and
>>>>> terminate().
>>>>> 
>>>>> 
>>>>> 
>>>>> What's the difference, and where can I find the documentation on how to write
>>>>> a UDAF?
>>>>> 
>>>>> 
>>>>> 
>>>>> Thanks,
>>>>> 
>>>>> Pat
>>>> 
>>>> At times the documentation may lag behind the code. I would check out
>>>> the hive source code for the version you are working with and base
>>>> your work on other already existing UDAFs that are similar.
>>>> 
>>>> Edward
>>>> 
>>> 
>> 
> 

