hive-user mailing list archives

From Aurora Skarra-Gallagher <aur...@yahoo-inc.com>
Subject Re: UDAF documentation
Date Sat, 12 Mar 2011 14:09:59 GMT
No problem. Yeah, I called init first for each instance.

-Aurora

On Mar 11, 2011, at 11:35 PM, "Christopher, Pat" <patrick.christopher@hp.com> wrote:

> Hey sorry, I was moving house all evening and driving around town :)
> 
> I did not automate my unit tests; I created a small frame app to test each function and make sure it responded appropriately.  Good to know junit will do it.
> 
> For your other question, did you include a call to init() in your UDAF's constructor? I didn't see it in the code you provided.
> 
> Pat
> 
> 
> 
> -- Sent from my Palm Pre
> 
> ________________________________
> On Mar 11, 2011 7:01 PM, Aurora Skarra-Gallagher <aurora@yahoo-inc.com> wrote:
> 
> I'll just keep responding to myself. ;)
> 
> I ended up figuring out how to do it. I just used junit and called init, iterate, terminatePartial, etc. from inside the unit test. Once you know the typical flow of function calls (as I mentioned below), the other main gotcha is making sure to have a new UDAF object for each instance. For instance, in my example below, there would be three separate UDAF instances.
> 
> -Aurora
> 
> On Mar 11, 2011, at 5:02 PM, Aurora Skarra-Gallagher wrote:
> 
>> I'm looking for something like this, but for a UDAF instead of a UDF:
>> http://svn.apache.org/repos/asf/hive/branches/branch-0.7/ql/src/test/org/apache/hadoop/hive/ql/udf/TestUDFDateDiff.java
>> 
>> -Aurora
>> 
>> On Mar 11, 2011, at 4:44 PM, Aurora Skarra-Gallagher wrote:
>> 
>>> Hi,
>>> 
>>> Did you actually call those functions directly from your unit tests? I'm looking for examples of that working, but all I see references to are tests that make sure the query produces the expected output (rather than directly testing the UDAF).
>>> 
>>> -Aurora
>>> 
>>> On Mar 11, 2011, at 3:44 PM, Christopher, Pat wrote:
>>> 
>>>> Awesome, awesome.  That's what I had pieced together from Steve and Ed's emails.  Glad to get confirmation on it.
>>>> 
>>>> It's also what I did for my unit testing.  I also called everything with null arguments to make sure those got handled gracefully.
>>>> 
>>>> Pat
>>>> 
>>>> -----Original Message-----
>>>> From: Aurora Skarra-Gallagher [mailto:aurora@yahoo-inc.com]
>>>> Sent: Friday, March 11, 2011 3:40 PM
>>>> To: user@hive.apache.org
>>>> Cc: Steven Wong
>>>> Subject: Re: UDAF documentation
>>>> 
>>>> Hadoop: The Definitive Guide has a good section on this. Chapter 12: Hive: User Defined Functions. It has a diagram that shows how things are called and when. The example I'm looking at shows this sequence:
>>>> 
>>>> (first instance)
>>>> init()
>>>> iterate(1)
>>>> iterate(2)
>>>> iterate(3)
>>>> terminatePartial()
>>>> 
>>>> (second instance)
>>>> init()
>>>> iterate(4)
>>>> iterate(2)
>>>> terminatePartial()
>>>> 
>>>> (then)
>>>> init()
>>>> merge(3)
>>>> merge(4)
>>>> terminate()
>>>> 
>>>> The UDAF being described is a max integer function, hence the merge ending up with the highest integer from each instance.
>>>> 
>>>> -Aurora
>>>> 
>>>> On Mar 11, 2011, at 9:54 AM, Christopher, Pat wrote:
>>>> 
>>>>> Ahh, perfect.  The docs don't agree terribly well but the case study is great.  The context for when merge() gets called was not clear to me.
>>>>> 
>>>>> Thanks guys!
>>>>> 
>>>>> Pat
>>>>> 
>>>>> -----Original Message-----
>>>>> From: Steven Wong [mailto:swong@netflix.com]
>>>>> Sent: Thursday, March 10, 2011 6:24 PM
>>>>> To: user@hive.apache.org
>>>>> Cc: Christopher, Pat
>>>>> Subject: RE: UDAF documentation
>>>>> 
>>>>> Take a look at http://wiki.apache.org/hadoop/Hive/GenericUDAFCaseStudy, in case you haven't found it already.
>>>>> 
>>>>> 
>>>>> -----Original Message-----
>>>>> From: Edward Capriolo [mailto:edlinuxguru@gmail.com]
>>>>> Sent: Thursday, March 10, 2011 6:18 PM
>>>>> To: user@hive.apache.org
>>>>> Cc: Christopher, Pat
>>>>> Subject: Re: UDAF documentation
>>>>> 
>>>>> On Thu, Mar 10, 2011 at 8:27 PM, Christopher, Pat
>>>>> <patrick.christopher@hp.com> wrote:
>>>>>> Hi Guys,
>>>>>> 
>>>>>> I'm writing a UDAF to run against hive 0.5 or hive 0.7.  The documentation I
>>>>>> can find says to implement UDAFEvaluator and ensure that you implement
>>>>>> init(), aggregate() and evaluate().  However, all of the examples I can
>>>>>> find implement init(), iterate(), merge(), terminatePartial() and
>>>>>> terminate().
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> What's the difference and where can I find the documentation on how to write
>>>>>> a UDAF?
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> Thanks,
>>>>>> 
>>>>>> Pat
>>>>> 
>>>>> At times the documentation may lag behind the code. I would check out
>>>>> the hive source code for the version you are working with and base
>>>>> your work on other already existing UDAFs that are similar.
>>>>> 
>>>>> Edward
>>>>> 
>>>> 
>>> 
>> 
> 
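
The call sequence quoted in this thread (two instances that iterate and return partial results, then a third that merges them) can be sketched in plain Java. This is only a minimal sketch, not Hive code: MaxIntEvaluator and UdafFlowSketch are hypothetical names, and the class deliberately does not extend Hive's UDAFEvaluator so that it runs without Hive on the classpath. Only the method names init, iterate, terminatePartial, merge and terminate come from the thread.

```java
// Hypothetical stand-in for a Hive UDAF evaluator. It mirrors the method
// names discussed in this thread but does NOT extend Hive's UDAFEvaluator,
// so it compiles and runs without Hive on the classpath.
class MaxIntEvaluator {
    private Integer max; // null means "no value seen yet"

    void init() {
        max = null;
    }

    // Called once per input row; null inputs are ignored.
    boolean iterate(Integer value) {
        if (value != null && (max == null || value > max)) {
            max = value;
        }
        return true;
    }

    // Partial result that is handed to the merging instance.
    Integer terminatePartial() {
        return max;
    }

    // Folds in a partial result; for max, a partial looks like an input row.
    boolean merge(Integer partial) {
        return iterate(partial);
    }

    Integer terminate() {
        return max;
    }
}

class UdafFlowSketch {
    // Runs init() and iterate() on a fresh evaluator, then returns its partial result.
    static Integer partialMax(Integer... rows) {
        MaxIntEvaluator evaluator = new MaxIntEvaluator();
        evaluator.init();
        for (Integer row : rows) {
            evaluator.iterate(row);
        }
        return evaluator.terminatePartial();
    }

    // Replays the sequence from the thread using three separate instances.
    static Integer runExampleFlow() {
        Integer first = partialMax(1, 2, 3); // first instance
        Integer second = partialMax(4, 2);   // second instance
        MaxIntEvaluator merger = new MaxIntEvaluator(); // third instance
        merger.init();
        merger.merge(first);
        merger.merge(second);
        return merger.terminate();
    }

    public static void main(String[] args) {
        System.out.println(runExampleFlow());           // prints 4
        System.out.println(partialMax((Integer) null)); // prints null
    }
}
```

A JUnit test could call the same methods in the same order and assert on the value returned by terminate(), using a fresh evaluator object for each instance and passing null arguments as well, as suggested in the thread.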
