From: Aurora Skarra-Gallagher
To: "user@hive.apache.org", "Christopher, Pat"
CC: Steven Wong
Date: Fri, 11 Mar 2011 16:44:19 -0800
Subject: Re: UDAF documentation

Hi,

Did you actually call those functions directly from your unit tests? I'm looking for examples of that working, but all I see references to are tests to make sure the query produces the expected output (rather than directly testing the UDAF).

-Aurora

On Mar 11, 2011, at 3:44 PM, Christopher, Pat wrote:

> Awesome, awesome. That's what I had pieced together from Steve and Ed's emails. Glad to get confirmation on it.
>
> It's also what I did for my unit testing. I also called everything with null arguments to make sure those got handled gracefully.
>
> Pat
>
> -----Original Message-----
> From: Aurora Skarra-Gallagher [mailto:aurora@yahoo-inc.com]
> Sent: Friday, March 11, 2011 3:40 PM
> To: user@hive.apache.org
> Cc: Steven Wong
> Subject: Re: UDAF documentation
>
> Hadoop: The Definitive Guide has a good section on this. Chapter 12: Hive: User Defined Functions. It has a diagram that shows how things are called and when. The example I'm looking at shows this sequence:
>
> (first instance)
> init()
> iterate(1)
> iterate(2)
> iterate(3)
> terminatePartial()
>
> (second instance)
> init()
> iterate(4)
> iterate(2)
> terminatePartial()
>
> (then)
> init()
> merge(3)
> merge(4)
> terminate()
>
> The UDAF being described is a max integer function, hence the merge ending up with the highest integer from each instance.
>
> -Aurora
>
> On Mar 11, 2011, at 9:54 AM, Christopher, Pat wrote:
>
>> Ahh, perfect. The docs don't agree terribly well but the case study is great. The context for when merge() gets called was not clear to me.
>>
>> Thanks guys!
>>
>> Pat
>>
>> -----Original Message-----
>> From: Steven Wong [mailto:swong@netflix.com]
>> Sent: Thursday, March 10, 2011 6:24 PM
>> To: user@hive.apache.org
>> Cc: Christopher, Pat
>> Subject: RE: UDAF documentation
>>
>> Take a look at http://wiki.apache.org/hadoop/Hive/GenericUDAFCaseStudy, in case you haven't found it already.
>>
>>
>> -----Original Message-----
>> From: Edward Capriolo [mailto:edlinuxguru@gmail.com]
>> Sent: Thursday, March 10, 2011 6:18 PM
>> To: user@hive.apache.org
>> Cc: Christopher, Pat
>> Subject: Re: UDAF documentation
>>
>> On Thu, Mar 10, 2011 at 8:27 PM, Christopher, Pat wrote:
>>> Hi Guys,
>>>
>>> I'm writing a UDAF to run against Hive 0.5 or Hive 0.7. The documentation I can find says to implement UDAFEvaluator and ensure that you implement init(), aggregate() and evaluate(). However, all of the examples I can find implement init(), iterate(), merge(), terminatePartial() and terminate().
>>>
>>> What's the difference, and where can I find the documentation on how to write a UDAF?
>>>
>>> Thanks,
>>>
>>> Pat
>>
>> At times the documentation may lag behind the code. I would check out the Hive source code for the version you are working with and base your work on other already existing UDAFs that are similar.
>>
>> Edward
>>
>
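
The call sequence quoted in this thread (init, iterate, terminatePartial, merge, terminate) can be replayed by calling the methods directly instead of running a query. What follows is only a minimal sketch in plain Java. MaxIntEvaluator and MaxIntLifecycleDemo are hypothetical names, the class borrows the method names from the thread but does not extend Hive's UDAFEvaluator, and it uses Integer where a real Hive evaluator would likely use a Writable type. It also covers the null-argument case Pat mentions.

```java
// Hypothetical stand-in for a Hive UDAF evaluator. It mirrors the method
// names discussed in the thread but does NOT extend Hive's UDAFEvaluator.
class MaxIntEvaluator {
    private Integer max; // null means "no value seen yet"

    // Reset state; called once per instance before any input arrives.
    void init() {
        max = null;
    }

    // Fold one input value into the running maximum; nulls are ignored.
    boolean iterate(Integer value) {
        if (value != null && (max == null || value > max)) {
            max = value;
        }
        return true;
    }

    // Partial result handed from one instance to the merging instance.
    Integer terminatePartial() {
        return max;
    }

    // Combine another instance's partial result; for max this is the
    // same operation as iterate.
    boolean merge(Integer partial) {
        return iterate(partial);
    }

    // Final result of the aggregation.
    Integer terminate() {
        return max;
    }
}

class MaxIntLifecycleDemo {
    // Replays the sequence quoted in the thread and returns the result.
    static Integer run() {
        MaxIntEvaluator first = new MaxIntEvaluator();
        first.init();
        first.iterate(1);
        first.iterate(2);
        first.iterate(3);
        Integer partial1 = first.terminatePartial();

        MaxIntEvaluator second = new MaxIntEvaluator();
        second.init();
        second.iterate(4);
        second.iterate(2);
        Integer partial2 = second.terminatePartial();

        MaxIntEvaluator merger = new MaxIntEvaluator();
        merger.init();
        merger.merge(partial1);
        merger.merge(partial2);
        return merger.terminate();
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

With a real evaluator class on the classpath, the same pattern should carry over to a unit test: construct the evaluator, drive the methods in this order, and assert on what terminate() returns.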