Return-Path: X-Original-To: apmail-hive-dev-archive@www.apache.org Delivered-To: apmail-hive-dev-archive@www.apache.org Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id 157089689 for ; Tue, 7 Aug 2012 17:36:49 +0000 (UTC) Received: (qmail 73793 invoked by uid 500); 7 Aug 2012 17:36:47 -0000 Delivered-To: apmail-hive-dev-archive@hive.apache.org Received: (qmail 73743 invoked by uid 500); 7 Aug 2012 17:36:47 -0000 Mailing-List: contact dev-help@hive.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: dev@hive.apache.org Delivered-To: mailing list dev@hive.apache.org Received: (qmail 73729 invoked by uid 99); 7 Aug 2012 17:36:47 -0000 Received: from nike.apache.org (HELO nike.apache.org) (192.87.106.230) by apache.org (qpsmtpd/0.29) with ESMTP; Tue, 07 Aug 2012 17:36:47 +0000 X-ASF-Spam-Status: No, hits=1.5 required=5.0 tests=HTML_MESSAGE,RCVD_IN_DNSWL_LOW,SPF_PASS X-Spam-Check-By: apache.org Received-SPF: pass (nike.apache.org: domain of kulkarni.swarnim@gmail.com designates 209.85.220.176 as permitted sender) Received: from [209.85.220.176] (HELO mail-vc0-f176.google.com) (209.85.220.176) by apache.org (qpsmtpd/0.29) with ESMTP; Tue, 07 Aug 2012 17:36:38 +0000 Received: by vcbfl11 with SMTP id fl11so5406585vcb.35 for ; Tue, 07 Aug 2012 10:36:18 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:from:date:message-id:subject:to :cc:content-type; bh=2b27dNmc8zsYLgSvKIqwb2JyLNBOA6pwLLickRG+5ds=; b=DoMWuhFRh7ZB2dH12zOpWuf0liVRvDdMToGCH/+KsWPrd8fd+RimiwuoKU1GfbIQ5I cRMzsoz8eK1l1ufei47x+FgmZcNcGnGLWO46MWqOUP6DH77AM4+OKdb1HGu+VEbT7Jh2 DeGwJVayQxjYjFfLu+uTP2jCkKzFgCt9YExDOfrTn6OFWfuqwc+VxBeL4U/MAoxBRWMF yKZjvgBn6j/ekBMfgpzyZyBbjzLqIBerRRAbAVTrCHpoiurET6Xrp1YrjBJJGofU9Y82 fqDwiJmQRdYbyDQ1Yazu8ZBnHHI3DJw9BXNU8N69hYwPb2UZy3/curbFAOMiwypdASqy fWUA== Received: by 10.52.38.65 with SMTP id e1mr5559490vdk.110.1344360977882; Tue, 07 Aug 2012 10:36:17 -0700 (PDT) MIME-Version: 1.0 Received: by 10.58.227.201 with HTTP; Tue, 7 Aug 2012 10:35:57 -0700 (PDT) In-Reply-To: References: From: "kulkarni.swarnim@gmail.com" Date: Tue, 7 Aug 2012 12:35:57 -0500 Message-ID: Subject: Re: Custom UserDefinedFunction in Hive To: user@hive.apache.org Cc: dev@hive.apache.org Content-Type: multipart/alternative; boundary=bcaec51d2a38895e8204c6b06ee9 --bcaec51d2a38895e8204c6b06ee9 Content-Type: text/plain; charset=windows-1252 Content-Transfer-Encoding: quoted-printable Have you tried using EXPLAIN[1] on your query? I usually like to use that to get a better understanding of what my query is actually doing and debugging at other times. [1] https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Explain On Tue, Aug 7, 2012 at 12:20 PM, Raihan Jamal wrote= : > Hi Jan, > > > I figured that out, it is working fine for me now. The only question I > have is, if I am doing like this- > > > > SELECT * FROM REALTIME where dt=3D yesterdaydate('yyyyMMdd') LIMIT 10; > > > > Then the above query will be evaluated as below right? > > > > SELECT * FROM REALTIME where dt=3D =9120120806=92 LIMIT 10; > > > > So that means it will look for data in the corresponding dt partition *(2= 0120806) > *only right as above table is partitioned on dt column ? And it will not > scan the whole table right?** > > > > *Raihan Jamal* > > > > On Mon, Aug 6, 2012 at 10:56 PM, Jan Dolin=E1r wrot= e: > >> Hi Jamal, >> >> Check if the function really returns what it should and that your data >> are really in yyyyMMdd format. You can do this by simple query like this= : >> >> SELECT dt, yesterdaydate('yyyyMMdd') FROM REALTIME LIMIT 1; >> >> I don't see anything wrong with the function itself, it works well for m= e >> (although I tested it in hive 0.7.1). The only thing I would change abou= t >> it would be to optimize it by calling 'new' only at the time of >> construction and reusing the object when the function is called, but tha= t >> should not affect the functionality at all. >> >> Best regards, >> Jan >> >> >> >> >> On Tue, Aug 7, 2012 at 3:39 AM, Raihan Jamal wrot= e: >> >>> *Problem* >>> >>> I created the below UserDefinedFunction to get the yesterday's day in >>> the format I wanted as I will be passing the format into this below met= hod >>> from the query. >>> >>> >>> >>> *public final class YesterdayDate extends UDF {* >>> >>> * * >>> >>> * public String evaluate(final String format) { * >>> >>> * DateFormat dateFormat =3D new >>> SimpleDateFormat(format); * >>> >>> * Calendar cal =3D Calendar.getInstance(= );* >>> >>> * cal.add(Calendar.DATE, -1); * >>> >>> * return >>> dateFormat.format(cal.getTime()).toString(); * >>> >>> * } * >>> >>> *}* >>> >>> >>> >>> >>> >>> So whenever I try to run the query like below by adding the jar to >>> classpath and creating the temporary function yesterdaydate, I always g= et >>> zero result back- >>> >>> >>> >>> hive> create temporary function *yesterdaydate* as >>> 'com.example.hive.udf.YesterdayDate'; >>> >>> OK >>> >>> Time taken: 0.512 seconds >>> >>> >>> >>> Below is the query I am running- >>> >>> >>> >>> *hive> SELECT * FROM REALTIME where dt=3D yesterdaydate('yyyyMMdd') LIM= IT >>> 10;* >>> >>> *OK* >>> >>> * * >>> >>> And I always get zero result back but the data is there in that table >>> for Aug 5th.** >>> >>> >>> >>> What wrong I am doing? Any suggestions will be appreciated. >>> >>> >>> >>> >>> >>> NOTE:- As I am working with Hive 0.6 so it doesn=92t support variable >>> substitution thing, so I cannot use hiveconf here and the above table h= as >>> been partitioned on dt(date) column.** >>> >> >> > --=20 Swarnim --bcaec51d2a38895e8204c6b06ee9--