Date: Thu, 20 Sep 2012 16:00:05 +0200
From: Robin Verlangen
To: user@hive.apache.org, Bejoy KS
Reply-To: user@hive.apache.org
Subject: Re: Hive ignoring buckets when using dynamic where
In-Reply-To: <1348148635.37561.YahooMailNeo@web121202.mail.ne1.yahoo.com>
References: <1348148635.37561.YahooMailNeo@web121202.mail.ne1.yahoo.com>
Mailing-List: contact user-help@hive.apache.org; run by ezmlm
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary=20cf30780dc85be0c504ca228a80

--20cf30780dc85be0c504ca228a80
Content-Type: text/plain; charset=ISO-8859-1

Hi Bejoy,

Thank you for your reply. Is there any way to fix my problem? I want a
query with a dynamic date range: today only, or in some cases from
(now - x days) until now.

Best regards,

Robin Verlangen
*Software engineer*

W http://www.robinverlangen.nl
E robin@us2.nl

Disclaimer: The information contained in this message and attachments is
intended solely for the attention and use of the named addressee and may be
confidential. If you are not the intended recipient, you are reminded that
the information remains the property of the sender. You must not use,
disclose, distribute, copy, print or rely on this e-mail. If you have
received this message in error, please contact the sender immediately and
irrevocably delete this message and any copies.

2012/9/20 Bejoy KS <bejoy_ks@yahoo.com>

> Hi Robin
>
> The expression 'bdate=to_date(unix_timestamp())' is evaluated at query
> runtime, but the set of data a query should process is determined up
> front, before the map reduce jobs are executed. That is why the query
> runs over the whole data set.
>
> When you provide 'bdate='2012-09-01'', the Hive parser knows from the
> start which partitions should be taken into account, so the query runs
> only on the required partitions and not on the whole data set.
>
> To add on, it is not the buckets that are considered in the where clause
> here, but the partitions.
>
> Regards,
> Bejoy KS
>
> ------------------------------
> *From:* Robin Verlangen
> *To:* user@hive.apache.org
> *Sent:* Thursday, September 20, 2012 5:06 PM
> *Subject:* Hive ignoring buckets when using dynamic where
>
> Hi there,
>
> We're working on some queries that use buckets to improve performance by
> roughly 1000x. However, we ran into a problem. When we use a fixed,
> hard-coded date it works fine:
>
> SELECT * FROM standard_feed WHERE bdate='2012-09-01'
> *Starts a job with 6 mappers, 2 reducers*
>
> When we use a dynamic date:
>
> SELECT * FROM standard_feed WHERE bdate=to_date(unix_timestamp())
> *Starts a job with 1000 mappers, 2 reducers*
>
> What's the problem here? Shouldn't to_date of the current timestamp be
> equal to a normal fixed date? Does anyone have a solution?
>
> Best regards,
>
> Robin Verlangen
> *Software engineer*

--20cf30780dc85be0c504ca228a80
Content-Type: text/html; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable

Hi Bejoy,
--20cf30780dc85be0c504ca228a80--
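
One possible workaround that follows from Bejoy's explanation is to
compute the date on the client and embed it in the query as a literal, so
that the filter on bdate is already a constant when Hive decides which
partitions to read. The sketch below is only an illustration under
assumptions not confirmed in the thread: that bdate is a string partition
column in yyyy-MM-dd format (as the hard-coded query suggests) and that
the query is generated by a script. build_feed_query is a hypothetical
helper name, not part of Hive.

```python
from datetime import date, timedelta


def build_feed_query(today: date, days_back: int = 0) -> str:
    """Return a HiveQL query whose bdate filter contains only literals.

    Hypothetical helper: assumes bdate is a yyyy-MM-dd string partition
    column of standard_feed, which the thread does not state explicitly.
    """
    if days_back < 0:
        raise ValueError("days_back must be non-negative")

    end = today.isoformat()  # e.g. '2012-09-20'
    if days_back == 0:
        # Single day: same shape as the hard-coded query in the thread.
        return f"SELECT * FROM standard_feed WHERE bdate='{end}'"

    # Range from (today - days_back) up to and including today.
    start = (today - timedelta(days=days_back)).isoformat()
    return (
        "SELECT * FROM standard_feed "
        f"WHERE bdate>='{start}' AND bdate<='{end}'"
    )


if __name__ == "__main__":
    # Print the generated query for the last 7 days.
    print(build_feed_query(date.today(), days_back=7))
```

The generated string could then be handed to Hive, for example with
`hive -e`. Whether the range form is pruned as well as the single-day
form depends on bdate really being the partition column, so it is worth
checking the mapper count on a small range first.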