Return-Path: Delivered-To: apmail-hadoop-hive-user-archive@minotaur.apache.org Received: (qmail 57739 invoked from network); 20 Aug 2010 15:58:51 -0000 Received: from unknown (HELO mail.apache.org) (140.211.11.3) by 140.211.11.9 with SMTP; 20 Aug 2010 15:58:51 -0000 Received: (qmail 84884 invoked by uid 500); 20 Aug 2010 15:58:50 -0000 Delivered-To: apmail-hadoop-hive-user-archive@hadoop.apache.org Received: (qmail 84832 invoked by uid 500); 20 Aug 2010 15:58:50 -0000 Mailing-List: contact hive-user-help@hadoop.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: hive-user@hadoop.apache.org Delivered-To: mailing list hive-user@hadoop.apache.org Received: (qmail 84824 invoked by uid 99); 20 Aug 2010 15:58:50 -0000 Received: from Unknown (HELO nike.apache.org) (192.87.106.230) by apache.org (qpsmtpd/0.29) with ESMTP; Fri, 20 Aug 2010 15:58:50 +0000 X-ASF-Spam-Status: No, hits=2.2 required=10.0 tests=FREEMAIL_FROM,HTML_MESSAGE,RCVD_IN_DNSWL_NONE,SPF_PASS,T_TO_NO_BRKTS_FREEMAIL X-Spam-Check-By: apache.org Received-SPF: pass (nike.apache.org: domain of ronakbaps@gmail.com designates 74.125.83.48 as permitted sender) Received: from [74.125.83.48] (HELO mail-gw0-f48.google.com) (74.125.83.48) by apache.org (qpsmtpd/0.29) with ESMTP; Fri, 20 Aug 2010 15:58:29 +0000 Received: by gwj19 with SMTP id 19so1759062gwj.35 for ; Fri, 20 Aug 2010 08:58:08 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:received:received:in-reply-to :references:date:message-id:subject:from:to:content-type; bh=NDOie7vXIXoCKtgQWH1vDuaZh4IaDyZFH1JVNY5AGXA=; b=nKXBH5YqcYZvcgauCEOClXMdz47xJyEKL+byd94qq7XS8CsxRSDo7WSWT18Dqc7Tdw KN98x8+JJdvslILfIW+0u0Po3ejKGADl6hZfSK9y+5VyQLeJxOQTdWIWkuBMQ3zcwXTu k5e68DXTmEZSk3pIZXJXHO2mnzVGe4GVXXDcA= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :content-type; b=EDTck9nx5ofpSvL0r78UB9WG79VAbnc6hNqS/U/yoyTq6BD8htvp7ClCmjkzXWpnYg ZSAJ8gs6Iio2n4uVXWnEHFKxFJXZV0X7cUsFUrh6Mr5Q65yRADZhciVrE8wBKd/Pe0so KiC8QrDYfaVQVRyuERRydLnCN3e4m0iQ61C+w= MIME-Version: 1.0 Received: by 10.90.90.5 with SMTP id n5mr1288459agb.153.1282319888289; Fri, 20 Aug 2010 08:58:08 -0700 (PDT) Received: by 10.231.117.197 with HTTP; Fri, 20 Aug 2010 08:58:08 -0700 (PDT) In-Reply-To: References: Date: Fri, 20 Aug 2010 08:58:08 -0700 Message-ID: Subject: Re: question - how to handle OR with HIVe From: Ronak Bhatt To: hive-user@hadoop.apache.org Content-Type: multipart/alternative; boundary=0016361e88166e145b048e435db3 X-Virus-Checked: Checked by ClamAV on apache.org --0016361e88166e145b048e435db3 Content-Type: text/plain; charset=ISO-8859-1 This OR and AND are for use in select or condition..for example, select (a OR B) as end_result_trur_or_false from table where id > 20; What I'm looking for is, select sum(amount) from table where id > 20 and (a > 5 OR b < 10) as you can see, I want to use OR in where condition...I could possibly use UNION ALL to simulate the OR condition, but it is lot of writing (duplicating the code) and also not efficient from performance perspective as the table will be scanned multiple times for each select of the union... hope this helps to clarify what kind of OR I'm looking for.... *thanks, ronak* * * *408 504 4847* *My Blog : http://ronakbaps.posterous.com* * * * * On Fri, Aug 20, 2010 at 8:50 AM, Alexey Kovyrin wrote: > It definitely does support it. > http://wiki.apache.org/hadoop/Hive/LanguageManual/UDF#Logical_Operators > > On Fri, Aug 20, 2010 at 11:37 AM, Ronak Bhatt wrote: > > I read somewhere that hive does NOT support OR condition....that's what I > > was wondering as to how others handle the situation. > > > > > > > > On Fri, Aug 20, 2010 at 8:30 AM, Jacob R Rideout < > apache@jacobrideout.net> > > wrote: > >> > >> On Fri, Aug 20, 2010 at 8:37 AM, Ronak Bhatt > wrote: > >> > Hi - I've a following SQL query. What is the way to convert it into > HIVE > >> > runnable format - > >> > Select a.id, count(b.url) > >> > from a, b > >> > where a.id = b.id > >> > and (b.url like '%google.com%' or b.url like '%h.google.com%' or > b.url > >> > like > >> > '%bing%') > >> > and a.exec_date = date(now()) ; ----- get today's data > >> > group by a.id > >> > > >> > thanks, ronak > >> > > >> > >> I think something like: > >> > >> SELECT a.id, COUNT(distinct b.url) > >> FROM a > >> JOIN b on a.id = b.id > >> WHERE > >> (b.url LIKE '%google.com%' OR b.url LIKE '%h.google.com%' OR > >> b.url LIKE '%bing%') > >> AND a.exec_date = from_unixtime(unix_timestamp()) > >> GROUP BY a.id; > >> > >> should work, but I didn't test it. You will probably need to format > >> the data string to match your data - hive doesn't really have a date > >> type as such - but instead has date manipulation functions for > >> strings. > >> > >> Take a look at > >> http://wiki.apache.org/hadoop/Hive/LanguageManual/UDF#Date_Functions > >> > >> > >> Jacob Rideout > > > > > > > > -- > Alexey Kovyrin > http://kovyrin.net/ > --0016361e88166e145b048e435db3 Content-Type: text/html; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable This OR and AND are for use in select or condition..for example,

select (a OR B) as end_result_trur_or_false
from table= =A0
where id > 20;

What I'm looki= ng for is,

select sum(amount)=A0
from table=A0
where id > 20 and=A0
(a > 5 =A0OR b < 10)
as you can see, I want to use OR in where condition...I could p= ossibly use UNION ALL to simulate the OR condition, but it is lot of writin= g (duplicating the code) and also not efficient from performance perspectiv= e as the table will be scanned multiple times for each select of the union.= ..

hope this helps to clarify what kind of OR I'm look= ing for....



th= anks, ronak

408 504 4847





On Fri, Aug 20, 2010 at 8:50 AM, Alexey = Kovyrin <alexey@= kovyrin.net> wrote:
It definitely does support it.
http://wiki.apache.org/hadoop/Hive/LanguageManua= l/UDF#Logical_Operators

On Fri, Aug 20, 2010 at 11:37 AM, Ronak Bhatt <ronakbaps@gmail.com> wrote:
> I read somewhere that hive does NOT support OR condition....that's= what I
> was wondering as to how others handle the situation.
>
>
>
> On Fri, Aug 20, 2010 at 8:30 AM, Jacob R Rideout <apache@jacobrideout.net>
> wrote:
>>
>> On Fri, Aug 20, 2010 at 8:37 AM, Ronak Bhatt <ronakbaps@gmail.com> wrote:
>> > Hi - I've a following SQL query. What is the way to conve= rt it into HIVE
>> > runnable format -
>> > Select a.id, co= unt(b.url)
>> > from a, b
>> > where a.id =3D = b.id
>> > and (b.url =A0like '%google.com%' or b.url like '%h.google.com%' or b.url
>> > like
>> > '%bing%')
>> > and a.exec_date =3D date(now()) ; =A0----- get today's da= ta
>> > group by a.id >> >
>> > thanks, ronak
>> >
>>
>> I think something like:
>>
>> SELECT a.id, COUNT(d= istinct b.url)
>> FROM a
>> JOIN b on a.id =3D <= a href=3D"http://b.id" target=3D"_blank">b.id
>> WHERE
>> =A0 =A0(b.url =A0LIKE '%google.com%' OR b.url LIKE '%h.google.com%' OR
>> b.url LIKE '%bing%')
>> AND a.exec_date =3D from_unixtime(unix_timestamp())
>> GROUP BY a.id;
>>
>> should work, but I didn't test it. You will probably need to f= ormat
>> the data string to match your data - hive doesn't really have = a date
>> type as such - but instead has date manipulation functions for
>> strings.
>>
>> Take a look at
>> http://wiki.apache.org/hadoop/Hive/Languag= eManual/UDF#Date_Functions
>>
>>
>> Jacob Rideout
>
>



--
Alexey Kovyrin
http://kovyrin.net/

--0016361e88166e145b048e435db3--