Return-Path: X-Original-To: apmail-hive-user-archive@www.apache.org Delivered-To: apmail-hive-user-archive@www.apache.org Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id 2905E18FC1 for ; Tue, 1 Mar 2016 04:29:18 +0000 (UTC) Received: (qmail 73523 invoked by uid 500); 1 Mar 2016 04:29:16 -0000 Delivered-To: apmail-hive-user-archive@hive.apache.org Received: (qmail 73453 invoked by uid 500); 1 Mar 2016 04:29:16 -0000 Mailing-List: contact user-help@hive.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: user@hive.apache.org Delivered-To: mailing list user@hive.apache.org Received: (qmail 73443 invoked by uid 99); 1 Mar 2016 04:29:16 -0000 Received: from pnap-us-west-generic-nat.apache.org (HELO spamd2-us-west.apache.org) (209.188.14.142) by apache.org (qpsmtpd/0.29) with ESMTP; Tue, 01 Mar 2016 04:29:16 +0000 Received: from localhost (localhost [127.0.0.1]) by spamd2-us-west.apache.org (ASF Mail Server at spamd2-us-west.apache.org) with ESMTP id DDCD21A01E5 for ; Tue, 1 Mar 2016 04:29:15 +0000 (UTC) X-Virus-Scanned: Debian amavisd-new at spamd2-us-west.apache.org X-Spam-Flag: NO X-Spam-Score: 2.804 X-Spam-Level: ** X-Spam-Status: No, score=2.804 tagged_above=-999 required=6.31 tests=[DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, HK_RANDOM_ENVFROM=0.626, HK_RANDOM_FROM=0.999, HTML_MESSAGE=2, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H3=-0.01, RCVD_IN_MSPIKE_WL=-0.01, SPF_PASS=-0.001] autolearn=disabled Authentication-Results: spamd2-us-west.apache.org (amavisd-new); dkim=pass (2048-bit key) header.d=gmail.com Received: from mx2-lw-us.apache.org ([10.40.0.8]) by localhost (spamd2-us-west.apache.org [10.40.0.9]) (amavisd-new, port 10024) with ESMTP id I6qtR3SrTCsH for ; Tue, 1 Mar 2016 04:29:14 +0000 (UTC) Received: from mail-lb0-f173.google.com (mail-lb0-f173.google.com [209.85.217.173]) by mx2-lw-us.apache.org (ASF Mail Server at mx2-lw-us.apache.org) with ESMTPS id 0F1485F642 for ; Tue, 1 Mar 2016 04:29:14 +0000 (UTC) Received: by mail-lb0-f173.google.com with SMTP id ed16so14106781lbb.0 for ; Mon, 29 Feb 2016 20:29:13 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to; bh=c9TfJB4snv6taRa+TOuH5lEr775LyfbOrFKItG5JeH8=; b=t7shr+5bZ9n7gKhomOXjDHHbDy8bt8ylS+izeDwEwwh1VZ669jOZK6eK3l+xs23uMX sjmYEdhZYo8i6+a5h2GMN+T/INjIi+Hx8lkQ5egvcbXxkMHkHf95mf3Cf+vgf6jZIcMT wlOYPVbK46EJMf9y1F5EmfdDEvVBjQx5XsBwQ0HffO4ndb51cvfY8oCqqTLE0+7RZbJ1 7+yPs14NlPTmoJodwTjb1AR/2vQsbDcWBcNTmB+xW8alf6WJi+j23t6fKetNJ5WU4rCE 2TAwxwVe2W0DM1Jbzk88dL33zUAlMrWXFaQjvRUwstWe/8ILKgXQedCfzjoUnkhTHMuh G2DQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:mime-version:in-reply-to:references:date :message-id:subject:from:to; bh=c9TfJB4snv6taRa+TOuH5lEr775LyfbOrFKItG5JeH8=; b=NaM1P+n/CSkZkPoFmFzbqewGiwwiHgzK9j0kP9G1QpWid2Xu/5lldReWySk+eYa3yd 4w9OAmqdpLGzcO1qkOsMeQjr+xSpIJxuJ7k3lqzQnqJjrBqRqLACQ3nCyhc9i0+/RvNk LebC8E6I59TYjSq+mWWRpT4lD9gQRpZe6Dp05JRtXKR3iXPGtbZLgpDHaYwnSNUnpaXu UR8YlWxHSxVfRmxjWLHQSerKaVxlIofsj0UF5sHDz3bahJPhOePJ+Yd4w/u+9H+o6lwQ /0D/OTDb/0z8dY1bgqL4I/0HxFhDr6tJ3xOwu+tBCHHclsD3A7vaB1nXAuoqyu/hrKz4 PK5A== X-Gm-Message-State: AD7BkJKLBoNQEVr6atGz0bEC3gq2S9ykXhJqpxku2ebhDdPAY8UWwBmiawKepGdan9+vYUTv5JmC1s3om9vzIQ== MIME-Version: 1.0 X-Received: by 10.112.125.102 with SMTP id mp6mr6898636lbb.6.1456806547078; Mon, 29 Feb 2016 20:29:07 -0800 (PST) Received: by 10.25.141.198 with HTTP; Mon, 29 Feb 2016 20:29:07 -0800 (PST) In-Reply-To: <061b01d17372$310a9c60$931fd520$@microfun.com> References: <061b01d17372$310a9c60$931fd520$@microfun.com> Date: Tue, 1 Mar 2016 12:29:07 +0800 Message-ID: Subject: Fwd: Wrong column is picked in HIVE 2.0.0 + TEZ 0.8.2 left join From: Jeff Zhang To: user@tez.apache.org, user@hive.apache.org Content-Type: multipart/alternative; boundary=089e01161ab2bec723052cf53456 --089e01161ab2bec723052cf53456 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable + hive mail list, more likely hive issue. ---------- Forwarded message ---------- From: GAO Chi Date: Tue, Mar 1, 2016 at 12:24 PM Subject: Wrong column is picked in HIVE 2.0.0 + TEZ 0.8.2 left join To: user@tez.apache.org Hi all, We encountered a strange behavior after upgrading to HIVE 2.0.0 + TEZ 0.8.2. I simplified our query to this: SELECT a.key, a.a_one, b.b_one, a.a_zero, b.b_zero FROM ( SELECT 11 key, 0 confuse_you, 1 a_one, 0 a_zero ) a LEFT JOIN ( SELECT 11 key, 0 confuse_you, 1 b_one, 0 b_zero ) b ON a.key =3D b.key ; Above query generates this unexpected result: INFO : Status: Running (Executing on YARN cluster with App id application_1456723490535_3653) INFO : Map 1: 0/1 Map 2: 0/1 INFO : Map 1: 0/1 Map 2: 0(+1)/1 INFO : Map 1: 0(+1)/1 Map 2: 0(+1)/1 INFO : Map 1: 0(+1)/1 Map 2: 1/1 INFO : Map 1: 1/1 Map 2: 1/1 INFO : Completed executing command(queryId=3Dhive_20160301115630_0a0dbee5-ba4b-45e7-b027-085f655640fd)= ; Time taken: 10.225 seconds INFO : OK +--------+----------+----------+-----------+-----------+--+ | a.key | a.a_one | b.b_one | a.a_zero | b.b_zero | +--------+----------+----------+-----------+-----------+--+ | 11 | 1 | 0 | 0 | 1 | +--------+----------+----------+-----------+-----------+--+ If you change the constant value of subquery-b=E2=80=99s confuse_you column= from 0 to 2, the problem disappears. The plan returned from EXPLAIN shows the incorrect one is picking _col1 and _col2, while the correct one is picking _col2 and _col3 form sub query b. Seems it cannot distinguish 2 columns with same constant value? Anyone encountered similar problem? Thanks! Chi --=20 Best Regards Jeff Zhang --089e01161ab2bec723052cf53456 Content-Type: text/html; charset=UTF-8 Content-Transfer-Encoding: quoted-printable

+ hive mail list, =C2=A0 more likely hive i= ssue.


---------- Forwarde= d message ----------
From: GAO Chi <chi.gao@microfun= .com>
Date: Tue, Mar 1, 2016 at 12:24 PM
Subject: Wrong= column is picked in HIVE 2.0.0 + TEZ 0.8.2 left join
To: user@tez.apache.org


Hi all,=

=C2=A0

We encountered a strange behavior after upgrading to HIVE 2.= 0.0 + TEZ 0.8.2. =C2=A0

=C2= =A0

I simplified our query to this:=

=C2=A0

SELECT

=C2=A0 a.key,

=C2=A0 a.a_one,

=C2=A0 b.b_one,

=C2= =A0 a.a_zero,

=C2=A0 b.b_zero

FROM

(

=C2=A0=C2=A0=C2=A0 SELECT=

=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 11 key= ,

=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 0 = confuse_you,

=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0 1 a_one,

=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0 0 a_zero

) a=

LEFT JOIN

(

=C2=A0=C2=A0=C2=A0 SELE= CT

=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 1= 1 key,

=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0 0 confuse_you,

=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0 1 b_one,

=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0 0 b_zero

) b =

ON a.key =3D b.key

;

<= /u>=C2=A0

=C2=A0<= /span>

Above query generates this unexpected = result:

=C2=A0=

INFO=C2=A0 : Status: Running (Exe= cuting on YARN cluster with App id application_1456723490535_3653)

=C2=A0

<= p class=3D"MsoNormal">INFO=C2=A0 : Map 1: 0/1=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0 Map 2: 0/1

INFO= =C2=A0 : Map 1: 0/1=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 Map 2: 0(+1)/1<= /u>

INFO=C2=A0 : Map 1: 0(+1)/1=C2=A0 = Map 2: 0(+1)/1

INFO=C2= =A0 : Map 1: 0(+1)/1=C2=A0 Map 2: 1/1

INFO=C2=A0 : Map 1: 1/1=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 Map 2: = 1/1

INFO=C2=A0 : Complet= ed executing command(queryId=3Dhive_20160301115630_0a0dbee5-ba4b-45e7-b027-= 085f655640fd); Time taken: 10.225 seconds

INFO=C2=A0 : OK

+--------+----------+----------+-----------+-----------+--+

| a.key=C2=A0 | a.a_one=C2= =A0 | b.b_one=C2=A0 | a.a_zero=C2=A0 | b.b_zero=C2=A0 |

+--------+----------+----------+-----------= +-----------+--+

| 11=C2= =A0=C2=A0=C2=A0=C2=A0 | 1=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 | 0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 | 0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0 | 1=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 |=

+--------+----------+-----= -----+-----------+-----------+--+

=C2=A0

If you change the cons= tant value of subquery-b=E2=80=99s confuse_you column from 0 to 2, the prob= lem disappears. The plan returned from EXPLAIN shows the incorrect one is p= icking _col1 and _col2, while the correct one is picking _col2 and _col3 fo= rm sub query b.

=C2=A0

Seems it cannot distinguish 2 columns with same= constant value?

=C2=A0

=C2=A0

Anyone encountered similar problem?

=C2=A0

=C2=A0

Thanks!=

=C2=A0

C= hi

=C2=A0

<= /span>



--
Best Regards

Jeff Zhang
--089e01161ab2bec723052cf53456--