Return-Path: X-Original-To: apmail-hive-user-archive@www.apache.org Delivered-To: apmail-hive-user-archive@www.apache.org Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id DA4E31079B for ; Sun, 22 Feb 2015 02:04:27 +0000 (UTC) Received: (qmail 84154 invoked by uid 500); 22 Feb 2015 02:04:25 -0000 Delivered-To: apmail-hive-user-archive@hive.apache.org Received: (qmail 84094 invoked by uid 500); 22 Feb 2015 02:04:25 -0000 Mailing-List: contact user-help@hive.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: user@hive.apache.org Delivered-To: mailing list user@hive.apache.org Received: (qmail 84083 invoked by uid 99); 22 Feb 2015 02:04:25 -0000 Received: from athena.apache.org (HELO athena.apache.org) (140.211.11.136) by apache.org (qpsmtpd/0.29) with ESMTP; Sun, 22 Feb 2015 02:04:25 +0000 X-ASF-Spam-Status: No, hits=2.5 required=5.0 tests=FREEMAIL_REPLY,HTML_MESSAGE,RCVD_IN_DNSWL_LOW,SPF_PASS X-Spam-Check-By: apache.org Received-SPF: pass (athena.apache.org: domain of kelphet@gmail.com designates 209.85.220.51 as permitted sender) Received: from [209.85.220.51] (HELO mail-pa0-f51.google.com) (209.85.220.51) by apache.org (qpsmtpd/0.29) with ESMTP; Sun, 22 Feb 2015 02:04:20 +0000 Received: by paceu11 with SMTP id eu11so18117882pac.7 for ; Sat, 21 Feb 2015 18:02:30 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=message-id:date:from:user-agent:mime-version:to:subject:references :in-reply-to:content-type; bh=Ylib1iXVQmye0k4YjP38RH9Gt9Mp1OEliEL9cTlTCbM=; b=JbwSmK052mRrTc6TxQdH4T785W+vZ5AK6XZUITG3Qd3wiXMV3dI/R3jhZAr3e838R5 rMeYazkgieeMOsh5RYCCE83bjvl69FnT5Rp0m0q0BGY9pfZZ8YXCYJUJJJ5tcgjAk3gt KJPxx3QXHiEYo9TBtiOBn+e6ZdTHLg7Em3wefyeYMoBj4qUnJ4JhFn0uQP/Kd+G/QQm9 91PxWxmCcXXD0xepSsXymJ9JsKiY1hfHRJ+pRlNrqEutjuRuFwNYtKvuMFYiNlpXr1k/ MFLaUztgXDUyw4of2ixhBRnGTntYDLYbSMfEgkOKm0/UIpcROqtSxEldWT/aLBR3z02B 67NQ== X-Received: by 10.70.88.142 with SMTP id bg14mr8308617pdb.46.1424570550324; Sat, 21 Feb 2015 18:02:30 -0800 (PST) Received: from [192.168.1.125] ([76.126.213.68]) by mx.google.com with ESMTPSA id b5sm15590322pdj.88.2015.02.21.18.02.24 for (version=TLSv1 cipher=ECDHE-RSA-RC4-SHA bits=128/128); Sat, 21 Feb 2015 18:02:29 -0800 (PST) Message-ID: <54E938AE.5070105@gmail.com> Date: Sat, 21 Feb 2015 18:02:22 -0800 From: kelphet User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:24.0) Gecko/20100101 Thunderbird/24.6.0 MIME-Version: 1.0 To: user@hive.apache.org Subject: Re: Union all with a field 'hard coded' References: In-Reply-To: Content-Type: multipart/alternative; boundary="------------090407090403030502010203" X-Virus-Checked: Checked by ClamAV on apache.org This is a multi-part message in MIME format. --------------090407090403030502010203 Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit Hi Lefty and Xuefu, Thanks for the rigorous discussion. I confirm that this will apply to UNION (DISTINCT) too according to HIVE-9039. That is to say, It is safe to remove "ALL". Thanks. Best Pengcheng On 2/21/15, 4:09 PM, Xuefu Zhang wrote: > Tech writing assumes true unless proven false. :) > > On Sat, Feb 21, 2015 at 1:13 PM, Lefty Leverenz > > wrote: > > I haven't tried union distinct, but I assume the same rule > applies. > > > Reasonable assumption, so I'll remove "ALL" and see if anyone > contradicts it. (Tech writing by successive approximation.) > > Thanks again. > > -- Lefty > > On Sat, Feb 21, 2015 at 6:27 AM, Xuefu Zhang > wrote: > > I haven't tried union distinct, but I assume the same rule > applies. > > Thanks for putting it together. It looks good to me. > > --Xuefu > > On Fri, Feb 20, 2015 at 11:44 PM, Lefty Leverenz > > wrote: > > Great, thanks Xuefu. So this only applies to UNION ALL, > not UNION DISTINCT? I had wondered about that. > > I made the changes and added some subheadings: Union > wikidoc > -- > Column Aliases for UNION ALL > . > > > Please review it one more time. > > -- Lefty > > On Fri, Feb 20, 2015 at 7:06 AM, Xuefu Zhang > > wrote: > > Hi Lefty, > > The description seems good to me. I just slightly > modified it so that it sounds more "technical", for > your consideration. > > Thanks, > Xuefu > > UNION ALL expected the same schema on both sides of > the expression list. As a result, the following query > may fail with an error message such as "FAILED: > SemanticException 4:47 Schema of both sides of union > should match." > [query] > In such cases, column aliases can be used to force > equal schema: > [corrected query] > > > > On Thu, Feb 19, 2015 at 1:04 AM, Lefty Leverenz > > wrote: > > Xuefu, I've taken a stab at documenting this in > the Union wikidoc > (near > the end). Would you please review it and make any > necessary corrections or additions? > > Thanks. > > -- Lefty > > On Mon, Feb 2, 2015 at 2:02 PM, DU DU > > wrote: > > This is a part of standard SQL syntax, isn't it? > > On Mon, Feb 2, 2015 at 2:22 PM, Xuefu Zhang > > wrote: > > Yes, I think it would be great if this can > be documented. > > --Xuefu > > On Sun, Feb 1, 2015 at 6:34 PM, Lefty > Leverenz > wrote: > > Xuefu, should this be documented in > the Union wikidoc > ? > > > Is it relevant for other query clauses? > > -- Lefty > > On Sun, Feb 1, 2015 at 11:27 AM, > Philippe Kernévez > wrote: > > Perfect. > > Thank you Xuefu. > > Philippe > > On Fri, Jan 30, 2015 at 11:32 PM, > Xuefu Zhang > wrote: > > Use column alias: > > INSERT OVERWRITE TABLE > all_dictionaries_ext > SELECT name, id, category > FROM dictionary > UNION ALL SELECT NAME, ID, > "CAMPAIGN" as category FROM > md_campaigns > > > On Fri, Jan 30, 2015 at 1:41 > PM, Philippe Kernévez > > > wrote: > > Hi all, > > I would like to do union > all with a field that is > hardcoded in the request. > > INSERT OVERWRITE TABLE > all_dictionaries_ext > SELECT name, id, category > FROM dictionary > UNION ALL SELECT NAME, > ID, "CAMPAIGN" FROM > md_campaigns > > Name type is String > Id type is int > Category type is string > > When I run this command I > had an error : > FAILED: SemanticException > 4:47 Schema of both sides > of union should match. > _u1-subquery2 does not > have the field category. > Error encountered near > token 'md_campaigns' > > I supposed that the error > is cause by the String > "CAMPAIGN" which should > not have a type. > > How can do this kind of > union ? > > The union all with 2 hard > coded fields is ok. > INSERT OVERWRITE TABLE > all_dictionaries_ext > SELECT NAME, ID, > "CAMPAIGN" FROM md_campaigns > UNION ALL SELECT NAME, > ID, "AD_SERVER" FROM > md_ad_servers > UNION ALL SELECT NAME, > ID, "AVERTISER" FROM > md_advertisers > UNION ALL SELECT NAME, > ID, "AGENCIES" FROM > md_agencies > > > More debug info : > > 15/01/30 22:34:23 [main]: > INFO parse.ParseDriver: > Parsing command: > INSERT OVERWRITE TABLE > all_dictionaries_ext > SELECT name, id, category > FROM byoa_dictionary > UNION ALL SELECT NAME, > ID, "CAMPAIGN" FROM > md_campaigns > 15/01/30 22:34:23 [main]: > INFO parse.ParseDriver: > Parse Completed > 15/01/30 22:34:23 [main]: > INFO log.PerfLogger: > start=1422653663887 > end=1422653663900 > duration=13 > from=org.apache.hadoop.hive.ql.Driver> > 15/01/30 22:34:23 [main]: > INFO log.PerfLogger: > method=semanticAnalyze > from=org.apache.hadoop.hive.ql.Driver> > 15/01/30 22:34:23 [main]: > INFO > parse.SemanticAnalyzer: > Starting Semantic Analysis > 15/01/30 22:34:23 [main]: > INFO > parse.SemanticAnalyzer: > Completed phase 1 of > Semantic Analysis > 15/01/30 22:34:23 [main]: > INFO > parse.SemanticAnalyzer: > Get metadata for source tables > 15/01/30 22:34:23 [main]: > INFO > parse.SemanticAnalyzer: > Get metadata for subqueries > 15/01/30 22:34:23 [main]: > INFO > parse.SemanticAnalyzer: > Get metadata for source tables > 15/01/30 22:34:23 [main]: > INFO > parse.SemanticAnalyzer: > Get metadata for subqueries > 15/01/30 22:34:24 [main]: > INFO > parse.SemanticAnalyzer: > Get metadata for > destination tables > 15/01/30 22:34:24 [main]: > INFO > parse.SemanticAnalyzer: > Get metadata for source tables > 15/01/30 22:34:24 [main]: > INFO > parse.SemanticAnalyzer: > Get metadata for subqueries > 15/01/30 22:34:24 [main]: > INFO > parse.SemanticAnalyzer: > Get metadata for > destination tables > 15/01/30 22:34:24 [main]: > INFO > parse.SemanticAnalyzer: > Get metadata for > destination tables > 15/01/30 22:34:24 [main]: > INFO > parse.SemanticAnalyzer: > Completed getting MetaData > in Semantic Analysis > 15/01/30 22:34:24 [main]: > INFO > parse.SemanticAnalyzer: > Not invoking CBO because > the statement has too few > joins > FAILED: SemanticException > 4:47 Schema of both sides > of union should match. > _u1-subquery2 does not > have the field category. > Error encountered near > token 'md_campaigns' > 15/01/30 22:34:24 [main]: > ERROR ql.Driver: FAILED: > SemanticException 4:47 > Schema of both sides of > union should match. > _u1-subquery2 does not > have the field category. > Error encountered near > token 'md_campaigns' > org.apache.hadoop.hive.ql.parse.SemanticException: > 4:47 Schema of both sides > of union should match. > _u1-subquery2 does not > have the field category. > Error encountered near > token 'md_campaigns' > at > org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genUnionPlan(SemanticAnalyzer.java:9007) > at > org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9600) > at > org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9620) > at > org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9607) > at > org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10093) > at > org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:221) > at > org.apache.hadoop.hive.ql.Driver.compile(Driver.java:415) > at > org.apache.hadoop.hive.ql.Driver.compile(Driver.java:303) > at > org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1067) > at > org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1129) > at > org.apache.hadoop.hive.ql.Driver.run(Driver.java:1004) > at > org.apache.hadoop.hive.ql.Driver.run(Driver.java:994) > at > org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:247) > at > org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:199) > at > org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:410) > at > org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:345) > at > org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:733) > at > org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:677) > at > org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:616) > at > sun.reflect.NativeMethodAccessorImpl.invoke0(Native > Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) > at > java.lang.reflect.Method.invoke(Method.java:597) > at > org.apache.hadoop.util.RunJar.run(RunJar.java:221) > at > org.apache.hadoop.util.RunJar.main(RunJar.java:136) > > Regards, > -- > Philippe Kernévez > > > > > > -- > Philippe Kernévez > > > > Directeur technique (Suisse), > pkernevez@octo.com > > +41 79 888 33 32 > > > Retrouvez OCTO sur OCTO Talk : > http://blog.octo.com > OCTO Technology http://www.octo.com > > > > > > > -- > Thanks, > Dayong > > > > > > > --------------090407090403030502010203 Content-Type: text/html; charset=UTF-8 Content-Transfer-Encoding: 8bit Hi Lefty and Xuefu,

    Thanks for the rigorous discussion. I confirm that this will apply to UNION (DISTINCT) too according to HIVE-9039. That is to say, It is safe to remove "ALL". Thanks.

Best
Pengcheng


On 2/21/15, 4:09 PM, Xuefu Zhang wrote:
Tech writing assumes true unless proven false. :)

On Sat, Feb 21, 2015 at 1:13 PM, Lefty Leverenz <leftyleverenz@gmail.com> wrote:
I haven't tried union distinct, but I assume the same rule applies.

Reasonable assumption, so I'll remove "ALL" and see if anyone contradicts it.  (Tech writing by successive approximation.)

Thanks again.

-- Lefty

On Sat, Feb 21, 2015 at 6:27 AM, Xuefu Zhang <xzhang@cloudera.com> wrote:
I haven't tried union distinct, but I assume the same rule applies.

Thanks for putting it together. It looks good to me.

--Xuefu

On Fri, Feb 20, 2015 at 11:44 PM, Lefty Leverenz <leftyleverenz@gmail.com> wrote:
Great, thanks Xuefu.  So this only applies to UNION ALL, not UNION DISTINCT?  I had wondered about that.

I made the changes and added some subheadings:  Union wikidoc -- Column Aliases for UNION ALL.

Please review it one more time.

-- Lefty

On Fri, Feb 20, 2015 at 7:06 AM, Xuefu Zhang <xzhang@cloudera.com> wrote:
Hi Lefty,

The description seems good to me. I just slightly modified it so that it sounds more "technical", for your consideration.

Thanks,
Xuefu

UNION ALL expected the same schema on both sides of the expression list. As a result, the following query may fail with an error message such as "FAILED: SemanticException 4:47 Schema of both sides of union should match."
[query]
In such cases, column aliases can be used to force equal schema:
[corrected query]



On Thu, Feb 19, 2015 at 1:04 AM, Lefty Leverenz <leftyleverenz@gmail.com> wrote:
Xuefu, I've taken a stab at documenting this in the Union wikidoc (near the end).  Would you please review it and make any necessary corrections or additions?

Thanks.

-- Lefty

On Mon, Feb 2, 2015 at 2:02 PM, DU DU <willddy@gmail.com> wrote:
This is a part of standard SQL syntax, isn't it?

On Mon, Feb 2, 2015 at 2:22 PM, Xuefu Zhang <xzhang@cloudera.com> wrote:
Yes, I think it would be great if this can be documented.

--Xuefu

On Sun, Feb 1, 2015 at 6:34 PM, Lefty Leverenz <leftyleverenz@gmail.com> wrote:
Xuefu, should this be documented in the Union wikidoc?

Is it relevant for other query clauses?

-- Lefty

On Sun, Feb 1, 2015 at 11:27 AM, Philippe Kernévez <pkernevez@octo.com> wrote:
Perfect.

Thank you Xuefu.

Philippe

On Fri, Jan 30, 2015 at 11:32 PM, Xuefu Zhang <xzhang@cloudera.com> wrote:
Use column alias:

INSERT OVERWRITE TABLE all_dictionaries_ext
     SELECT name, id, category FROM dictionary
     UNION ALL SELECT NAME, ID, "CAMPAIGN" as category FROM md_campaigns


On Fri, Jan 30, 2015 at 1:41 PM, Philippe Kernévez <pkernevez@octo.com> wrote:
Hi all,

I would like to do union all with a field that is hardcoded in the request.

   INSERT OVERWRITE TABLE all_dictionaries_ext
     SELECT name, id, category FROM dictionary
     UNION ALL SELECT NAME, ID, "CAMPAIGN" FROM md_campaigns

Name type is String
Id type is int
Category type is string

When I run this command I had an error :
FAILED: SemanticException 4:47 Schema of both sides of union should match. _u1-subquery2 does not have the field category. Error encountered near token 'md_campaigns'

I supposed that the error is cause by the String "CAMPAIGN" which should not have a type.

How can do this kind of union ?

The union all with 2 hard coded fields is ok.
  INSERT OVERWRITE TABLE all_dictionaries_ext
    SELECT NAME, ID, "CAMPAIGN" FROM md_campaigns
     UNION ALL SELECT NAME, ID, "AD_SERVER" FROM md_ad_servers 
     UNION ALL SELECT NAME, ID, "AVERTISER" FROM md_advertisers 
     UNION ALL SELECT NAME, ID, "AGENCIES" FROM md_agencies 


More debug info :

15/01/30 22:34:23 [main]: INFO parse.ParseDriver: Parsing command: 
  INSERT OVERWRITE TABLE all_dictionaries_ext
    SELECT name, id, category FROM byoa_dictionary
    UNION ALL SELECT NAME, ID, "CAMPAIGN" FROM md_campaigns
15/01/30 22:34:23 [main]: INFO parse.ParseDriver: Parse Completed
15/01/30 22:34:23 [main]: INFO log.PerfLogger: </PERFLOG method=parse start=1422653663887 end=1422653663900 duration=13 from=org.apache.hadoop.hive.ql.Driver>
15/01/30 22:34:23 [main]: INFO log.PerfLogger: <PERFLOG method=semanticAnalyze from=org.apache.hadoop.hive.ql.Driver>
15/01/30 22:34:23 [main]: INFO parse.SemanticAnalyzer: Starting Semantic Analysis
15/01/30 22:34:23 [main]: INFO parse.SemanticAnalyzer: Completed phase 1 of Semantic Analysis
15/01/30 22:34:23 [main]: INFO parse.SemanticAnalyzer: Get metadata for source tables
15/01/30 22:34:23 [main]: INFO parse.SemanticAnalyzer: Get metadata for subqueries
15/01/30 22:34:23 [main]: INFO parse.SemanticAnalyzer: Get metadata for source tables
15/01/30 22:34:23 [main]: INFO parse.SemanticAnalyzer: Get metadata for subqueries
15/01/30 22:34:24 [main]: INFO parse.SemanticAnalyzer: Get metadata for destination tables
15/01/30 22:34:24 [main]: INFO parse.SemanticAnalyzer: Get metadata for source tables
15/01/30 22:34:24 [main]: INFO parse.SemanticAnalyzer: Get metadata for subqueries
15/01/30 22:34:24 [main]: INFO parse.SemanticAnalyzer: Get metadata for destination tables
15/01/30 22:34:24 [main]: INFO parse.SemanticAnalyzer: Get metadata for destination tables
15/01/30 22:34:24 [main]: INFO parse.SemanticAnalyzer: Completed getting MetaData in Semantic Analysis
15/01/30 22:34:24 [main]: INFO parse.SemanticAnalyzer: Not invoking CBO because the statement has too few joins
FAILED: SemanticException 4:47 Schema of both sides of union should match. _u1-subquery2 does not have the field category. Error encountered near token 'md_campaigns'
15/01/30 22:34:24 [main]: ERROR ql.Driver: FAILED: SemanticException 4:47 Schema of both sides of union should match. _u1-subquery2 does not have the field category. Error encountered near token 'md_campaigns'
org.apache.hadoop.hive.ql.parse.SemanticException: 4:47 Schema of both sides of union should match. _u1-subquery2 does not have the field category. Error encountered near token 'md_campaigns'
at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genUnionPlan(SemanticAnalyzer.java:9007)
at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9600)
at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9620)
at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9607)
at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10093)
at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:221)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:415)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:303)
at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1067)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1129)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1004)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:994)
at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:247)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:199)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:410)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:345)
at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:733)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:677)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:616)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)

Regards,
--
Philippe Kernévez





--
Philippe Kernévez



Directeur technique (Suisse),
pkernevez@octo.com
+41 79 888 33 32

Retrouvez OCTO sur OCTO Talk : http://blog.octo.com
OCTO Technology http://www.octo.com





--
Thanks,
Dayong







--------------090407090403030502010203--