hive-issues mailing list archives

From "Dean Ye (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-17577) UNION without DISTINCT doesn't work
Date Sun, 24 Sep 2017 02:28:00 GMT

    [ https://issues.apache.org/jira/browse/HIVE-17577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16178057#comment-16178057 ]

Dean Ye commented on HIVE-17577:
--------------------------------

[~janulatha]
Below are all the commands I used to reproduce this issue. I've split them into 3 parts with
comments.

Please let me know if I can provide anything else. Many thanks for looking
into this issue.

------------------1. start the hive shell and create the new table----------------------
$ hive
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/hive/lib/log4j-slf4j-impl-2.4.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]

Logging initialized using configuration in jar:file:/opt/hive/lib/hive-common-2.1.1.jar!/hive-log4j2.properties
Async: true
Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider
using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
hive (default)> use dev;
OK
Time taken: 0.948 seconds
hive (dev)> create table dean_test(id int) stored as orc;
OK
Time taken: 0.652 seconds
hive (dev)> insert into dean_test values(1);
WARNING: Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions.
Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
Query ID = yeruidian_20170924101016_e4f2589c-1e19-4977-8a4f-21cad3b42b8e
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_1496659744394_561227, Tracking URL = http://hadoop-master0.s.qima-inc.com:8088/proxy/application_1496659744394_561227/
Kill Command = /opt/hadoop/bin/hadoop job  -kill job_1496659744394_561227
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 0
2017-09-24 10:10:37,534 Stage-1 map = 0%,  reduce = 0%
2017-09-24 10:10:44,923 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 3.15 sec
MapReduce Total cumulative CPU time: 3 seconds 150 msec
Ended Job = job_1496659744394_561227
Stage-4 is selected by condition resolver.
Stage-3 is filtered out by condition resolver.
Stage-5 is filtered out by condition resolver.
Moving data to directory hdfs://yz-cluster/user/hive/warehouse/dev.db/dean_test/.hive-staging_hive_2017-09-24_10-10-16_618_9043764873474269799-1/-ext-10000
Loading data to table dev.dean_test
MapReduce Jobs Launched:
Stage-Stage-1: Map: 1   Cumulative CPU: 3.15 sec   HDFS Read: 4156 HDFS Write: 251 SUCCESS
Total MapReduce CPU Time Spent: 3 seconds 150 msec
OK
Time taken: 35.136 seconds

------------------2. the error when using only UNION----------------------
hive (dev)> select id from dean_test union select id from dean_test;
FAILED: ParseException line 1:31 missing EOF at 'select' near 'union'

------------------3. works with UNION ALL and UNION DISTINCT----------------------
hive (dev)> select id from dean_test union all select id from dean_test;
WARNING: Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions.
Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
Query ID = yeruidian_20170924101133_b2501c25-d2d5-4e10-8809-091d87b4022a
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_1496659744394_561235, Tracking URL = http://hadoop-master0.s.qima-inc.com:8088/proxy/application_1496659744394_561235/
Kill Command = /opt/hadoop/bin/hadoop job  -kill job_1496659744394_561235
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 0
2017-09-24 10:11:48,729 Stage-1 map = 0%,  reduce = 0%
2017-09-24 10:11:54,910 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 2.08 sec
MapReduce Total cumulative CPU time: 2 seconds 80 msec
Ended Job = job_1496659744394_561235
MapReduce Jobs Launched:
Stage-Stage-1: Map: 1   Cumulative CPU: 2.08 sec   HDFS Read: 4588 HDFS Write: 115 SUCCESS
Total MapReduce CPU Time Spent: 2 seconds 80 msec
OK
1
1
Time taken: 26.951 seconds, Fetched: 2 row(s)
hive (dev)> select id from dean_test UNION DISTINCT select id from dean_test;
WARNING: Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions.
Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
Query ID = yeruidian_20170924102513_91f10773-356c-4295-bd23-664f4825326b
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks not specified. Estimated from input data size: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
Starting Job = job_1496659744394_561287, Tracking URL = http://hadoop-master0.s.qima-inc.com:8088/proxy/application_1496659744394_561287/
Kill Command = /opt/hadoop/bin/hadoop job  -kill job_1496659744394_561287
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 1
2017-09-24 10:25:31,669 Stage-1 map = 0%,  reduce = 0%
2017-09-24 10:25:38,957 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 1.71 sec
2017-09-24 10:25:46,171 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 3.98 sec
MapReduce Total cumulative CPU time: 3 seconds 980 msec
Ended Job = job_1496659744394_561287
MapReduce Jobs Launched:
Stage-Stage-1: Map: 1  Reduce: 1   Cumulative CPU: 3.98 sec   HDFS Read: 8464 HDFS Write: 101 SUCCESS
Total MapReduce CPU Time Spent: 3 seconds 980 msec
OK
1
Time taken: 38.262 seconds, Fetched: 1 row(s)
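
As a stopgap on 2.1.1, the failing statement from part 2 can probably be rewritten in one of the forms below. The first two are the statements that ran successfully in part 3 above. The third (SELECT DISTINCT over a UNION ALL subquery, with the hypothetical alias t) is an assumption and was not run in this session.

```sql
-- Workaround sketch for the ParseException on a bare UNION in Hive 2.1.1.
-- Forms 1 and 2 are the statements that succeeded in part 3 of this comment.
-- Form 3 is an untested assumption; the alias "t" is only illustrative.

-- 1. Spell out DISTINCT explicitly (returned 1 row above).
select id from dean_test union distinct select id from dean_test;

-- 2. Keep duplicates with UNION ALL (returned 2 rows above).
select id from dean_test union all select id from dean_test;

-- 3. Untested alternative: de-duplicate a UNION ALL subquery.
select distinct id
from (
  select id from dean_test
  union all
  select id from dean_test
) t;
```

Forms 1 and 3 should give the result a bare UNION is expected to give, since UNION without a keyword is meant to behave like UNION DISTINCT.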

> UNION without DISTINCT doesn't work
> -----------------------------------
>
>                 Key: HIVE-17577
>                 URL: https://issues.apache.org/jira/browse/HIVE-17577
>             Project: Hive
>          Issue Type: Bug
>          Components: CLI
>    Affects Versions: 2.1.1
>         Environment: Hive 2.1.1
> Compiled by jcamachorodriguez on Tue Nov 29 19:46:12 GMT 2016
> From source with checksum 569ad6d6e5b71df3cb04303183948d90
> Test tables in both TEXTFILE and ORC format.
>            Reporter: Dean Ye
>            Priority: Minor
>
> We just upgraded Hive from 1.2.1 to 2.1.1, but the UNION command that worked before now just returns
> "missing EOF at 'select' near 'union'". It works when I use UNION DISTINCT but doesn't without the
> DISTINCT keyword.
> The lines below are copied from the command line.
> hive (dev)> select id from deantest union select id from dean_test;
> FAILED: ParseException line 1:34 missing EOF at 'select' near 'union'
> hive (dev)> select id from deantest union distinct select id from dean_test;
> WARNING: Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
> Query ID = yeruidian_20170922112411_4a370da1-efb3-4464-a2f1-0e84e06ef5b6



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
