hive-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Ashutosh Chauhan (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-4392) Illogical InvalidObjectException throwed when use mulit aggregate functions with star columns
Date Tue, 30 Apr 2013 20:42:16 GMT

    [ https://issues.apache.org/jira/browse/HIVE-4392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13645952#comment-13645952
] 

Ashutosh Chauhan commented on HIVE-4392:
----------------------------------------

[~navis] You opted to make aggregate column virtual hidden. But as [~rhbutani] this may still
be problematic when these aggregate columns are referenced in having clause. e.g, consider
query 2 from Harish's example with groupby and having clause in subquery. Did you test that.

Also as Harish suggested root problem here is
bq. And the issue is in how the RowResolver is constructed in SelectOp when a '*' is present
in the Select List.

So, proper fix is to correctly handle * and only add requisite columns in RowResolver of select
Op and thereby avoiding duplication of columns in first place. Wondering if you explored that
route ?
                
> Illogical InvalidObjectException throwed when use mulit aggregate functions with star
columns 
> ----------------------------------------------------------------------------------------------
>
>                 Key: HIVE-4392
>                 URL: https://issues.apache.org/jira/browse/HIVE-4392
>             Project: Hive
>          Issue Type: Bug
>          Components: Query Processor
>         Environment: Apache Hadoop 0.20.1
> Apache Hive Trunk
>            Reporter: caofangkun
>            Assignee: Navis
>            Priority: Minor
>         Attachments: HIVE-4392.D10431.1.patch, HIVE-4392.D10431.2.patch, HIVE-4392.D10431.3.patch,
HIVE-4392.D10431.4.patch
>
>
> For Example:
> hive (default)> create table liza_1 as 
>               > select *, sum(key), sum(value) 
>               > from new_src;
> Total MapReduce jobs = 1
> Launching Job 1 out of 1
> Number of reduce tasks determined at compile time: 1
> In order to change the average load for a reducer (in bytes):
>   set hive.exec.reducers.bytes.per.reducer=<number>
> In order to limit the maximum number of reducers:
>   set hive.exec.reducers.max=<number>
> In order to set a constant number of reducers:
>   set mapred.reduce.tasks=<number>
> Starting Job = job_201304191025_0003, Tracking URL = http://hd17-vm5:51030/jobdetails.jsp?jobid=job_201304191025_0003
> Kill Command = /home/zongren/hadoop-current/bin/../bin/hadoop job  -kill job_201304191025_0003
> Hadoop job information for Stage-1: number of mappers: 0; number of reducers: 1
> 2013-04-22 11:09:28,017 Stage-1 map = 0%,  reduce = 0%
> 2013-04-22 11:09:34,054 Stage-1 map = 0%,  reduce = 100%
> 2013-04-22 11:09:37,074 Stage-1 map = 100%,  reduce = 100%
> Ended Job = job_201304191025_0003
> Moving data to: hdfs://hd17-vm5:9101/user/zongren/hive/liza_1
> FAILED: Error in metadata: InvalidObjectException(message:liza_1 is not a valid object
name)
> FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask
> MapReduce Jobs Launched: 
> Job 0: Reduce: 1   HDFS Read: 0 HDFS Write: 12 SUCCESS
> Total MapReduce CPU Time Spent: 0 msec
> hive (default)> create table liza_1 as         
>               > select *, sum(key), sum(value) 
>               > from new_src                   
>               > group by key, value;
> Total MapReduce jobs = 1
> Launching Job 1 out of 1
> Number of reduce tasks not specified. Estimated from input data size: 1
> In order to change the average load for a reducer (in bytes):
>   set hive.exec.reducers.bytes.per.reducer=<number>
> In order to limit the maximum number of reducers:
>   set hive.exec.reducers.max=<number>
> In order to set a constant number of reducers:
>   set mapred.reduce.tasks=<number>
> Starting Job = job_201304191025_0004, Tracking URL = http://hd17-vm5:51030/jobdetails.jsp?jobid=job_201304191025_0004
> Kill Command = /home/zongren/hadoop-current/bin/../bin/hadoop job  -kill job_201304191025_0004
> Hadoop job information for Stage-1: number of mappers: 0; number of reducers: 1
> 2013-04-22 11:11:58,945 Stage-1 map = 0%,  reduce = 0%
> 2013-04-22 11:12:01,964 Stage-1 map = 0%,  reduce = 100%
> 2013-04-22 11:12:04,982 Stage-1 map = 100%,  reduce = 100%
> Ended Job = job_201304191025_0004
> Moving data to: hdfs://hd17-vm5:9101/user/zongren/hive/liza_1
> FAILED: Error in metadata: InvalidObjectException(message:liza_1 is not a valid object
name)
> FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask
> MapReduce Jobs Launched: 
> Job 0: Reduce: 1   HDFS Read: 0 HDFS Write: 0 SUCCESS
> Total MapReduce CPU Time Spent: 0 msec
> But the following tow Queries  work:
> hive (default)> create table liza_1 as select * from new_src;
> Total MapReduce jobs = 3
> Launching Job 1 out of 3
> Number of reduce tasks is set to 0 since there's no reduce operator
> Starting Job = job_201304191025_0006, Tracking URL = http://hd17-vm5:51030/jobdetails.jsp?jobid=job_201304191025_0006
> Kill Command = /home/zongren/hadoop-current/bin/../bin/hadoop job  -kill job_201304191025_0006
> Hadoop job information for Stage-1: number of mappers: 0; number of reducers: 0
> 2013-04-22 11:15:00,681 Stage-1 map = 0%,  reduce = 0%
> 2013-04-22 11:15:03,697 Stage-1 map = 100%,  reduce = 100%
> Ended Job = job_201304191025_0006
> Stage-4 is selected by condition resolver.
> Stage-3 is filtered out by condition resolver.
> Stage-5 is filtered out by condition resolver.
> Moving data to: hdfs://hd17-vm5:9101/user/zongren/hive-scratchdir/hive_2013-04-22_11-14-54_632_6709035018023861094/-ext-10001
> Moving data to: hdfs://hd17-vm5:9101/user/zongren/hive/liza_1
> Table default.liza_1 stats: [num_partitions: 0, num_files: 0, num_rows: 0, total_size:
0, raw_data_size: 0]
> MapReduce Jobs Launched: 
> Job 0:  HDFS Read: 0 HDFS Write: 0 SUCCESS
> Total MapReduce CPU Time Spent: 0 msec
> OK
> Time taken: 9.576 seconds
> hive (default)> create table liza_1 as
>               > select sum (key), sum(value) 
>               > from new_test;
> Total MapReduce jobs = 1
> Launching Job 1 out of 1
> Number of reduce tasks determined at compile time: 1
> In order to change the average load for a reducer (in bytes):
>   set hive.exec.reducers.bytes.per.reducer=<number>
> In order to limit the maximum number of reducers:
>   set hive.exec.reducers.max=<number>
> In order to set a constant number of reducers:
>   set mapred.reduce.tasks=<number>
> Starting Job = job_201304191025_0008, Tracking URL = http://hd17-vm5:51030/jobdetails.jsp?jobid=job_201304191025_0008
> Kill Command = /home/zongren/hadoop-current/bin/../bin/hadoop job  -kill job_201304191025_0008
> Hadoop job information for Stage-1: number of mappers: 0; number of reducers: 1
> 2013-04-22 11:22:52,200 Stage-1 map = 0%,  reduce = 0%
> 2013-04-22 11:22:55,216 Stage-1 map = 0%,  reduce = 100%
> 2013-04-22 11:22:58,234 Stage-1 map = 100%,  reduce = 100%
> Ended Job = job_201304191025_0008
> Moving data to: hdfs://hd17-vm5:9101/user/zongren/hive/liza_1
> Table default.liza_1 stats: [num_partitions: 0, num_files: 1, num_rows: 0, total_size:
6, raw_data_size: 0]
> MapReduce Jobs Launched: 
> Job 0: Reduce: 1   HDFS Read: 0 HDFS Write: 6 SUCCESS
> Total MapReduce CPU Time Spent: 0 msec
> OK
> Time taken: 11.115 seconds

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

Mime
View raw message