hive-dev mailing list archives

From "Edward Capriolo (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-4002) Fetch task aggregation for simple group by query
Date Sun, 25 Aug 2013 18:06:51 GMT

    [ https://issues.apache.org/jira/browse/HIVE-4002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13749714#comment-13749714 ]

Edward Capriolo commented on HIVE-4002:
---------------------------------------

{quote}
[edward@jackintosh hive-trunk]$ patch -p0 < D8739\?download\=true 
patching file common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
patching file ql/src/java/org/apache/hadoop/hive/ql/exec/DemuxOperator.java
patching file ql/src/java/org/apache/hadoop/hive/ql/exec/FetchOperator.java
patching file ql/src/java/org/apache/hadoop/hive/ql/exec/GroupByOperator.java
patching file ql/src/java/org/apache/hadoop/hive/ql/exec/JoinOperator.java
patching file ql/src/java/org/apache/hadoop/hive/ql/exec/MuxOperator.java
patching file ql/src/java/org/apache/hadoop/hive/ql/exec/Operator.java
patching file ql/src/java/org/apache/hadoop/hive/ql/exec/PartitionKeySampler.java
patching file ql/src/java/org/apache/hadoop/hive/ql/exec/UDTFOperator.java
patching file ql/src/java/org/apache/hadoop/hive/ql/optimizer/Optimizer.java
patching file ql/src/java/org/apache/hadoop/hive/ql/optimizer/SimpleFetchAggregation.java
patching file ql/src/java/org/apache/hadoop/hive/ql/optimizer/SimpleFetchOptimizer.java
patching file ql/src/java/org/apache/hadoop/hive/ql/parse/MapReduceCompiler.java
patching file ql/src/java/org/apache/hadoop/hive/ql/parse/ParseContext.java
Hunk #3 succeeded at 119 (offset 9 lines).
Hunk #4 succeeded at 679 (offset 26 lines).
patching file ql/src/java/org/apache/hadoop/hive/ql/parse/RowResolver.java
patching file ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
Hunk #1 succeeded at 3503 (offset -19 lines).
Hunk #2 succeeded at 3609 (offset -19 lines).
Hunk #3 succeeded at 3622 (offset -19 lines).
Hunk #4 succeeded at 3634 (offset -19 lines).
Hunk #5 succeeded at 3684 (offset -19 lines).
Hunk #6 succeeded at 3713 (offset -19 lines).
Hunk #7 succeeded at 3820 (offset -19 lines).
Hunk #8 succeeded at 6964 (offset -18 lines).
Hunk #9 succeeded at 6990 (offset -18 lines).
patching file ql/src/test/queries/clientpositive/fetch_aggregation.q
patching file ql/src/test/results/clientpositive/fetch_aggregation.q.out
patching file ql/src/test/results/compiler/plan/groupby1.q.xml
Hunk #5 succeeded at 1312 (offset -10 lines).
Hunk #6 succeeded at 1326 (offset -10 lines).
Hunk #7 succeeded at 1345 (offset -10 lines).
Hunk #8 succeeded at 1426 (offset -10 lines).
Hunk #9 succeeded at 1478 (offset -10 lines).
patching file ql/src/test/results/compiler/plan/groupby2.q.xml
Hunk #10 succeeded at 1087 (offset -10 lines).
Hunk #11 succeeded at 1428 (offset -10 lines).
Hunk #12 succeeded at 1482 (offset -10 lines).
Hunk #13 succeeded at 1508 (offset -10 lines).
Hunk #14 succeeded at 1541 (offset -10 lines).
Hunk #15 succeeded at 1618 (offset -10 lines).
Hunk #16 succeeded at 1647 (offset -10 lines).
Hunk #17 succeeded at 1715 (offset -10 lines).
Hunk #18 succeeded at 1734 (offset -10 lines).
Hunk #19 succeeded at 1819 (offset -10 lines).
Hunk #20 succeeded at 1832 (offset -10 lines).
patching file ql/src/test/results/compiler/plan/groupby3.q.xml
Hunk #8 succeeded at 1299 (offset -7 lines).
Hunk #9 succeeded at 1627 (offset -7 lines).
Hunk #10 succeeded at 1640 (offset -7 lines).
Hunk #11 succeeded at 1653 (offset -7 lines).
Hunk #12 succeeded at 1695 (offset -7 lines).
Hunk #13 succeeded at 1709 (offset -7 lines).
Hunk #14 succeeded at 1723 (offset -7 lines).
Hunk #15 succeeded at 1770 (offset -7 lines).
Hunk #16 succeeded at 1846 (offset -7 lines).
Hunk #17 succeeded at 1859 (offset -7 lines).
Hunk #18 succeeded at 1872 (offset -7 lines).
Hunk #19 succeeded at 1938 (offset -7 lines).
Hunk #20 succeeded at 2144 (offset -7 lines).
Hunk #21 succeeded at 2157 (offset -7 lines).
Hunk #22 succeeded at 2170 (offset -7 lines).
patching file ql/src/test/results/compiler/plan/groupby5.q.xml
Hunk #5 succeeded at 1175 (offset -10 lines).
Hunk #6 succeeded at 1189 (offset -10 lines).
Hunk #7 succeeded at 1208 (offset -10 lines).
Hunk #8 succeeded at 1295 (offset -10 lines).
Hunk #9 succeeded at 1347 (offset -10 lines).
patching file serde/src/java/org/apache/hadoop/hive/serde2/SerDeUtils.java

{quote}

This did not patch perfectly cleanly. Running the test manually now.
                
> Fetch task aggregation for simple group by query
> ------------------------------------------------
>
>                 Key: HIVE-4002
>                 URL: https://issues.apache.org/jira/browse/HIVE-4002
>             Project: Hive
>          Issue Type: Improvement
>          Components: Query Processor
>            Reporter: Navis
>            Assignee: Navis
>            Priority: Minor
>         Attachments: HIVE-4002.D8739.1.patch, HIVE-4002.D8739.2.patch, HIVE-4002.D8739.3.patch
>
>
> Aggregation queries with no group-by clause (for example, select count(*) from src) execute
> the final aggregation in a single reduce task. But that work is too small even for a single
> reducer, because most UDAFs generate just a single row for map aggregation. If the final fetch
> task can aggregate the outputs from the map tasks, the shuffling time can be removed.
> This optimization transforms an operator tree such as
> TS-FIL-SEL-GBY1-RS-GBY2-SEL-FS + FETCH-TASK
> into 
> TS-FIL-SEL-GBY1-FS + FETCH-TASK(GBY2-SEL-LS)
> With the patch, the time taken for the auto_join_filters.q test dropped to 6 min (from 10 min before).
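
The idea in the description above can be sketched as a toy model. This is not Hive code and not the patch's implementation: the function names are hypothetical, and it only covers count(*). It assumes each map task emits a single partial count (standing in for GBY1) and that the fetch task merges those partials itself (standing in for GBY2), so no reduce stage or shuffle is needed.

```python
from typing import Iterable, List


def map_side_partial_count(split: Iterable[object]) -> int:
    """Hypothetical stand-in for GBY1: one partial count per map task."""
    return sum(1 for _ in split)


def fetch_side_final_count(partials: Iterable[int]) -> int:
    """Hypothetical stand-in for GBY2 inside the fetch task: merge partials."""
    return sum(partials)


def count_star(splits: List[List[object]]) -> int:
    # Each map task produces a single partial row, written out directly
    # (the FS step), with no reduce stage in between.
    partials = [map_side_partial_count(split) for split in splits]
    # The fetch task reads those few rows and finishes the aggregation.
    return fetch_side_final_count(partials)


if __name__ == "__main__":
    # Three input splits, as if three map tasks had run.
    print(count_star([["a", "b", "c"], ["d"], []]))
```

Because every map task contributes only one row, the final merge is cheap enough to run in the fetch task, which is the motivation the description gives for skipping the reducer.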

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
