hive-dev mailing list archives

From "Hudson (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-3496) Query plan for multi-join where the third table joined is a subquery containing a map-only union with hive.auto.convert.join=true is wrong
Date Wed, 09 Jan 2013 10:30:39 GMT

    [ https://issues.apache.org/jira/browse/HIVE-3496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13548219#comment-13548219 ]

Hudson commented on HIVE-3496:
------------------------------

Integrated in Hive-trunk-hadoop2 #54 (See [https://builds.apache.org/job/Hive-trunk-hadoop2/54/])
    HIVE-3496 Query plan for multi-join where the third table joined is a subquery containing a map-only union with hive.auto.convert.join=true is wrong
(Kevin Wilfong via namit) (Revision 1388412)

     Result = ABORTED
namit : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1388412
Files : 
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMapRedUtils.java
* /hive/trunk/ql/src/test/queries/clientpositive/multi_join_union.q
* /hive/trunk/ql/src/test/results/clientpositive/multi_join_union.q.out

                
> Query plan for multi-join where the third table joined is a subquery containing a map-only union with hive.auto.convert.join=true is wrong
> ------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-3496
>                 URL: https://issues.apache.org/jira/browse/HIVE-3496
>             Project: Hive
>          Issue Type: Bug
>          Components: Query Processor
>    Affects Versions: 0.10.0
>            Reporter: Kevin Wilfong
>            Assignee: Kevin Wilfong
>             Fix For: 0.10.0
>
>         Attachments: HIVE-3496.1.patch.txt
>
>
> Take the following query as an example:
> EXPLAIN SELECT * FROM 
> src11 a JOIN
> src12 b ON (a.key = b.key) JOIN
> (SELECT * FROM (SELECT * FROM src13 UNION ALL SELECT * FROM src14)a )c ON c.value = b.value;
> When hive.auto.convert.join=true, the two joins are implemented separately as conditional
> tasks with two mapjoins and a backup common join.  In the second join, the conditional task
> will be a backup task, contained in the ConditionalTask, and a root task.  This is clearly
> wrong, and leads to query failures.
> I've traced this to the joinUnionPlan method of GenMapRedUtils.  If the union operator
> was performed in its own map reduce task and it could be a root task, when it is added to
> the mapper of the existing task which performs the join in the reducer, this task will get
> made a root task without first checking if the existing (non-union) task has any dependencies.
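
The missing check described above can be sketched roughly as follows. This is an illustration only, not the committed patch to GenMapRedUtils: PlanTask, addDependency and promoteToRootIfIndependent are hypothetical names made up for this sketch, and none of Hive's real task classes are used. The idea is simply that a task is added to the list of root tasks only when it has no parent tasks.

```java
import java.util.ArrayList;
import java.util.List;

public class RootTaskSketch {

    /** Hypothetical stand-in for a task in a query plan; not Hive's Task class. */
    static final class PlanTask {
        final String name;
        final List<PlanTask> parents = new ArrayList<>();

        PlanTask(String name) {
            this.name = name;
        }

        /** Records that this task must run after the given parent task. */
        void addDependency(PlanTask parent) {
            parents.add(parent);
        }
    }

    /**
     * Adds the task to rootTasks only if it has no parents and is not
     * already listed. Returns true if the task was added.
     */
    static boolean promoteToRootIfIndependent(List<PlanTask> rootTasks, PlanTask task) {
        if (!task.parents.isEmpty() || rootTasks.contains(task)) {
            return false;
        }
        rootTasks.add(task);
        return true;
    }

    public static void main(String[] args) {
        List<PlanTask> rootTasks = new ArrayList<>();

        PlanTask firstJoin = new PlanTask("first-join");
        PlanTask secondJoin = new PlanTask("second-join");
        // The second join consumes the output of the first one.
        secondJoin.addDependency(firstJoin);

        promoteToRootIfIndependent(rootTasks, firstJoin);
        // Merging a root-capable union into secondJoin must not make it a root.
        promoteToRootIfIndependent(rootTasks, secondJoin);

        for (PlanTask t : rootTasks) {
            System.out.println("root task: " + t.name);
        }
    }
}
```

In this sketch the second join is never listed as a root task because it still depends on the first join, which is the situation the report says joinUnionPlan failed to check for.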

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
