hive-dev mailing list archives

From "Kevin Wilfong (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HIVE-3915) Union with map-only query on one side and two MR job query on the other produces wrong results
Date Fri, 18 Jan 2013 06:18:13 GMT
Kevin Wilfong created HIVE-3915:
-----------------------------------

             Summary: Union with map-only query on one side and two MR job query on the other produces wrong results
                 Key: HIVE-3915
                 URL: https://issues.apache.org/jira/browse/HIVE-3915
             Project: Hive
          Issue Type: Bug
          Components: Query Processor
    Affects Versions: 0.11.0
            Reporter: Kevin Wilfong
            Assignee: Kevin Wilfong


When a query contains a union with a map-only subquery on one side and a subquery involving
two sequential map-reduce jobs on the other, it can produce wrong results.  It appears that
if the map-only query's TableScan operator is processed first, the task containing the union
is made a root task.  Then, when the other subquery is processed, the second map-reduce job
gains the task containing the union as a child and is itself made a root task.  This means that
both the first and second map-reduce jobs are root tasks, so the dependency between the two
is ignored.  If they are run in parallel (i.e. the cluster has more than one node), no results
are produced for the side of the union with the two map-reduce jobs, and only the results
of the other side of the union are returned.
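
To make the effect concrete, here is a toy simulation of the execution side only. It is
not Hive code: the task names (MR1, MR2, UNION), the in-memory "store", and the wave-based
runner are all assumptions made for illustration. Tasks launched in the same wave see the
same snapshot of the store, which stands in for jobs running in parallel.

```python
# Toy model, not Hive's scheduler. All names here are hypothetical.
CHILDREN = {"MR1": ["MR2"], "MR2": ["UNION"]}

# Each action reads a snapshot of the store and returns (output_key, rows).
ACTIONS = {
    "MR1": lambda s: ("tmp1", [r.upper() for r in s["src"]]),
    "MR2": lambda s: ("tmp2", [r + "!" for r in s.get("tmp1", [])]),
    # The union task also carries the map-only side, which reads "other" directly.
    "UNION": lambda s: ("result", s.get("tmp2", []) + s["other"]),
}


def run_in_waves(roots, children, actions):
    """Run tasks wave by wave; every task in a wave sees the same snapshot."""
    store = {"src": ["a", "b"], "other": ["x"]}
    done, wave = set(), list(roots)
    while wave:
        snapshot = dict(store)  # what "parallel" tasks in this wave can read
        for name in wave:
            out_key, rows = actions[name](snapshot)
            store[out_key] = rows
            done.add(name)
        next_wave = []
        for name in wave:
            for child in children.get(name, []):
                if child not in done and child not in next_wave:
                    next_wave.append(child)
        wave = next_wave
    return store["result"]


# Only MR1 is a root: MR1, then MR2, then UNION. Both sides appear.
good = run_in_waves(["MR1"], CHILDREN, ACTIONS)

# MR1 and MR2 are both roots: MR2 starts before MR1's output exists.
bad = run_in_waves(["MR1", "MR2"], CHILDREN, ACTIONS)
```

With a single root the result holds rows from both sides of the union. With both jobs as
roots, MR2 reads an empty input, so only the map-only side's rows survive.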

The order in which TableScan operators are processed is crucial to reproducing this bug.
That order is determined by the order in which values are retrieved from a map and is
therefore hard to predict, so the bug does not always reproduce.
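
The order dependence can be sketched with a toy plan builder. This is not Hive's task
compiler: the scan names, the task names, and in particular the promotion rule inside
link() are hypothetical stand-ins, chosen only so that the outcome matches the behaviour
described above. The scan order is passed in explicitly here, whereas in the report it
comes from iterating over a map.

```python
# Toy model, not Hive's plan generation. All names and rules are hypothetical.
def build_roots(scan_order):
    """Return the list of root tasks produced for a given TableScan order."""
    roots = []
    children = {}

    def link(parent, child):
        children.setdefault(parent, []).append(child)
        # Assumed flawed rule: if the child is currently a root, the parent
        # takes its place as a root, even if the parent depends on another task.
        if child in roots:
            roots.remove(child)
            roots.append(parent)

    for scan in scan_order:
        if scan == "map_only_scan":
            # The map-only side has no job of its own; it lives in the union task.
            if "UNION" not in children.get("MR2", []):
                roots.append("UNION")
        else:  # "two_job_scan": MR1 feeds MR2, which feeds the union task
            roots.append("MR1")
            link("MR1", "MR2")
            link("MR2", "UNION")
    return roots


# Map-only scan first: MR2 ends up as a root alongside MR1.
buggy_roots = build_roots(["map_only_scan", "two_job_scan"])

# Two-job scan first: only MR1 is a root, so MR2 waits for it.
correct_roots = build_roots(["two_job_scan", "map_only_scan"])
```

In this sketch only the first ordering yields two root jobs, which is why the outcome
hinges on which TableScan operator happens to be visited first.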

