hive-issues mailing list archives

From "Premal Shah (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (HIVE-11693) CommonMergeJoinOperator throws exception with tez
Date Tue, 15 Nov 2016 16:46:58 GMT

    [ https://issues.apache.org/jira/browse/HIVE-11693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15662346#comment-15662346 ]

Premal Shah edited comment on HIVE-11693 at 11/15/16 4:46 PM:
--------------------------------------------------------------

[~selinazh] Is there a patch available? 

After applying the patch (https://issues.apache.org/jira/secure/attachment/12766760/HIVE-11693.1.patch),
I noticed incorrect data after a left outer join.

This query returns the right number of rows:
{noformat}
SELECT COUNT(*)
FROM
    account
    LEFT JOIN contact ON contact.account_id = account.id
;
{noformat}

But if the right side of the join is a complex sub-query, I get 0 rows back:
{noformat}
SELECT COUNT(*)
FROM
    account
    LEFT JOIN (
        SELECT DISTINCT
            contact.email_domain AS email_domain,
            contact.account_id
        FROM
            contact
            JOIN (
                SELECT
                    account_id,
                    COUNT(DISTINCT email_domain) AS email_domain_count
                FROM
                    contact
                GROUP BY
                    account_id
                ) email_domain_counts
                ON contact.account_id = email_domain_counts.account_id
                    AND email_domain_counts.email_domain_count = 1
        WHERE
            contact.account_id != ''
    ) contact ON contact.account_id = account.id
;
{noformat}

If I change hive.execution.engine to mr, the query returns the number of rows in account.
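
A minimal sketch of that engine comparison, assuming the setting is changed per session with SET and the second query above is rerun under each engine:

{noformat}
-- Run on MapReduce; this is the run that returned the number of rows in account.
SET hive.execution.engine=mr;
-- (rerun the second query here)

-- Switch back to Tez; this is the run that gave the incorrect result.
SET hive.execution.engine=tez;
-- (rerun the second query here)
{noformat}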




was (Author: premal):
[~selinazh] Is there a patch available? 

After applying the patch (https://issues.apache.org/jira/secure/attachment/12766760/HIVE-11693.1.patch),
I noticed incorrect data after a left outer join.

This query returns the right number of rows
{noformat}
SELECT COUNT(*)
FROM
    account
    LEFT JOIN contact ON contact.account_id = account.id
;
{noformat}

But, if I have a complex sub-query on the right, then I get 0 rows back
{noformat}
SELECT COUNT(*)
FROM
    account
    LEFT JOIN (
        SELECT DISTINCT
            contact.email_domain AS email_domain,
            contact.account_id
        FROM
            contact
            JOIN (
                SELECT
                    account_id,
                    COUNT(DISTINCT email_domain) AS email_domain_count
                FROM
                    contact
                GROUP BY
                    account_id
                ) email_domain_counts
                ON contact.account_id = email_domain_counts.account_id
                    AND email_domain_counts.email_domain_count = 1
        WHERE
            contact.account_id != ''
    ) contact ON contact.account_id = account.id
;
{noformat}

If, I change hive.execution.engine to mr, then it returns the number of rows in map_account



> CommonMergeJoinOperator throws exception with tez
> -------------------------------------------------
>
>                 Key: HIVE-11693
>                 URL: https://issues.apache.org/jira/browse/HIVE-11693
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Rajesh Balamohan
>            Assignee: Selina Zhang
>         Attachments: HIVE-11693.1.patch
>
>
> Got this when executing a simple query with the latest Hive build and the latest Tez version.
> {noformat}
> Error: Failure while running task: attempt_1439860407967_0291_2_03_000045_0:java.lang.RuntimeException: java.lang.RuntimeException: Hive Runtime Error while closing operators: java.lang.RuntimeException: java.io.IOException: Please check if you are invoking moveToNext() even after it returned false.
> at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:171)
> at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:137)
> at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:349)
> at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:71)
> at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:60)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
> at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:60)
> at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:35)
> at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.RuntimeException: Hive Runtime Error while closing operators: java.lang.RuntimeException: java.io.IOException: Please check if you are invoking moveToNext() even after it returned false.
> at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.close(ReduceRecordProcessor.java:316)
> at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:162)
> ... 14 more
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.RuntimeException: java.io.IOException: Please check if you are invoking moveToNext() even after it returned false.
> at org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.fetchOneRow(CommonMergeJoinOperator.java:412)
> at org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.fetchNextGroup(CommonMergeJoinOperator.java:375)
> at org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.doFirstFetchIfNeeded(CommonMergeJoinOperator.java:482)
> at org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.joinFinalLeftData(CommonMergeJoinOperator.java:434)
> at org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.closeOp(CommonMergeJoinOperator.java:384)
> at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:616)
> at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.close(ReduceRecordProcessor.java:292)
> ... 15 more
> Caused by: java.lang.RuntimeException: java.io.IOException: Please check if you are invoking moveToNext() even after it returned false.
> at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:291)
> at org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.fetchOneRow(CommonMergeJoinOperator.java:400)
> ... 21 more
> Caused by: java.io.IOException: Please check if you are invoking moveToNext() even after it returned false.
> at org.apache.tez.runtime.library.common.ValuesIterator.hasCompletedProcessing(ValuesIterator.java:223)
> at org.apache.tez.runtime.library.common.ValuesIterator.moveToNext(ValuesIterator.java:105)
> at org.apache.tez.runtime.library.input.OrderedGroupedKVInput$OrderedGroupedKeyValuesReader.next(OrderedGroupedKVInput.java:308)
> at org.apache.hadoop.hive.ql.exec.tez.KeyValuesFromKeyValues.next(KeyValuesFromKeyValues.java:46)
> at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:249)
> ... 22 more
> {noformat}
> Not sure if this is related to HIVE-11016. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
