hive-issues mailing list archives

From "Matt McCline (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (HIVE-17073) Incorrect result with vectorization and SharedWorkOptimizer
Date Tue, 11 Jul 2017 19:50:00 GMT

    [ https://issues.apache.org/jira/browse/HIVE-17073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16082852#comment-16082852 ]

Matt McCline edited comment on HIVE-17073 at 7/11/17 7:49 PM:
--------------------------------------------------------------

Great work!

I think you need to add some code to TableScanOperator, too -- it also handles VectorizedRowBatch
as pass-through, and it has a forward call in it.  Probably add an instanceof check at the
beginning of the method and use it.

Also, LLAP drives in VRBs, too.  I'm not sure where at the moment; it might just be via InputFileFormat.
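
To make the instanceof suggestion concrete, here is a minimal, self-contained sketch. It is not the actual TableScanOperator code: Batch, PassThroughOperator, forwardBatch and forwardRow are hypothetical stand-ins invented for illustration. It only shows the shape of the check, i.e. test the input type at the top of the method and route batches to a batch-aware forward path.

```java
import java.util.ArrayList;
import java.util.List;

public class InstanceofRoutingSketch {
    /** Hypothetical stand-in for a vectorized row batch; not Hive's class. */
    static final class Batch {
        final int size;
        Batch(int size) { this.size = size; }
    }

    /** Hypothetical pass-through operator that records which forwarding path ran. */
    static final class PassThroughOperator {
        final List<String> trace = new ArrayList<>();

        void process(Object row) {
            // Check the input type first so batches never take the row-mode path.
            if (row instanceof Batch) {
                forwardBatch((Batch) row);
                return;
            }
            forwardRow(row);
        }

        // Batch-aware path: the natural place to restore per-batch state between outputs.
        void forwardBatch(Batch batch) {
            trace.add("batch:" + batch.size);
        }

        // Row-mode path: plain forward, nothing to restore.
        void forwardRow(Object row) {
            trace.add("row");
        }
    }

    /** Sends one batch and one plain row through the operator and returns the trace. */
    static List<String> demo() {
        PassThroughOperator op = new PassThroughOperator();
        op.process(new Batch(3));
        op.process(new Object[] {"238", "val_238"});
        return op.trace;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [batch:3, row]
    }
}
```

Keeping the check at the very top means the row-mode path stays untouched, so non-vectorized plans should behave exactly as before.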




was (Author: mmccline):
Great work!

I think you need to add some code to TableScanOperator -- it handles VectorizedRowBatch as
pass-through, too.  It has a forward call in it.  Probably add an instanceof check at beginning
of method and use it.

And, LLAP drives in VRBs, too.  Not sure where at the moment.



> Incorrect result with vectorization and SharedWorkOptimizer
> -----------------------------------------------------------
>
>                 Key: HIVE-17073
>                 URL: https://issues.apache.org/jira/browse/HIVE-17073
>             Project: Hive
>          Issue Type: Bug
>          Components: Vectorization
>    Affects Versions: 3.0.0
>            Reporter: Jesus Camacho Rodriguez
>            Assignee: Jesus Camacho Rodriguez
>         Attachments: HIVE-17073.01.patch, HIVE-17073.patch
>
>
> We get incorrect results with vectorization and the multi-output Select operator created by
> SharedWorkOptimizer. It can be reproduced in the following way.
> {code:title=Correct}
> select count(*) as h8_30_to_9
>   from src
>   join src1 on src.key = src1.key
>   where src1.value = "val_278";
> OK
> 2
> {code}
> {code:title=Correct}
> select count(*) as h9_to_9_30
>   from src
>   join src1 on src.key = src1.key
>   where src1.value = "val_255";
> OK
> 2
> {code}
> {code:title=Incorrect}
> select * from (
>   select count(*) as h8_30_to_9
>   from src
>   join src1 on src.key = src1.key
>   where src1.value = "val_278") s1
> join (
>   select count(*) as h9_to_9_30
>   from src
>   join src1 on src.key = src1.key
>   where src1.value = "val_255") s2;
> OK
> 2	0
> {code}
> The problem seems to be that some data structures in the row batch need to be re-initialized
> after the batch has been forwarded to each output.
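
A toy model of the failure mode described above, as a sketch under stated assumptions rather than Hive's implementation. The Batch class, its size/selectedInUse/selected fields, and the filterCount helper are hypothetical and only loosely modeled on how a vectorized filter narrows a batch in place. When one shared batch is handed to two outputs in turn, the second output sees the rows left over by the first unless the batch state is restored in between.

```java
import java.util.Arrays;

public class SharedBatchSketch {
    /** Hypothetical batch: `selected` lists the live row indexes when selectedInUse is true. */
    static final class Batch {
        final String[] values;
        int size;
        boolean selectedInUse;
        int[] selected;

        Batch(String... values) {
            this.values = values;
            this.size = values.length;
            this.selected = new int[values.length];
        }
    }

    /** Filters the batch in place and returns how many rows survive. */
    static int filterCount(Batch b, String wanted) {
        int kept = 0;
        for (int i = 0; i < b.size; i++) {
            int row = b.selectedInUse ? b.selected[i] : i;
            if (b.values[row].equals(wanted)) {
                b.selected[kept++] = row;
            }
        }
        b.selectedInUse = true;
        b.size = kept;
        return kept;
    }

    /** Forwards the same batch to two outputs without restoring its state. */
    static int[] forwardWithoutReset(Batch b, String first, String second) {
        int c1 = filterCount(b, first);
        int c2 = filterCount(b, second); // sees only the rows kept by the first output
        return new int[] {c1, c2};
    }

    /** Forwards the same batch to two outputs, restoring its state in between. */
    static int[] forwardWithReset(Batch b, String first, String second) {
        int savedSize = b.size;
        boolean savedInUse = b.selectedInUse;
        int[] savedSelected = Arrays.copyOf(b.selected, b.selected.length);

        int c1 = filterCount(b, first);

        // Re-initialize the state the first output changed.
        b.size = savedSize;
        b.selectedInUse = savedInUse;
        System.arraycopy(savedSelected, 0, b.selected, 0, savedSelected.length);

        int c2 = filterCount(b, second);
        return new int[] {c1, c2};
    }

    /** Made-up sample data: two rows for each of the two values of interest. */
    static Batch sample() {
        return new Batch("val_278", "val_255", "val_278", "val_255", "val_98");
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(forwardWithoutReset(sample(), "val_278", "val_255"))); // [2, 0]
        System.out.println(Arrays.toString(forwardWithReset(sample(), "val_278", "val_255")));    // [2, 2]
    }
}
```

On this made-up data the unrestored forward yields 2 and 0, the same shape as the incorrect result in the report, while the restored forward yields 2 and 2. Whether the real fix should save and restore, or recompute the state per output, is a design choice this sketch does not settle.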



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
