hive-dev mailing list archives

From "Ashutosh Chauhan (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-5817) column name to index mapping in VectorizationContext is broken
Date Tue, 26 Nov 2013 02:11:35 GMT

    [ https://issues.apache.org/jira/browse/HIVE-5817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13832197#comment-13832197 ]

Ashutosh Chauhan commented on HIVE-5817:
----------------------------------------

Did some testing and found that the test case vectorization_part_project.q fails after applying
this patch, with the following stack trace:
{code}
java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row
        at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:181)
        at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
        at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:436)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
        at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:212)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row
        at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:45)
        at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:163)
        ... 4 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Error evaluating (cdouble + 2)
        at org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.processOp(VectorSelectOperator.java:117)
        at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:489)
        at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:827)
        at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:91)
        at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:489)
        at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:827)
        at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:43)
        ... 5 more
Caused by: java.lang.ArrayIndexOutOfBoundsException: 13
        at org.apache.hadoop.hive.ql.exec.vector.expressions.gen.DoubleColAddLongScalar.evaluate(DoubleColAddLongScalar.java:57)
        at org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.processOp(VectorSelectOperator.java:115)
{code}

> column name to index mapping in VectorizationContext is broken
> --------------------------------------------------------------
>
>                 Key: HIVE-5817
>                 URL: https://issues.apache.org/jira/browse/HIVE-5817
>             Project: Hive
>          Issue Type: Bug
>          Components: Vectorization
>            Reporter: Sergey Shelukhin
>            Assignee: Remus Rusanu
>            Priority: Critical
>         Attachments: HIVE-5817-uniquecols.broken.patch, HIVE-5817.00-broken.patch, HIVE-5817.4.patch
>
>
> Columns coming from different operators may have the same internal names ("_colNN"). There exists a query of the form {{select b.cb, a.ca from a JOIN b ON ... JOIN x ON ...;}} (distilled from a more complex query), which runs fine without vectorization. With vectorization, it runs fine for most ca, but for some ca it fails (or can probably return incorrect results). That is because, when building the column-to-VRG-index map in VectorizationContext, the internal column name for ca that the first map join operator adds to the mapping may be the same as the internal name for cb that the second one tries to add. The second VMJ doesn't add it (see the code in the constructor), and when it is time for it to produce output, it retrieves the wrong index from the map by name, and then the wrong vector from the VRG.
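The collision described above can be illustrated with a small, self-contained sketch. This is not Hive's VectorizationContext code: the class ColumnMapSketch, its register and lookup methods, and the sample values are all hypothetical, and the "add only if the name is absent" behaviour is an assumption based on the description ("2nd VMJ doesn't add it").

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a name-to-vector-index map; NOT Hive's actual class.
final class ColumnMapSketch {
    private final Map<String, Integer> columnMap = new HashMap<>();

    // Registers a column only if its internal name is not already present
    // (assumed behaviour). Returns true if the entry was added.
    boolean register(String internalName, int vectorIndex) {
        return columnMap.putIfAbsent(internalName, vectorIndex) == null;
    }

    // Returns the index stored for the name, or -1 if the name is unknown.
    int lookup(String internalName) {
        return columnMap.getOrDefault(internalName, -1);
    }

    public static void main(String[] args) {
        // Made-up stand-ins for the vectors in a row batch.
        String[] batch = {"values of a.ca", "values of b.cb"};

        ColumnMapSketch map = new ColumnMapSketch();
        // First map join registers ca under the internal name "_col0".
        map.register("_col0", 0);
        // Second map join wants cb at index 1, but its internal name is
        // also "_col0", so the entry is silently skipped.
        boolean added = map.register("_col0", 1);

        // The second join now resolves "_col0" to ca's vector, not cb's.
        System.out.println("added = " + added);              // added = false
        System.out.println(batch[map.lookup("_col0")]);      // values of a.ca
    }
}
```

In this sketch the second operator silently reads the first operator's vector. If a stale index instead points past the end of the batch's column array, the lookup would fail with an ArrayIndexOutOfBoundsException rather than return wrong data.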



--
This message was sent by Atlassian JIRA
(v6.1#6144)
