hive-dev mailing list archives

From "Furcy Pin (JIRA)" <>
Subject [jira] [Created] (HIVE-11933) transactional table + vectorization + where = bug
Date Wed, 23 Sep 2015 15:59:04 GMT
Furcy Pin created HIVE-11933:

             Summary: transactional table + vectorization + where = bug
                 Key: HIVE-11933
             Project: Hive
          Issue Type: Bug
    Affects Versions: 1.1.0
            Reporter: Furcy Pin

We bumped into a bug when using vectorization on a transactional table.
Here is a minimal example:

DROP TABLE IF EXISTS vectorization_transactional_test ;
CREATE TABLE vectorization_transactional_test (
    id INT
)
CLUSTERED BY (id) INTO 3 BUCKETS
TBLPROPERTIES('transactional'='true') ;

INSERT INTO TABLE vectorization_transactional_test values 

SET hive.vectorized.execution.enabled=true ;

FROM vectorization_transactional_test
WHERE id = 1

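The statements above appear to have lost a few lines in this archive copy: the inserted values and the SELECT clause are missing. A possible complete form is sketched below. The STORED AS ORC clause is added because Hive ACID tables require ORC storage. The inserted row (1) and the SELECT 1 projection are assumptions, inferred from the "WHERE id = 1" filter and from the "Error evaluating 1" message in the stack trace; they are not text recovered from the original report.

```sql
DROP TABLE IF EXISTS vectorization_transactional_test;

-- Hive ACID tables must be bucketed and stored as ORC.
CREATE TABLE vectorization_transactional_test (
    id INT
)
CLUSTERED BY (id) INTO 3 BUCKETS
STORED AS ORC
TBLPROPERTIES ('transactional'='true');

-- Assumed row: the original values are missing from the archive.
INSERT INTO TABLE vectorization_transactional_test VALUES (1);

SET hive.vectorized.execution.enabled=true;

-- Assumed projection: a constant, consistent with "Error evaluating 1".
SELECT 1
FROM vectorization_transactional_test
WHERE id = 1;
```

Running this also assumes a session already configured for transactions (for example hive.support.concurrency=true and hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager); those settings are not part of the original report.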
With vectorization enabled, the last query fails with an ArrayIndexOutOfBoundsException.
Here is the full stack trace:

FATAL [main] org.apache.hadoop.hive.ql.metadata.HiveException:
Hive Runtime Error while processing row 
    at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(
    at org.apache.hadoop.mapred.MapTask.runOldMapper(
    at org.apache.hadoop.mapred.YarnChild$
    at Method)
    at org.apache.hadoop.mapred.YarnChild.main(
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Error evaluating 1
    at org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.processOp(
    at org.apache.hadoop.hive.ql.exec.Operator.forward(
    at org.apache.hadoop.hive.ql.exec.vector.VectorFilterOperator.processOp(
    at org.apache.hadoop.hive.ql.exec.Operator.forward(
    at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(
    at org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(
    at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(
    ... 9 more
Caused by: java.lang.ArrayIndexOutOfBoundsException: 1
    at org.apache.hadoop.hive.ql.exec.vector.expressions.ConstantVectorExpression.evaluateLong(
    at org.apache.hadoop.hive.ql.exec.vector.expressions.ConstantVectorExpression.evaluate(
    at org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.processOp(
    ... 15 more

Of course, disabling vectorization (or making the table non-transactional) makes the bug disappear.
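
As a sketch of that workaround, vectorization can be switched off for the current session before rerunning the failing query. The property is the same one set in the example above, and this only sidesteps the problem rather than fixing it.

```sql
-- Fall back to row-at-a-time execution for this session only.
SET hive.vectorized.execution.enabled=false;
```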

More annoyingly, when the table is used in a JOIN, the job does not fail but silently returns a wrong
result: for instance an empty result set, whereas the same query with vectorization disabled returns
a non-empty one. This behavior is harder to reproduce with a minimal example.

We experienced this bug in version 1.1.0-cdh5.4.2.

I was not able to reproduce this bug on a local build of Hive 1.2.0
because I could not get transactions to work correctly there.
I guess they only work in pseudo-distributed mode and not in local mode.

This message was sent by Atlassian JIRA
