hive-user mailing list archives

From Alan Gates <alanfga...@gmail.com>
Subject Re: transactional table + vectorization + where = bug
Date Mon, 21 Sep 2015 16:37:35 GMT
I am not aware of this issue.  Please file a JIRA, and if it does turn 
out to be a duplicate we can mark it as such.

Alan.

> Furcy Pin <mailto:furcy.pin@flaminem.com>
> September 19, 2015 at 2:36
> Hi,
>
>
> We bumped into a bug when using vectorization on a transactional table.
> Here is a minimal example:
>
> DROP TABLE IF EXISTS vectorization_transactional_test ;
> CREATE TABLE vectorization_transactional_test (
>     id INT
> )
> CLUSTERED BY (id) into 3 buckets
> STORED AS ORC
> TBLPROPERTIES('transactional'='true') ;
>
> INSERT INTO TABLE vectorization_transactional_test values
> (1)
> ;
>
> SET hive.vectorized.execution.enabled=true ;
>
> SELECT
> *
> FROM vectorization_transactional_test
> WHERE id = 1
> ;
>
> With vectorization enabled, the last query fails with an
> ArrayIndexOutOfBoundsException in the mappers.
> Here is the full stack:
>
> FATAL [main] org.apache.hadoop.hive.ql.exec.mr.ExecMapper: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row
>     at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:52)
>     at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:170)
>     at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
>     at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
>     at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:415)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
>     at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Error evaluating 1
>     at org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.processOp(VectorSelectOperator.java:126)
>     at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)
>     at org.apache.hadoop.hive.ql.exec.vector.VectorFilterOperator.processOp(VectorFilterOperator.java:111)
>     at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)
>     at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:95)
>     at org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:157)
>     at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:45)
>     ... 9 more
> Caused by: java.lang.ArrayIndexOutOfBoundsException: 1
>     at org.apache.hadoop.hive.ql.exec.vector.expressions.ConstantVectorExpression.evaluateLong(ConstantVectorExpression.java:102)
>     at org.apache.hadoop.hive.ql.exec.vector.expressions.ConstantVectorExpression.evaluate(ConstantVectorExpression.java:150)
>     at org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.processOp(VectorSelectOperator.java:124)
>     ... 15 more
>
>
> Of course, disabling vectorization removes the bug.
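>
> As a rough sketch of that workaround (assuming a session-level
> setting is acceptable for the workload), vectorization can be
> switched off just before the affected query and restored afterwards:
>
> ```sql
> -- Workaround sketch: disable vectorized execution for this session only.
> SET hive.vectorized.execution.enabled=false ;
>
> SELECT
> *
> FROM vectorization_transactional_test
> WHERE id = 1
> ;
>
> -- Re-enable afterwards if other queries in the session rely on it.
> SET hive.vectorized.execution.enabled=true ;
> ```
>
> This only sidesteps the failure; it does not address the underlying
> bug.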
>
> More annoyingly, when the table is used in a JOIN, the job doesn't
> fail but returns a wrong result instead: for instance an empty
> table, whereas the same query with vectorization disabled returns a
> non-empty one. This behavior is harder to reproduce with a minimal
> example.
>
> We experienced this bug in version 1.1.0-cdh5.4.2.
>
> We didn't find any JIRA related to this. Is it a known bug, or should
> we create a new JIRA?
>
> Best,
>
> Furcy
