hive-dev mailing list archives

From "Aleksei (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-5970) ArrayIndexOutOfBoundsException in RunLengthIntegerReaderV2.java
Date Fri, 06 Dec 2013 21:43:35 GMT

    [ https://issues.apache.org/jira/browse/HIVE-5970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13841743#comment-13841743 ]

Aleksei commented on HIVE-5970:
-------------------------------

My findings point to a problem in the run-length encoding.
You can reproduce the problem with the following steps:
1. Create the table:
{code:sql}
CREATE TABLE test_orc_format(
  site STRING,
  a DOUBLE,
  b BIGINT,
  c BIGINT,
  d BIGINT,
  e DOUBLE,
  f DOUBLE,
  g DOUBLE,
  h DOUBLE,
  i DOUBLE,
  j DOUBLE,
  k BIGINT,
  l BIGINT,
  m BIGINT,
  n BIGINT,
  o BIGINT,
  p BIGINT,
  q ARRAY<DOUBLE>,
  r ARRAY<DOUBLE>
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t'
COLLECTION ITEMS TERMINATED BY ','
STORED AS ORC
;
{code}
2. Load the data from the attached file:
{code:sql}
LOAD DATA LOCAL INPATH 'test_data' OVERWRITE INTO TABLE test_orc_format;
{code}
3. Run either of the following queries:
{code:sql}
select * from test_orc_format;
select o from test_orc_format;
{code}

Note that the attached file was produced by Hive during a job execution and was not crafted by hand,
so the file itself may have been encoded incorrectly. Also note that the query that calculates column
"o" cannot produce negative results.
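
One possible way to check whether the new integer encoding is involved is to rewrite the data with the older ORC file version and retry the query. The following is only a sketch under assumptions: it assumes that the hive.exec.orc.write.format setting is honored by the ORC writer in this build, and "source_table" is a hypothetical stand-in for whatever input the original job read from (it is not part of this issue).
{code:sql}
-- Assumption: this setting makes the writer produce 0.11-style files,
-- which are not read through RunLengthIntegerReaderV2.
SET hive.exec.orc.write.format=0.11;

-- Hypothetical: re-run the job that produced the data.
-- "source_table" is a placeholder for the real input of that job.
INSERT OVERWRITE TABLE test_orc_format
SELECT * FROM source_table;

-- Retry the query that failed before.
SELECT o FROM test_orc_format;
{code}
If the query succeeds on the rewritten data, that would be consistent with the problem being in the new encoding, but it would not prove it.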

> ArrayIndexOutOfBoundsException in RunLengthIntegerReaderV2.java
> ---------------------------------------------------------------
>
>                 Key: HIVE-5970
>                 URL: https://issues.apache.org/jira/browse/HIVE-5970
>             Project: Hive
>          Issue Type: Bug
>          Components: File Formats
>    Affects Versions: 0.12.0
>            Reporter: Eric Chu
>            Priority: Critical
>              Labels: orcfile
>         Attachments: test_data
>
>
> A workload involving ORC tables starts getting the following ArrayIndexOutOfBoundsException AFTER the upgrade to Hive 0.12. The file is added as part of HIVE-4123.
> 2013-12-04 14:42:08,537 ERROR 
> cause:java.io.IOException: java.io.IOException: java.lang.ArrayIndexOutOfBoundsException: 0
> 2013-12-04 14:42:08,537 WARN org.apache.hadoop.mapred.Child: Error running child
> java.io.IOException: java.io.IOException: java.lang.ArrayIndexOutOfBoundsException: 0
>         at org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121)
>         at org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77)
>         at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.doNextWithExceptionHandler(HadoopShimsSecure.java:304)
>         at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.next(HadoopShimsSecure.java:220)
>         at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:215)
>         at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:200)
>         at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:48)
>         at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:417)
>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:332)
>         at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:396)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
>         at org.apache.hadoop.mapred.Child.main(Child.java:262)
> Caused by: java.io.IOException: java.lang.ArrayIndexOutOfBoundsException: 0
>         at org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121)
>         at org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77)
>         at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:276)
>         at org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:101)
>         at org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:41)
>         at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:108)
>         at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.doNextWithExceptionHandler(HadoopShimsSecure.java:302)
>         ... 11 more
> Caused by: java.lang.ArrayIndexOutOfBoundsException: 0
>         at org.apache.hadoop.hive.ql.io.orc.RunLengthIntegerReaderV2.readPatchedBaseValues(RunLengthIntegerReaderV2.java:171)
>         at org.apache.hadoop.hive.ql.io.orc.RunLengthIntegerReaderV2.readValues(RunLengthIntegerReaderV2.java:54)
>         at org.apache.hadoop.hive.ql.io.orc.RunLengthIntegerReaderV2.next(RunLengthIntegerReaderV2.java:287)
>         at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl$LongTreeReader.next(RecordReaderImpl.java:473)
>         at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl$StructTreeReader.next(RecordReaderImpl.java:1157)
>         at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.next(RecordReaderImpl.java:2196)
>         at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$OrcRecordReader.next(OrcInputFormat.java:129)
>         at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$OrcRecordReader.next(OrcInputFormat.java:80)
>         at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:274)
>         ... 15 more



--
This message was sent by Atlassian JIRA
(v6.1#6144)
