hive-issues mailing list archives

From "Sankar Hariappan (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HIVE-21966) Llap external client - Arrow Serializer throws ArrayIndexOutOfBoundsException in some cases
Date Tue, 09 Jul 2019 03:56:00 GMT

     [ https://issues.apache.org/jira/browse/HIVE-21966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sankar Hariappan updated HIVE-21966:
------------------------------------
       Resolution: Fixed
    Fix Version/s: 4.0.0
           Status: Resolved  (was: Patch Available)

2.patch is committed to master!
Thanks [~ShubhamChaurasia] for the contribution!

> Llap external client - Arrow Serializer throws ArrayIndexOutOfBoundsException in some cases
> -------------------------------------------------------------------------------------------
>
>                 Key: HIVE-21966
>                 URL: https://issues.apache.org/jira/browse/HIVE-21966
>             Project: Hive
>          Issue Type: Bug
>          Components: llap, Serializers/Deserializers
>    Affects Versions: 3.1.1
>            Reporter: Shubham Chaurasia
>            Assignee: Shubham Chaurasia
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 4.0.0
>
>         Attachments: HIVE-21966.1.patch, HIVE-21966.2.patch
>
>          Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> When a query is submitted through llap-ext-client, the Arrow serializer throws ArrayIndexOutOfBoundsException when conditions 1), 2) and 3) below are all satisfied.
> 1) {{hive.vectorized.execution.filesink.arrow.native.enabled=true}}, so that the Arrow serializer code path is taken.
> 2) Query contains a filter or limit clause which enforces {{VectorizedRowBatch#selectedInUse=true}}
> 3) Projection involves a column of type {{MultiValuedColumnVector}}.
> Sample stacktrace:
> {code}
> Caused by: java.lang.ArrayIndexOutOfBoundsException: 150
> 	at org.apache.hadoop.hive.ql.io.arrow.Serializer.writeGeneric(Serializer.java:679)
> 	at org.apache.hadoop.hive.ql.io.arrow.Serializer.writePrimitive(Serializer.java:518)
> 	at org.apache.hadoop.hive.ql.io.arrow.Serializer.write(Serializer.java:276)
> 	at org.apache.hadoop.hive.ql.io.arrow.Serializer.writeStruct(Serializer.java:342)
> 	at org.apache.hadoop.hive.ql.io.arrow.Serializer.write(Serializer.java:282)
> 	at org.apache.hadoop.hive.ql.io.arrow.Serializer.writeList(Serializer.java:365)
> 	at org.apache.hadoop.hive.ql.io.arrow.Serializer.write(Serializer.java:279)
> 	at org.apache.hadoop.hive.ql.io.arrow.Serializer.serializeBatch(Serializer.java:199)
> 	at org.apache.hadoop.hive.ql.exec.vector.filesink.VectorFileSinkArrowOperator.process(VectorFileSinkArrowOperator.java:135)
> 	... 30 more
> {code}
> It can be reproduced by running the following from beeline:
> {code}
> CREATE TABLE complex_tbl(c1 array<struct<f1:string,f2:string>>) STORED AS ORC;
> INSERT INTO complex_tbl SELECT ARRAY(NAMED_STRUCT('f1','v11', 'f2','v21'), NAMED_STRUCT('f1','v21', 'f2','v22'));
> {code}
> and then running the query {{select * from complex_tbl limit 1}} through llap-ext-client.
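
As an illustration of conditions 2) and 3), the sketch below shows how a selection array maps logical rows to physical rows for a list column, and one way a row-level selection can raise ArrayIndexOutOfBoundsException if it is also applied to child element indices. It is a simplified sketch, not the Hive code or the committed patch: the Batch class, its sample data and both read methods are hypothetical stand-ins, loosely modeled on the selected/selectedInUse/size fields of VectorizedRowBatch and the offsets/lengths/child layout of ListColumnVector. Whether this is the exact defect fixed in HIVE-21966 is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

final class SelectedRowsSketch {

    // Hypothetical, simplified stand-in for a batch holding one list column.
    static final class Batch {
        int size;              // number of live (logical) rows
        boolean selectedInUse; // true when only some physical rows are live
        int[] selected;        // physical row index of each live row
        long[] offsets;        // per physical row: start position in child
        long[] lengths;        // per physical row: number of list elements
        String[] child;        // flattened elements of all lists
    }

    // Four physical rows of two elements each; a filter kept rows 1 and 3.
    static Batch sampleBatch() {
        Batch b = new Batch();
        b.size = 2;
        b.selectedInUse = true;
        b.selected = new int[] {1, 3};
        b.offsets = new long[] {0, 2, 4, 6};
        b.lengths = new long[] {2, 2, 2, 2};
        b.child = new String[] {"a0", "a1", "b0", "b1", "c0", "c1", "d0", "d1"};
        return b;
    }

    // Selection is applied to row indices only; child elements are
    // addressed directly through offsets and lengths.
    static List<List<String>> readRows(Batch b) {
        List<List<String>> out = new ArrayList<>();
        for (int i = 0; i < b.size; i++) {
            int row = b.selectedInUse ? b.selected[i] : i;
            int start = (int) b.offsets[row];
            int len = (int) b.lengths[row];
            List<String> elems = new ArrayList<>();
            for (int j = 0; j < len; j++) {
                elems.add(b.child[start + j]);
            }
            out.add(elems);
        }
        return out;
    }

    // Hypothetical faulty variant: the row-level selection array is also
    // used to translate child element indices, which are not row indices.
    static List<List<String>> readRowsMisapplied(Batch b) {
        List<List<String>> out = new ArrayList<>();
        for (int i = 0; i < b.size; i++) {
            int row = b.selectedInUse ? b.selected[i] : i;
            int start = (int) b.offsets[row];
            int len = (int) b.lengths[row];
            List<String> elems = new ArrayList<>();
            for (int j = 0; j < len; j++) {
                int k = start + j;
                int idx = b.selectedInUse ? b.selected[k] : k; // may overrun selected[]
                elems.add(b.child[idx]);
            }
            out.add(elems);
        }
        return out;
    }

    public static void main(String[] args) {
        Batch b = sampleBatch();
        System.out.println(readRows(b)); // prints [[b0, b1], [d0, d1]]
        try {
            readRowsMisapplied(b);
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("misapplied selection failed: " + e.getMessage());
        }
    }
}
```

In this sketch the failure appears only when selectedInUse is true and the column has child elements, which matches the shape of the three conditions in the report.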



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
