hive-dev mailing list archives

From "Thejas M Nair (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HIVE-3253) ArrayIndexOutOfBounds exception for deeply nested structs
Date Wed, 03 Jul 2013 20:53:20 GMT

     [ https://issues.apache.org/jira/browse/HIVE-3253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Thejas M Nair updated HIVE-3253:
--------------------------------

    Release Note: 
This change increases the number of levels of nesting supported in Hive select queries. The
serialization format used by the File Output Operator for these queries (LazySimpleSerde)
previously restricted nesting to 8 levels; this has now been extended to 24. The extended
nesting is turned on by default.

This change also increases the number of levels of nesting that you can use with tables that
use LazySimpleSerde. It uses additional control characters as delimiters. This means that
your data must not contain these characters, or you need to escape them. As this
change introduces a new requirement on the way data has been written, it is not backward
compatible. Hence it is not enabled by default. To enable it, you need to set the serde
property hive.serialization.extend.nesting.levels to true.

See the 'ESCAPED BY' documentation for CREATE TABLE to learn how to enable escaping of the
delimiter characters: https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL
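
To illustrate why data containing a delimiter character has to be escaped, here is a minimal Java sketch. It is not Hive's LazySimpleSerde code: the class name DelimiterEscapeSketch, the separator byte 0x04, and the backslash escape character are all assumptions chosen for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; not taken from Hive. Separator and escape
// characters are assumptions picked for illustration only.
final class DelimiterEscapeSketch {

    // Splits a row on `sep`. An `esc` character makes the next
    // character literal, so an escaped separator stays inside its field.
    static List<String> split(String row, char sep, char esc) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < row.length(); i++) {
            char c = row.charAt(i);
            if (c == esc && i + 1 < row.length()) {
                current.append(row.charAt(++i)); // take next char literally
            } else if (c == sep) {
                fields.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString());
        return fields;
    }

    // Prefixes every separator or escape character in `value` with `esc`.
    static String escape(String value, char sep, char esc) {
        StringBuilder out = new StringBuilder();
        for (char c : value.toCharArray()) {
            if (c == sep || c == esc) {
                out.append(esc);
            }
            out.append(c);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        char sep = '\u0004'; // assumed control-character delimiter
        char esc = '\\';     // assumed escape character
        String value = "left" + sep + "right"; // data that contains the delimiter

        // Unescaped: the embedded delimiter splits one value into two fields.
        System.out.println(split(value + sep + "b", sep, esc).size()); // prints 3
        // Escaped: the value survives as a single field.
        System.out.println(split(escape(value, sep, esc) + sep + "b", sep, esc).size()); // prints 2
    }
}
```

In this sketch the unescaped row is read back with one field too many, which is the kind of silent corruption the release note warns about for existing data.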

Adding release note.
                
> ArrayIndexOutOfBounds exception for deeply nested structs
> ---------------------------------------------------------
>
>                 Key: HIVE-3253
>                 URL: https://issues.apache.org/jira/browse/HIVE-3253
>             Project: Hive
>          Issue Type: Bug
>          Components: Serializers/Deserializers
>    Affects Versions: 0.9.0, 0.10.0
>            Reporter: Swarnim Kulkarni
>            Assignee: Thejas M Nair
>             Fix For: 0.12.0
>
>         Attachments: HIVE-3253.2.patch, HIVE-3253.3.patch, HIVE-3253_moar_nesting.1.patch, jsonout.hive
>
>
> It was observed that creating a table with deeply nested structs might throw this
> exception:
> {code}
> java.lang.ArrayIndexOutOfBoundsException: 9
>         at org.apache.hadoop.hive.serde2.lazy.LazyFactory.createLazyObjectInspector(LazyFactory.java:281)
> 	at org.apache.hadoop.hive.serde2.lazy.LazyFactory.createLazyObjectInspector(LazyFactory.java:263)
> 	at org.apache.hadoop.hive.serde2.lazy.LazyFactory.createLazyObjectInspector(LazyFactory.java:276)
> 	at org.apache.hadoop.hive.serde2.lazy.LazyFactory.createLazyObjectInspector(LazyFactory.java:263)
> 	at org.apache.hadoop.hive.serde2.lazy.LazyFactory.createLazyObjectInspector(LazyFactory.java:276)
> 	at org.apache.hadoop.hive.serde2.lazy.LazyFactory.createLazyObjectInspector(LazyFactory.java:263)
> 	at org.apache.hadoop.hive.serde2.lazy.LazyFactory.createLazyObjectInspector(LazyFactory.java:276)
> 	at org.apache.hadoop.hive.serde2.lazy.LazyFactory.createLazyObjectInspector(LazyFactory.java:263)
> 	at org.apache.hadoop.hive.serde2.lazy.LazyFactory.createLazyObjectInspector(LazyFactory.java:276)
> 	at org.apache.hadoop.hive.serde2.lazy.LazyFactory.createLazyStructInspector(LazyFactory.java:354)
> {code}
> The reason is that the separators array is currently hardcoded to size 8 in
> LazySimpleSerde:
> {code}
> // Read the separators: We use 8 levels of separators by default, but we
> // should change this when we allow users to specify more than 10 levels
> // of separators through DDL.
> serdeParams.separators = new byte[8];
> {code}
> If possible, we should increase this size, or at least make it configurable, to properly
> handle deeply nested structs.
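
The failure mode described above can be sketched in a few lines of Java. This is a simplified illustration, not the actual LazyFactory code: the class SeparatorDepthSketch and its methods are hypothetical, and it assumes one separator is consumed per nesting level, which is simpler than what Hive does for maps.

```java
// Hypothetical sketch; not the actual Hive LazyFactory implementation.
// Assumes exactly one separator is read per nesting level.
final class SeparatorDepthSketch {

    // Recursively descends `depth` nesting levels, reading one separator
    // per level. Returns the number of levels visited.
    static int descend(byte[] separators, int level, int depth) {
        if (level == depth) {
            return level;
        }
        // Throws ArrayIndexOutOfBoundsException once level >= separators.length.
        byte separator = separators[level];
        return descend(separators, level + 1, depth);
    }

    // Reports whether a type nested `depth` levels deep can be handled
    // with `separatorCount` separators.
    static boolean fits(int separatorCount, int depth) {
        try {
            descend(new byte[separatorCount], 0, depth);
            return true;
        } catch (ArrayIndexOutOfBoundsException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(fits(8, 8));   // prints true
        System.out.println(fits(8, 10));  // prints false
        System.out.println(fits(24, 10)); // prints true
    }
}
```

With a fixed array of 8, any type that needs a ninth separator fails during inspector creation rather than at query time, which matches where the stack trace above originates.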

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
