hive-dev mailing list archives

From "Thejas M Nair (JIRA)" <>
Subject [jira] [Commented] (HIVE-9500) Support nested structs over 24 levels.
Date Tue, 10 Feb 2015 20:22:12 GMT


Thejas M Nair commented on HIVE-9500:

Sorry about the delay in getting back. I don't think replacing the Java array used for the mapping
with a HashMap is reasonable, in terms of performance.
This lookup sits in a very tight loop. Anything that gets called for every record is considered
part of a tight loop, and this is actually called for each char, inside a loop over records.
So it's actually a tight loop within a tight loop. We have to be sensitive about performance
in this case.

The performance overhead of using a HashMap instead of a native array should be obvious. For one, HashMap
requires Objects instead of primitive types, so the memory footprint and overhead
are going to be larger. The memory of the data structure is not contiguous, and several
lookups are needed to determine whether a given char needs to be escaped. This will
result in CPU overhead as well as CPU cache misses.
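
To make the comparison concrete, here is a minimal sketch of the two lookup styles. It is not the code from the patch or from LazySimpleSerDe; the class and method names are hypothetical, and it assumes the escape table is keyed by single byte values.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch; names are illustrative and not taken from Hive. */
final class EscapeLookupSketch {

    /** Array variant: one primitive flag per byte value, stored contiguously. */
    static boolean[] buildArrayTable(byte[] charsToEscape) {
        boolean[] table = new boolean[256];
        for (byte b : charsToEscape) {
            table[b & 0xFF] = true; // mask so negative bytes map to 128..255
        }
        return table;
    }

    /** Map variant: keys and values are objects reached through references. */
    static Map<Byte, Boolean> buildMapTable(byte[] charsToEscape) {
        Map<Byte, Boolean> table = new HashMap<>();
        for (byte b : charsToEscape) {
            table.put(b, Boolean.TRUE);
        }
        return table;
    }

    /** Inner loop with the array: a single indexed load per byte. */
    static int countEscapesWithArray(byte[] record, boolean[] table) {
        int count = 0;
        for (byte b : record) {
            if (table[b & 0xFF]) {
                count++;
            }
        }
        return count;
    }

    /** Inner loop with the map: box the byte, hash it, probe a bucket, compare keys. */
    static int countEscapesWithMap(byte[] record, Map<Byte, Boolean> table) {
        int count = 0;
        for (byte b : record) {
            if (table.containsKey(b)) {
                count++;
            }
        }
        return count;
    }
}
```

Both variants give the same answer; the difference is the amount of work done per char. The array path is one bounds-checked load from a 256-entry block, while the map path goes through boxing, hashing, and at least one pointer dereference for every char of every record.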

> Support nested structs over 24 levels.
> --------------------------------------
>                 Key: HIVE-9500
>                 URL:
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Aihua Xu
>            Assignee: Aihua Xu
>              Labels: SerDe
>             Fix For: 1.2.0
>         Attachments: HIVE-9500.1.patch, HIVE-9500.2.patch, HIVE-9500.3.patch
> Customer has deeply nested avro structure and is receiving the following error when performing
> 15/01/09 20:59:29 ERROR ql.Driver: FAILED: SemanticException org.apache.hadoop.hive.serde2.SerDeException:
> Number of levels of nesting supported for LazySimpleSerde is 23 Unable to work with level
> Currently we support up to 24 levels of nested structs when hive.serialization.extend.nesting.levels
> is set to true, while the customers have the requirement to support more than that.
> It would be better to make the supported levels configurable or completely removed (i.e.,
> we can support any number of levels).

This message was sent by Atlassian JIRA
