hive-dev mailing list archives

From "Vikram Dixit K (JIRA)" <>
Subject [jira] [Updated] (HIVE-6711) ORC maps uses getMapSize() from MapOI which is unreliable
Date Mon, 24 Mar 2014 00:23:42 GMT


Vikram Dixit K updated HIVE-6711:

    Resolution: Fixed
        Status: Resolved  (was: Patch Available)

Committed to trunk. Thanks Prasanth!

> ORC maps uses getMapSize() from MapOI which is unreliable
> ---------------------------------------------------------
>                 Key: HIVE-6711
>                 URL:
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 0.11.0, 0.12.0, 0.13.0
>            Reporter: Prasanth J
>            Assignee: Prasanth J
>              Labels: orcfile
>             Fix For: 0.13.0, 0.14.0
>         Attachments: HIVE-6711.1.patch
> HIVE-6707 had issues with map size. getMapSize() of LazyMap and LazyBinaryMap does not
> deserialize the keys, so it does not count the number of unique keys. Because getMapSize() may
> return a non-distinct count of keys, the map length stored by ORC's map tree writer can be
> out of sync with the actual map size. As a result, the RLE reader tries to read beyond the
> disk range, expecting more map entries, and throws an exception.
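
The mismatch described in the issue can be sketched in plain Java without any Hive classes. This is only an illustration under stated assumptions: MapSizeSketch, rawEntryCount() and materialize() are hypothetical names, not Hive or ORC APIs, and the serialized form is simplified to a list of key/value pairs. It shows how a size computed without deserializing keys can exceed the number of entries in the materialized map when a key repeats.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class MapSizeSketch {
    // Hypothetical stand-in for a lazy map's size: counts serialized
    // entries without deserializing or de-duplicating the keys.
    static int rawEntryCount(List<String[]> serialized) {
        return serialized.size();
    }

    // Materializes the map; a repeated key overwrites the earlier value,
    // so only distinct keys remain.
    static Map<String, String> materialize(List<String[]> serialized) {
        Map<String, String> map = new LinkedHashMap<>();
        for (String[] kv : serialized) {
            map.put(kv[0], kv[1]);
        }
        return map;
    }

    public static void main(String[] args) {
        List<String[]> serialized = new ArrayList<>();
        serialized.add(new String[] {"a", "1"});
        serialized.add(new String[] {"b", "2"});
        serialized.add(new String[] {"a", "3"}); // duplicate key

        int declaredLength = rawEntryCount(serialized);   // 3
        int entriesWritten = materialize(serialized).size(); // 2

        // A writer that records declaredLength but emits entriesWritten
        // entries leaves a reader expecting one entry that is not there.
        System.out.println("declared length: " + declaredLength);
        System.out.println("entries written: " + entriesWritten);
        System.out.println("reader over-read: " + (declaredLength - entriesWritten));
    }
}
```

In this sketch the fix direction is simply to record the size of the materialized map rather than the raw entry count; how the committed patch does this is not stated in the message.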
