hive-issues mailing list archives

From "Sergey Shelukhin (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-9660) store end offset of compressed data for RG in RowIndex in ORC
Date Mon, 23 May 2016 18:10:14 GMT

    [ https://issues.apache.org/jira/browse/HIVE-9660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15296784#comment-15296784 ]

Sergey Shelukhin commented on HIVE-9660:
----------------------------------------

[~owen.omalley] lots of ORC tests failed, and the failures may be related... also, it looks
like all the Tez tests got stuck; not sure if that's related or just HiveQA (they didn't get
stuck in other JIRAs, though)

> store end offset of compressed data for RG in RowIndex in ORC
> -------------------------------------------------------------
>
>                 Key: HIVE-9660
>                 URL: https://issues.apache.org/jira/browse/HIVE-9660
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Sergey Shelukhin
>            Assignee: Sergey Shelukhin
>         Attachments: HIVE-9660.01.patch, HIVE-9660.02.patch, HIVE-9660.03.patch, HIVE-9660.04.patch,
>                      HIVE-9660.05.patch, HIVE-9660.06.patch, HIVE-9660.07.patch, HIVE-9660.07.patch,
>                      HIVE-9660.08.patch, HIVE-9660.09.patch, HIVE-9660.10.patch, HIVE-9660.10.patch,
>                      HIVE-9660.11.patch, HIVE-9660.patch, HIVE-9660.patch, HIVE-9660.patch,
>                      owen-hive-9660.patch
>
>
> Right now the end offset is estimated, which in some cases results in tons of extra data
> being read.
> We can add a separate array to RowIndex (positions_v2?) that stores the number of compressed
> buffers for each RG, or the end offset, or something similar, to remove this estimation magic.
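
To make the idea in the description concrete, here is a minimal sketch of the difference between an estimated and a stored end offset for a row group (RG). It is not ORC's actual reader code: `RowGroupEntry`, `end_offset`, `estimated_end`, `read_range`, and the `slop` padding value are all hypothetical names and numbers chosen for illustration, and the estimation rule (next RG's start plus a fixed padding) is only an assumed stand-in for whatever the real reader does.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class RowGroupEntry:
    """Hypothetical, simplified index entry for one row group (RG)."""
    start_offset: int                 # offset in the compressed stream where the RG begins
    end_offset: Optional[int] = None  # assumed new field; None models a file written without it


def estimated_end(entries: List[RowGroupEntry], i: int,
                  stream_length: int, slop: int) -> int:
    """Guess where RG i ends: the next RG's start plus padding, capped at the stream end."""
    if i == len(entries) - 1:
        return stream_length
    return min(stream_length, entries[i + 1].start_offset + slop)


def read_range(entries: List[RowGroupEntry], i: int,
               stream_length: int, slop: int) -> Tuple[int, int]:
    """Return the (start, end) byte range to read for RG i.

    Uses the stored end offset when the entry has one and falls back
    to the estimate otherwise.
    """
    entry = entries[i]
    if entry.end_offset is not None:
        end = entry.end_offset
    else:
        end = estimated_end(entries, i, stream_length, slop)
    return entry.start_offset, end


if __name__ == "__main__":
    stream_length, slop = 5000, 1024  # illustrative values only
    with_end = [RowGroupEntry(0, 1000), RowGroupEntry(1000, 2400)]
    without_end = [RowGroupEntry(0), RowGroupEntry(1000)]
    print(read_range(with_end, 0, stream_length, slop))     # (0, 1000)
    print(read_range(without_end, 0, stream_length, slop))  # (0, 2024)
```

In this toy setup the estimate reads 1024 bytes more than necessary for the first RG, and for the last RG it would read to the end of the stream; storing the end offset removes both sources of over-read while the fallback keeps entries without the field readable.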



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
