hive-issues mailing list archives

From Sergio Peña (JIRA) <j...@apache.org>
Subject [jira] [Commented] (HIVE-14800) Handle off by 3 in ORC split generation based on split strategy used
Date Sun, 20 Nov 2016 17:33:58 GMT

    [ https://issues.apache.org/jira/browse/HIVE-14800?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15681524#comment-15681524 ]

Sergio Peña commented on HIVE-14800:
------------------------------------

I'm moving this JIRA out of the 2.1.1 release, as it is neither a blocker nor critical for a
2.1.1 RC. Feel free to commit it to branch 2.1 if the patch is ready before the release.

> Handle off by 3 in ORC split generation based on split strategy used
> --------------------------------------------------------------------
>
>                 Key: HIVE-14800
>                 URL: https://issues.apache.org/jira/browse/HIVE-14800
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Siddharth Seth
>
> BI will apparently generate splits starting at offset 0.
> ETL will skip the ORC header and generate a split starting at offset 3.
> There's a workaround in the HiveSplitGenerator to handle this for consistent splits.
> Ideally, ORC split generation should take care of this.
> cc [~prasanth_j], [~gopalv]
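
The mismatch described above can be illustrated with a minimal sketch. It is not Hive's actual workaround: the function name `normalize_split` and the sample split sizes are hypothetical, and the only fact assumed about the format is that an ORC file begins with the 3-byte magic "ORC". The idea is to map a split that starts inside the header (as BI reportedly produces) onto the same start offset that ETL uses, so both strategies describe the same byte range.

```python
# Illustrative sketch only; not code from Hive or ORC.
ORC_MAGIC = b"ORC"                # 3-byte magic at the start of an ORC file
ORC_HEADER_LEN = len(ORC_MAGIC)   # the "off by 3" in the issue title


def normalize_split(start, length):
    """Return (start, length) with any overlap with the ORC header removed.

    A split beginning at offset 0 (BI-style) is shifted to offset 3
    (ETL-style) and shortened so that its end offset is unchanged.
    """
    if start < ORC_HEADER_LEN:
        skipped = ORC_HEADER_LEN - start
        return ORC_HEADER_LEN, max(length - skipped, 0)
    return start, length


# Hypothetical splits covering the same 1000-byte file.
bi_split = normalize_split(0, 1000)
etl_split = normalize_split(3, 997)
print(bi_split == etl_split)
```

With this normalization, both strategies yield the same (start, length) pair for the first split of a file, which is the consistency the workaround is described as providing.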



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
