hive-dev mailing list archives

From "Eric Chu (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-6210) Default serde for RCFile has changed
Date Thu, 16 Jan 2014 04:23:19 GMT

    [ https://issues.apache.org/jira/browse/HIVE-6210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13873022#comment-13873022 ]

Eric Chu commented on HIVE-6210:
--------------------------------

Clarification:
1) I can set the serde for a partition. The error I got was caused by missing quotes around
the serde value.
2) The change comes from HIVE-4475.
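
As a rough illustration of point 1, the sketch below builds the two statements involved in pointing a partition back at RCFile data: SET FILEFORMAT RCFILE (which, per the issue description, switches the serde to LazyBinaryColumnarSerDe in Hive 12), followed by SET SERDE with the class name written as a quoted string literal. The table name `events`, the partition column `ds`, and the helper names are hypothetical and used only for illustration; the statements are only generated here, not run against Hive.

```python
# Minimal sketch: generate HiveQL for reverting one partition to RCFile with
# ColumnarSerDe. Table/partition names and helper names are hypothetical.
COLUMNAR_SERDE = "org.apache.hadoop.hive.serde2.columnar.ColumnarSerDe"


def quote_literal(value: str) -> str:
    """Wrap a value in single quotes, escaping backslashes and single quotes."""
    return "'" + value.replace("\\", "\\\\").replace("'", "\\'") + "'"


def partition_spec(spec: dict) -> str:
    """Render {'ds': '2014-01-15'} as ds='2014-01-15'."""
    return ", ".join(f"{col}={quote_literal(str(val))}" for col, val in spec.items())


def revert_partition_to_rcfile(table: str, spec: dict, serde: str = COLUMNAR_SERDE) -> list:
    """Return the statements in the order they should be issued.

    SET SERDE comes second because SET FILEFORMAT RCFILE changes the serde
    (as described in the issue). The serde class name is always quoted.
    """
    part = partition_spec(spec)
    return [
        f"ALTER TABLE {table} PARTITION ({part}) SET FILEFORMAT RCFILE;",
        f"ALTER TABLE {table} PARTITION ({part}) SET SERDE {quote_literal(serde)};",
    ]


statements = revert_partition_to_rcfile("events", {"ds": "2014-01-15"})
for stmt in statements:
    print(stmt)
```

Writing the class name without the surrounding quotes in the SET SERDE statement is the mistake described in point 1.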

So this is not a bug after all but actually a feature. I hope the Hive release notes will mention
breaking changes like this; otherwise they are very hard to notice. For example, we hit a
correctness bug after upgrading to Hive 11 that forced us to revert to Hive 10 until the bug
was fixed. Before upgrading to Hive 12 we then spent a month on upgrade testing, and we still
did not catch this change.

> Default serde for RCFile has changed
> ------------------------------------
>
>                 Key: HIVE-6210
>                 URL: https://issues.apache.org/jira/browse/HIVE-6210
>             Project: Hive
>          Issue Type: Bug
>          Components: File Formats
>    Affects Versions: 0.12.0
>            Reporter: Eric Chu
>
> In Hive 10, when I create an RCFile table, the serde is
> org.apache.hadoop.hive.serde2.columnar.ColumnarSerDe.
> In Hive 12, when I do the same thing, the serde becomes
> org.apache.hadoop.hive.serde2.columnar.LazyBinaryColumnarSerDe.
> Similarly, in Hive 12, when I set FILEFORMAT to RCFILE, the serde becomes LazyBinaryColumnarSerDe,
> as opposed to ColumnarSerDe in previous versions. What is the reason behind this change? This
> seems like a regression to me.
> Normally, we can work around the issue by explicitly setting the table serde to
> org.apache.hadoop.hive.serde2.columnar.ColumnarSerDe.
> However, this causes a problem for our migration to ORC. Specifically, we have a partitioned
> table for which we want the new partitions to have locations pointing to ORC data, and
> the old partitions to have locations pointing to RCFile data. Moreover, we need the
> ability to change the location of a partition to point back to RCFile data, which we would
> do with SET FILEFORMAT RCFILE. However, because of this serde problem, the RCFile partition
> in an ORC table will have the wrong serde, and ALTER TABLE doesn't allow us to set the serde
> for a partition.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
