hive-dev mailing list archives

From "Eric Chu (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HIVE-6210) Default serde for RCFile has changed
Date Thu, 16 Jan 2014 03:38:20 GMT
Eric Chu created HIVE-6210:
------------------------------

             Summary: Default serde for RCFile has changed
                 Key: HIVE-6210
                 URL: https://issues.apache.org/jira/browse/HIVE-6210
             Project: Hive
          Issue Type: Bug
          Components: File Formats
    Affects Versions: 0.12.0
            Reporter: Eric Chu


In Hive 0.10, when I create a table stored as RCFile, the serde is
org.apache.hadoop.hive.serde2.columnar.ColumnarSerDe.

In Hive 0.12, when I do the same thing, the serde becomes org.apache.hadoop.hive.serde2.columnar.LazyBinaryColumnarSerDe.

Similarly, in Hive 0.12, when I set FILEFORMAT to RCFILE, the serde becomes LazyBinaryColumnarSerDe,
as opposed to ColumnarSerDe in previous versions. What is the reason behind the change? This
seems like a regression to me.
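
For illustration, the behavior described above could be reproduced with something like the following HiveQL. This is only a sketch: the table name and columns (rc_demo, id, name) are hypothetical and not taken from the original setup.

```sql
-- Hypothetical table, for illustration only.
CREATE TABLE rc_demo (id INT, name STRING) STORED AS RCFILE;

-- The "SerDe Library" field in the output shows which serde was chosen.
DESCRIBE FORMATTED rc_demo;
-- Reported above: ColumnarSerDe in Hive 0.10,
-- LazyBinaryColumnarSerDe in Hive 0.12.
```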

Normally, we can work around the issue by explicitly setting the table serde to org.apache.hadoop.hive.serde2.columnar.ColumnarSerDe.
However, this causes a problem for our migration to ORC. Specifically, we have a partitioned
table for which we want the new partitions to point to ORC locations and
the old partitions to point to RCFile locations. Moreover, we need the
ability to change the location of a partition so that it points to an RCFile partition. We would
do this with SET FILEFORMAT RCFILE. However, because of this serde problem, the RCFile partition
in an ORC table ends up with the wrong serde, and ALTER TABLE doesn't allow us to set the serde for
a partition.
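
As a rough sketch of the scenario above, the statements involved might look like the following. The table name (events), partition key (ds), partition value, and HDFS path are all hypothetical placeholders, not the actual names in our environment.

```sql
-- Table-level workaround: pin the serde explicitly.
ALTER TABLE events
  SET SERDE 'org.apache.hadoop.hive.serde2.columnar.ColumnarSerDe';

-- Repoint one partition of the ORC table at existing RCFile data
-- (hypothetical partition value and path).
ALTER TABLE events PARTITION (ds='2013-12-01')
  SET LOCATION 'hdfs://namenode:8020/warehouse/events_rc/ds=2013-12-01';
ALTER TABLE events PARTITION (ds='2013-12-01')
  SET FILEFORMAT RCFILE;

-- Inspect which serde the partition ended up with.
DESCRIBE FORMATTED events PARTITION (ds='2013-12-01');
```

With the behavior reported here, the last statement would show LazyBinaryColumnarSerDe for the partition, with no partition-level ALTER TABLE available to correct it.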



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
