hive-dev mailing list archives

From "Owen O'Malley (Commented) (JIRA)" <>
Subject [jira] [Commented] (HIVE-2711) Make the header of RCFile unique
Date Mon, 26 Mar 2012 21:26:26 GMT


Owen O'Malley commented on HIVE-2711:

The compatibility is very fragile: SequenceFile can only read an RCFile if the reader has
the RCFile class available, since the key and value classes are contained in [...]. Additionally,
SequenceFile would present the RCFile blocks instead of the RCFile rows.

Furthermore, if either RCFile or SequenceFile gets a new format revision in the future,
this compatibility breaks badly.

> Make the header of RCFile unique
> --------------------------------
>                 Key: HIVE-2711
>                 URL:
>             Project: Hive
>          Issue Type: Bug
>          Components: Serializers/Deserializers
>            Reporter: Owen O'Malley
>            Assignee: Owen O'Malley
>         Attachments: HIVE-2711.D2115.1.patch
> The RCFile implementation was copied from Hadoop's SequenceFile and copied the 'magic'
> string in the header. This means that you can't use the header to distinguish between RCFiles
> and SequenceFiles.
>
> I'd propose that we create a new header for RCFiles (RCF?) to replace the current SEQ.
> To maintain compatibility, we'll need to continue to accept the current 'SEQ\06' and just
> make new files contain the new header.
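
To make the proposal in the quoted description concrete, here is a rough sketch of how a reader might classify a file by its header while still accepting the legacy 'SEQ\06' bytes. It is illustrative only and is not the code in the attached patch: the function name `classify_header`, the returned labels, and the assumption that the new 'RCF' magic is followed by a version byte are all hypothetical.

```python
# Hypothetical sketch; names and labels are illustrative, not Hive's API.
SEQ_MAGIC = b"SEQ"       # magic currently shared by SequenceFile and RCFile
LEGACY_VERSION = 6       # version byte in the current 'SEQ\06' header
RCF_MAGIC = b"RCF"       # proposed new magic (spelling assumed from "RCF?")


def classify_header(header: bytes) -> str:
    """Classify a file from its first four bytes: a 3-byte magic plus a version byte."""
    if len(header) < 4:
        raise ValueError("header too short: need at least 4 bytes")
    magic, version = header[:3], header[3]
    if magic == RCF_MAGIC:
        # Only files written after the change would carry the new magic.
        return "rcfile"
    if magic == SEQ_MAGIC and version == LEGACY_VERSION:
        # Ambiguous: a SequenceFile, or an RCFile written before the change.
        return "sequencefile-or-legacy-rcfile"
    if magic == SEQ_MAGIC:
        # Assumption: RCFiles were only ever written with version byte 6.
        return "sequencefile"
    return "unknown"
```

With the legacy header, the magic alone cannot settle the question, so a reader would still need some other check (for example the key and value class names) to tell the two formats apart.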

This message is automatically generated by JIRA.

