hive-user mailing list archives

From Joydeep Sen Sarma <jssa...@facebook.com>
Subject RE: Serde and Record I/O
Date Mon, 08 Dec 2008 18:39:29 GMT
Hi Johan - so the key and value class types are RecordIO classes?

This may need some dev work. A few things:
- traditionally our serdes have ignored the keys altogether (the row is embedded in the value).
What are the semantics for your case?
- the jute code was written for an older version of the serde interface and needs to be ported
to the new interface
- finally - I am not sure about the current jute code (I am looking at it and the deserialization
code is not making sense to me)
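
To make the first point concrete, here is a rough Java sketch of the two possible
key semantics: building the row from the value only (what our serdes do today), or
exposing the key as an extra leading column. Everything in it is hypothetical and
for illustration only: KeyValue, deserializeValueOnly, deserializeWithKey and the
tab-delimited value are assumptions, not the Hive serde interface and not the jute code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch only: this is NOT Hive's serde interface or the jute serde.
public class KeyIgnoringSerdeSketch {

    /** Hypothetical stand-in for one SequenceFile record (key plus value). */
    static final class KeyValue {
        final String key;
        final String value;

        KeyValue(String key, String value) {
            this.key = key;
            this.value = value;
        }
    }

    /** Value-only semantics: the row comes from the value; the key is never read. */
    static List<String> deserializeValueOnly(KeyValue record) {
        // Assumption: the value holds tab-delimited columns.
        return Arrays.asList(record.value.split("\t", -1));
    }

    /** Alternative semantics: the key becomes the first column of the row. */
    static List<String> deserializeWithKey(KeyValue record) {
        List<String> row = new ArrayList<>();
        row.add(record.key);
        row.addAll(deserializeValueOnly(record));
        return row;
    }

    public static void main(String[] args) {
        KeyValue record = new KeyValue("user42", "2008-12-08\tclick");
        System.out.println(deserializeValueOnly(record));
        System.out.println(deserializeWithKey(record));
    }
}
```

If your keys carry data that is not repeated in the value, the second variant is
roughly what a key-aware serde would need to do.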

+1 on supporting this - please file a Jira - should be very easy to get this in.

-----Original Message-----
From: Johan Oskarsson [mailto:johan@oskarsson.nu] 
Sent: Monday, December 08, 2008 10:16 AM
To: hive-user@hadoop.apache.org
Subject: Serde and Record I/O

We store a lot of data in SequenceFiles with the key and value as
generated Jute/RecordIO classes and would like to process it all using Hive.

I noticed that there is a serde/jute package, but I assume serde version
1 is deprecated in favour of serde2? Either way I get a class cast
exception if I try to use it.

I've looked through the mailing list and wiki but can't find a good
example of how to process SequenceFiles with RecordIO key/value classes.
Any help would be much appreciated.

/Johan
