hive-user mailing list archives

From Andrew Rothstein
Subject Deserializer from a SequenceFile
Date Sun, 06 Jun 2010 18:44:45 GMT
Most of my Hadoop data is produced by Java MR jobs that store data as
custom Writable pairs in SequenceFiles. I'm excited to bring that
data into a Hive table so that I can start building out and
prototyping more derived analytics. Can anyone point me towards a
relevant example? Since I'm just getting started, I've begun with
hive-0.5.0. So far I've taken the RegexSerDe example and tried to
whittle it down into what I want, but I'm lacking context.

Since I'm not trying to take data and write it back into these
SequenceFiles, I only need to implement the Deserializer interface.

How do I tell Hive that the underlying data is stored in
SequenceFiles? What's the relationship between the Writable that
arrives as the parameter to the deserialize function and the contents
of the underlying SequenceFile?

regards, Andrew
