hive-user mailing list archives

From Ruben de Vries <ruben.devr...@hyves.nl>
Subject RE: using the key from a SequenceFile
Date Thu, 19 Apr 2012 15:49:12 GMT
You're a lifesaver!

From: Dilip Joseph [mailto:dilip.antony.joseph@gmail.com]
Sent: Thursday, April 19, 2012 5:47 PM
To: user@hive.apache.org
Subject: Re: using the key from a SequenceFile

An example input format for using SequenceFile keys in Hive is at https://gist.github.com/2421795.
The code simply swaps how the key and value are accessed in the standard SequenceFileRecordReader
and SequenceFileInputFormat that come with Hadoop.

You can use this custom input format by specifying the following when you create the table:

STORED AS
    INPUTFORMAT 'com.mycompany.SequenceFileKeyInputFormat'
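
For context, a fuller statement might look like the sketch below. This is only an illustration, not taken from the gist: the jar path, table name, column name, and location are hypothetical placeholders, and it assumes the SequenceFile keys are Text so that Hive's default SerDe can read them. Hive's grammar expects an OUTPUTFORMAT alongside INPUTFORMAT, so the stock HiveSequenceFileOutputFormat is used here.

```sql
-- Hypothetical jar path: make the custom input format class visible to Hive.
ADD JAR /path/to/sequencefile-key-inputformat.jar;

-- Hypothetical table, column, and location names.
-- Assumes the SequenceFile keys are Text, exposed here as a single string column.
CREATE EXTERNAL TABLE key_data (key_line STRING)
STORED AS
    INPUTFORMAT 'com.mycompany.SequenceFileKeyInputFormat'
    OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat'
LOCATION '/path/to/sequencefiles';
```

With a table like this, a plain SELECT on key_line would return whatever the input format hands to Hive as the row, which in this setup is the original key.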

Dilip

On Thu, Apr 19, 2012 at 6:09 AM, Owen O'Malley <omalley@apache.org> wrote:
On Thu, Apr 19, 2012 at 3:07 AM, Ruben de Vries <ruben.devries@hyves.nl> wrote:
> I'm trying to migrate part of our current Hadoop jobs from normal
> MapReduce jobs to Hive.
>
> Previously the data was stored in SequenceFiles, with the keys containing
> valuable data!
I think you'll want to define your table using a custom InputFormat
that creates a virtual row based on both the key and value, and then
use the 'STORED AS INPUTFORMAT ...' clause when you create the table.

-- Owen



--
_________________________________________
Dilip Antony Joseph
http://csgrad.blogspot.com
http://www.marydilip.info
