hadoop-common-user mailing list archives

From Johan Oskarsson <jo...@oskarsson.nu>
Subject SequenceFileOutputFormat compression codec?
Date Tue, 13 Mar 2007 17:50:21 GMT

I can't seem to find out how to set the compression codec for a 
SequenceFile when it is created by a job whose output format is 
set to SequenceFileOutputFormat.

Another question while I'm at it.
Currently I'm using plain text files for most data. I'd like to switch 
to sequence files or map files. The program in question consists of two 
jobs. The first one sums up all the data into userId, resourceId -> counter.
The second job sorts this by counter and userId, with the resourceId as 
the value. The output format then flips it around so the output is still 
in the format: userId, resourceId, counter.
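
To make the current flow concrete, here is a rough plain-Java sketch of what the two jobs compute. It uses no Hadoop APIs at all; the class, method, and field names (TwoJobSketch, Row, sumCounts, sortByCounterThenUser) are made up for illustration, and I'm assuming each input record carries a partial count and that the sort order is ascending.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class TwoJobSketch {

    /** One record in the layout: userId, resourceId, counter. */
    static final class Row {
        final String userId;
        final String resourceId;
        final int counter;

        Row(String userId, String resourceId, int counter) {
            this.userId = userId;
            this.resourceId = resourceId;
            this.counter = counter;
        }

        @Override
        public String toString() {
            return userId + "," + resourceId + "," + counter;
        }
    }

    /** Job 1 (sketch): sum the counts per (userId, resourceId) pair. */
    static Map<String, Integer> sumCounts(List<Row> input) {
        // Tab-joined string key stands in for a composite key object.
        Map<String, Integer> totals = new TreeMap<>();
        for (Row r : input) {
            totals.merge(r.userId + "\t" + r.resourceId, r.counter, Integer::sum);
        }
        return totals;
    }

    /** Job 2 (sketch): sort by counter, then userId (ascending is an assumption). */
    static List<Row> sortByCounterThenUser(Map<String, Integer> totals) {
        List<Row> rows = new ArrayList<>();
        for (Map.Entry<String, Integer> e : totals.entrySet()) {
            String[] parts = e.getKey().split("\t", 2);
            rows.add(new Row(parts[0], parts[1], e.getValue()));
        }
        rows.sort(Comparator.comparingInt((Row r) -> r.counter)
                            .thenComparing(r -> r.userId));
        return rows;
    }

    public static void main(String[] args) {
        List<Row> input = Arrays.asList(
                new Row("u2", "r1", 3),
                new Row("u1", "r1", 1),
                new Row("u1", "r1", 1),
                new Row("u1", "r2", 5));
        // Each printed line is in the final layout: userId,resourceId,counter
        for (Row r : sortByCounterThenUser(sumCounts(input))) {
            System.out.println(r);
        }
    }
}
```

In the real jobs the sorting is done by the framework on the key, which is why the second job currently has to put counter and userId into the key and carry resourceId as the value.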

How would one do this sorting in a nice way if the output is a sequence 
file where I would like to keep one object, called something like UserRes, 
as the key and an IntWritable as the value?

