hadoop-common-user mailing list archives

From Ives Aerts <ives.aerts+had...@gmail.com>
Subject Re: Help needed in sequence file manipulation
Date Fri, 18 Dec 2009 16:27:16 GMT
On Fri, Dec 18, 2009 at 5:20 PM, Cao Kang <cakang@clarku.edu> wrote:
> Is there any example of how a sequence file can be read and split in Hadoop?
> Many thanks!

That should be fairly easy. The following code reads all entries in a
sequence file:

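        // Needs org.apache.hadoop.io.SequenceFile and org.apache.hadoop.io.Writable;
        // path is a Path and config a Configuration.
        SequenceFile.Reader reader =
            new SequenceFile.Reader(path.getFileSystem(config), path, config);

        // The file header records the key and value classes, so create one
        // reusable instance of each.
        Writable key = (Writable)reader.getKeyClass().newInstance();
        Writable value = (Writable)reader.getValueClass().newInstance();

        // next() fills key and value in place and returns false at end of file.
        while(reader.next(key, value)) {
            System.out.println(key + "\t" + value);
        }

        reader.close();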

To split the file, add some logic inside the loop that decides which
output each entry belongs to, and write the entry out using the matching
SequenceFile.Writer.
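
Here is a minimal sketch of the partitioning step, assuming you just want to
spread the entries over a fixed number of outputs by key hash. SplitSketch,
partitionFor and split are made-up names. Plain strings and lists stand in
for the Writable keys and the SequenceFile.Writer outputs, so it runs
without Hadoop:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class SplitSketch {
    // Hypothetical helper: pick an output index for a key by hashing it.
    // Masking with Integer.MAX_VALUE keeps the result non-negative even
    // when hashCode() is negative.
    static int partitionFor(Object key, int numParts) {
        return (key.hashCode() & Integer.MAX_VALUE) % numParts;
    }

    // Distribute keys over numParts buckets. Each bucket is a stand-in
    // (an assumption for illustration) for one output writer.
    static List<List<String>> split(List<String> keys, int numParts) {
        List<List<String>> parts = new ArrayList<List<String>>();
        for (int i = 0; i < numParts; i++) {
            parts.add(new ArrayList<String>());
        }
        for (String k : keys) {
            parts.get(partitionFor(k, numParts)).add(k);
        }
        return parts;
    }

    public static void main(String[] args) {
        List<List<String>> parts = split(Arrays.asList("a", "b", "c", "d", "e"), 2);
        for (int i = 0; i < parts.size(); i++) {
            System.out.println("part " + i + ": " + parts.get(i));
        }
    }
}
```

In the real loop you would keep an array of SequenceFile.Writer objects
instead of lists, call writers[partitionFor(key, n)].append(key, value) for
each entry the reader returns, and close every writer at the end.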

-- 
Cheers,
-Ives
