hadoop-hdfs-user mailing list archives

From Jens Scheidtmann <jens.scheidtm...@gmail.com>
Subject Types and SequenceFiles
Date Thu, 30 May 2013 20:09:28 GMT
Dear list,

I have created a sequence file like this:

    seqWriter = SequenceFile.createWriter(fs, getConf(),
        new Path(hdfsPath), IntWritable.class, BytesWritable.class);
    seqWriter.append(new IntWritable(index++), new BytesWritable(buf));

(where buf is a byte array)

Now, when reading the same sequence file in a MapReduce job, I declare the
mapper like this:

    public static class NoOfMovesMapper
        extends Mapper<IntWritable, BytesWritable, IntWritable, IntWritable> {

and configure the input format as:

    SequenceFileAsBinaryInputFormat.addInputPath(jobConf, new Path(hdfsPath));
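For context, a minimal driver configuration for reading such a file with the new (org.apache.hadoop.mapreduce) API might look roughly like the sketch below; the job name, the NoOfMoves driver class, and the use of SequenceFileInputFormat with setInputFormatClass are illustrative assumptions, not taken from the message:

```java
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.SequenceFileInputFormat;

// Hypothetical driver fragment; NoOfMoves and the job name are assumptions.
Job job = Job.getInstance(getConf(), "no-of-moves");
job.setJarByClass(NoOfMoves.class);
job.setMapperClass(NoOfMovesMapper.class);
// With SequenceFileInputFormat, key/value classes come from the file header,
// so the mapper sees the types the file was written with.
job.setInputFormatClass(SequenceFileInputFormat.class);
job.setMapOutputKeyClass(IntWritable.class);
job.setMapOutputValueClass(IntWritable.class);
SequenceFileInputFormat.addInputPath(job, new Path(hdfsPath));
```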
This job fails with:

    java.lang.ClassCastException: org.apache.hadoop.io.LongWritable
        cannot be cast to org.apache.hadoop.io.IntWritable

I have to declare the mapper as

    extends Mapper<LongWritable, Text, IntWritable, IntWritable> {

to read the sequence file at all. But then the number of records and map
invocations is much larger than I would expect; I thought map would be
invoked once per record in the sequence file.

What am I doing wrong?

Thanks in advance,

