hadoop-common-user mailing list archives

From Michael Basnight <mbasni...@gmail.com>
Subject MapFile in mapper showing weird values
Date Fri, 08 Jan 2010 16:17:23 GMT
I have created a very small MapFile for testing purposes. When I use this file as the input
file for a Hadoop job (with SequenceFileInputFormat), I get a very strange LongWritable
value passed to the mapper, which I did not write to the MapFile. I even had to go so far as
to change my Mapper from <Text, Text, ...> to <Text, Writable, ...> to accommodate this. I'm
using Hadoop 0.20.0 and HadoopTestCase (LOCAL_MR, LOCAL_FS, 1, 1). I have not tested this
in a production environment, FWIW. Does anyone have ideas about what's going on? All replies appreciated!

MapFile creation:

MapFile.Writer w = new MapFile.Writer(tool.getConfiguration().getJobConf(), getFileSystem(),
        inputDir + "/part-00000", Text.class, Text.class);
Text t;
t = new Text("apple");
w.append(t, new Text("orange"));
t = new Text("bar");
w.append(t, new Text("foo"));
t = new Text("foo");
w.close();

Beginning of mapper class:

public class MapSortJoinMapper implements Mapper<Text, Writable, Text, Text> {

    public void map(Text key, Writable value, OutputCollector<Text, Text> textTextOutputCollector,
            Reporter reporter) throws IOException {
        try {
            System.out.println(key + ":" + key.getClass());
            System.out.println(value + ":" + value.getClass());
...

Output of mapper class:

apple:class org.apache.hadoop.io.Text
orange:class org.apache.hadoop.io.Text
bar:class org.apache.hadoop.io.Text
foo:class org.apache.hadoop.io.Text
10/01/08 10:14:58 INFO mapred.MapTask: Starting flush of map output
10/01/08 10:14:59 INFO mapred.MapTask: Finished spill 0
10/01/08 10:14:59 INFO mapred.TaskRunner: Task:attempt_local_0001_m_000000_0 is done. And is in the process of commiting
10/01/08 10:14:59 INFO mapred.LocalJobRunner: file:/tmp/input/test_mapfile/part-00000/data:0+174
10/01/08 10:14:59 INFO mapred.TaskRunner: Task 'attempt_local_0001_m_000000_0' done.
10/01/08 10:14:59 INFO compress.CodecPool: Got brand-new decompressor
10/01/08 10:14:59 INFO compress.CodecPool: Got brand-new decompressor
10/01/08 10:14:59 INFO compress.CodecPool: Got brand-new decompressor
10/01/08 10:14:59 INFO mapred.MapTask: numReduceTasks: 1
10/01/08 10:14:59 INFO mapred.MapTask: io.sort.mb = 100
10/01/08 10:14:59 INFO mapred.MapTask: data buffer = 79691776/99614720
10/01/08 10:14:59 INFO mapred.MapTask: record buffer = 262144/327680
apple:class org.apache.hadoop.io.Text
121:class org.apache.hadoop.io.LongWritable