hadoop-mapreduce-user mailing list archives

From David Rosenstrauch <dar...@darose.net>
Subject Re: how to write custom object using M/R
Date Tue, 18 Jan 2011 18:49:09 GMT
Sounds to me like your custom object isn't serializing properly.

You might want to read up on how to do it correctly here: 
http://developer.yahoo.com/hadoop/tutorial/module5.html#types

FYI - here's an example of a custom type I wrote, which I'm able to 
read/write successfully to/from a sequence file:


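import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;

import org.apache.hadoop.io.BytesWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.io.Writable;

public class UserStateRecordWritable implements Writable {

	// Hadoop creates Writables reflectively, so a public no-arg constructor is needed.
	public UserStateRecordWritable() {
		recordType = new Text();
		recordData = new BytesWritable();
	}

	// Reads the fields in the same order that write() emits them.
	@Override
	public void readFields(DataInput in) throws IOException {
		recordType.readFields(in);
		recordData.readFields(in);
	}

	// Delegates serialization to the wrapped Writable fields.
	@Override
	public void write(DataOutput out) throws IOException {
		recordType.write(out);
		recordData.write(out);
	}

	// Copies the contents of the arguments into this object's own fields.
	public void set(Text newRecordType, BytesWritable newRecordData) {
		recordType.set(newRecordType);
		recordData.set(newRecordData);
	}

	public Text getRecordType() {
		return recordType;
	}

	public BytesWritable getRecordData() {
		return recordData;
	}

	public String copyRecordType() {
		return recordType.toString();
	}

	// TraitWeightUtils is a project-specific helper (not shown in this message).
	public byte[] copyRecordData() {
		return TraitWeightUtils.getBytes(recordData);
	}

	private Text recordType;
	private BytesWritable recordData;
}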
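
The key property of a type like the one above is that readFields() consumes the fields in exactly the order write() emits them. A minimal sketch of how that can be checked without a cluster follows, assuming nothing beyond java.io. SimpleWritable, DemoRecord and RoundTripDemo are hypothetical names made up for this illustration, and SimpleWritable is only a local stand-in with the same two methods as Hadoop's Writable, not the real interface.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.Arrays;

// Hypothetical stand-in for org.apache.hadoop.io.Writable (same two methods),
// defined locally so the sketch runs without Hadoop on the classpath.
interface SimpleWritable {
    void write(DataOutput out) throws IOException;
    void readFields(DataInput in) throws IOException;
}

// Hypothetical record: a type string plus a length-prefixed byte payload.
class DemoRecord implements SimpleWritable {
    String recordType = "";
    byte[] recordData = new byte[0];

    @Override
    public void write(DataOutput out) throws IOException {
        out.writeUTF(recordType);
        out.writeInt(recordData.length);
        out.write(recordData);
    }

    @Override
    public void readFields(DataInput in) throws IOException {
        // Same field order as write(): type first, then length, then payload.
        recordType = in.readUTF();
        recordData = new byte[in.readInt()];
        in.readFully(recordData);
    }
}

class RoundTripDemo {
    // Serializes the record to a byte buffer and reads it back into a fresh instance.
    static DemoRecord roundTrip(DemoRecord original) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        original.write(new DataOutputStream(buffer));
        DemoRecord copy = new DemoRecord();
        copy.readFields(new DataInputStream(new ByteArrayInputStream(buffer.toByteArray())));
        return copy;
    }

    public static void main(String[] args) throws IOException {
        DemoRecord record = new DemoRecord();
        record.recordType = "click";
        record.recordData = new byte[] {1, 2, 3};
        DemoRecord copy = roundTrip(record);
        System.out.println(copy.recordType + " " + Arrays.toString(copy.recordData));
    }
}
```

If the copy does not match the original, the two methods disagree on field order or length, which is the kind of serialization problem worth ruling out first.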


HTH,

DR

On 01/14/2011 07:57 AM, Joan wrote:
> Hi,
>
> I'm trying to write (K,V) pairs where K is a Text object and V is a
> CustomObject, but it doesn't work.
>
> I'm configuring the job's output as SequenceFileOutputFormat, so I have the job set up with:
>
>          job.setMapOutputKeyClass(Text.class);
>          job.setMapOutputValueClass(CustomObject.class);
>          job.setOutputKeyClass(Text.class);
>          job.setOutputValueClass(CustomObject.class);
>
>          SequenceFileOutputFormat.setOutputPath(job, new Path("myPath"));
>
> And I get the following output (this is the file part-r-00000):
>
> K  CustomObject@2b237512
> K  CustomObject@24db06de
> ...
>
> When this job finishes, I run another job whose input format is
> SequenceFileInputFormat, but it doesn't run.
>
> The second job's configuration is:
>
>          job.setInputFormatClass(SequenceFileInputFormat.class);
>          SequenceFileInputFormat.addInputPath(job, new Path("myPath"));
>
> But I get an error:
>
> java.io.IOException: hdfs://localhost:30000/user/hadoop/out/part-r-00000 not a SequenceFile
>          at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1523)
>          at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1483)
>          at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1451)
>          at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1432)
>          at org.apache.hadoop.mapreduce.lib.input.SequenceFileRecordReader.initialize(SequenceFileRecordReader.java:60)
>
>
> Can someone help me? I don't understand what is going wrong. I don't know how to
> save my object in the first M/R job or how to use it in the second one.
>
> Thanks
>
> Joan
>
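
On the "not a SequenceFile" error quoted above: a SequenceFile starts with the three bytes 'S', 'E', 'Q' followed by a version byte, and the reader rejects any file that does not. The readable "K  CustomObject@..." lines suggest the first job's output may have been written as plain text, with each value printed via toString(). That can happen when the output format class is never set on the job, for example with job.setOutputFormatClass(SequenceFileOutputFormat.class). The sketch below is a rough check that can be run against the bytes of a local copy of the part file. SeqHeaderCheck and looksLikeSequenceFile are names invented for this illustration; the code only inspects the header and uses no Hadoop classes.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

class SeqHeaderCheck {
    // Returns true if the stream begins with the ASCII bytes 'S', 'E', 'Q'.
    // This checks the magic prefix only; it does not validate the rest of the file.
    static boolean looksLikeSequenceFile(InputStream in) throws IOException {
        byte[] magic = new byte[3];
        try {
            new DataInputStream(in).readFully(magic);
        } catch (EOFException e) {
            return false; // fewer than three bytes available
        }
        return magic[0] == 'S' && magic[1] == 'E' && magic[2] == 'Q';
    }

    public static void main(String[] args) throws IOException {
        // Sample bytes shaped like the text output shown in the quoted message.
        byte[] textOutput = "K\tCustomObject@2b237512\n".getBytes(StandardCharsets.US_ASCII);
        // Illustrative header: magic bytes plus a version byte (value chosen for the example).
        byte[] seqHeader = {'S', 'E', 'Q', 6};

        System.out.println(looksLikeSequenceFile(new ByteArrayInputStream(textOutput)));
        System.out.println(looksLikeSequenceFile(new ByteArrayInputStream(seqHeader)));
    }
}
```

If the check fails on the real part file, the first job's output format is a more likely culprit than the custom type's write()/readFields() methods.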

