hadoop-mapreduce-user mailing list archives

From Harsh J <qwertyman...@gmail.com>
Subject Re: how to write custom object using M/R
Date Mon, 17 Jan 2011 09:13:29 GMT
1. Your first Job's OutputFormat must be set to SequenceFileOutputFormat
2. Your "custom" object must implement the Writable interface properly
(as in, the readFields() and write() methods must work as expected by
the framework and your requirements).

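The fact that your output looks like "K  CustomObject@2b237512" suggests
that the value is being written as text rather than serialized through
write(): toString() is probably being called, and since the class has no
toString() override you get the default Object.toString() result. That is
how the default TextOutputFormat writes values.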
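
To illustrate point 2, here is a minimal sketch of a value class whose write() and readFields() mirror each other. It is not the original poster's class: the fields "name" and "count" and the WritableSketch helper are made up for illustration, and the small Writable interface is declared locally as a stand-in for org.apache.hadoop.io.Writable so the sketch compiles without Hadoop on the classpath. The no-arg constructor is included on the assumption that the framework instantiates the class reflectively before calling readFields().

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;

// Stand-in with the same two methods as org.apache.hadoop.io.Writable,
// declared locally only so this sketch compiles without Hadoop.
// In a real job, delete it and import org.apache.hadoop.io.Writable.
interface Writable {
    void write(DataOutput out) throws IOException;
    void readFields(DataInput in) throws IOException;
}

// Hypothetical value class; the fields "name" and "count" are assumptions.
class CustomObject implements Writable {
    private String name = "";
    private int count;

    // No-arg constructor so an empty instance can be created first
    // and then filled in through readFields().
    CustomObject() {}

    CustomObject(String name, int count) {
        this.name = name;
        this.count = count;
    }

    @Override
    public void write(DataOutput out) throws IOException {
        out.writeUTF(name);
        out.writeInt(count);
    }

    @Override
    public void readFields(DataInput in) throws IOException {
        // Read the fields in exactly the order write() emitted them.
        name = in.readUTF();
        count = in.readInt();
    }

    // Without this override, text output shows "CustomObject@<hash>".
    @Override
    public String toString() {
        return name + ":" + count;
    }
}

class WritableSketch {
    // Serializes a value to bytes and reads it back into a fresh instance.
    static CustomObject roundTrip(CustomObject original) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            original.write(out);
        }
        CustomObject copy = new CustomObject();
        try (DataInputStream in = new DataInputStream(
                new ByteArrayInputStream(bytes.toByteArray()))) {
            copy.readFields(in);
        }
        return copy;
    }

    public static void main(String[] args) throws IOException {
        // Prints "example:3" if write() and readFields() agree.
        System.out.println(roundTrip(new CustomObject("example", 3)));
    }
}
```

A round trip like this is a quick way to check the class on its own before running it in a job. If the copy differs from the original, write() and readFields() disagree on field order or types.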

On Mon, Jan 17, 2011 at 1:49 PM, Joan <joan.monplet@gmail.com> wrote:
> Hi Alain,
>
> I added it, but it didn't work.
>
> Joan
>
> 2011/1/14 MONTMORY Alain <alain.montmory@thalesgroup.com>
>>
>> Hi,
>>
>>
>>
>> I think you have to set:
>>
>>             job.setOutputFormatClass(SequenceFileOutputFormat.class);
>>
>> to make it work.
>>
>> Hope this helps.
>>
>>
>>
>> Alain
>>
>>
>>
>>
>>
>> From: Joan [mailto:joan.monplet@gmail.com]
>> Sent: Friday, 14 January 2011 13:58
>> To: mapreduce-user
>> Subject: how to write custom object using M/R
>>
>>
>>
>> Hi,
>>
>> I'm trying to write (K, V) pairs where K is a Text object and V is a
>> CustomObject, but it doesn't work.
>>
>> I'm configuring the job's output for SequenceFileInputFormat, so the job
>> has:
>>
>>         job.setMapOutputKeyClass(Text.class);
>>         job.setMapOutputValueClass(CustomObject.class);
>>         job.setOutputKeyClass(Text.class);
>>         job.setOutputValueClass(CustomObject.class);
>>
>>         SequenceFileOutputFormat.setOutputPath(job, new Path("myPath"));
>>
>> And I get the following output (this is the file part-r-00000):
>>
>> K  CustomObject@2b237512
>> K  CustomObject@24db06de
>> ...
>>
>> When this job finishes I run another job whose input format is
>> SequenceFileInputFormat, but it doesn't run.
>>
>> The second job's configuration is:
>>
>>         job.setInputFormatClass(SequenceFileInputFormat.class);
>>         SequenceFileInputFormat.addInputPath(job, new Path("myPath"));
>>
>> But I get an error:
>>
>> java.io.IOException: hdfs://localhost:30000/user/hadoop/out/part-r-00000 not a SequenceFile
>>         at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1523)
>>         at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1483)
>>         at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1451)
>>         at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1432)
>>         at org.apache.hadoop.mapreduce.lib.input.SequenceFileRecordReader.initialize(SequenceFileRecordReader.java:60)
>>
>>
>> Can someone help me? I don't understand how to save my object in the
>> first M/R job and how to use it in the second one.
>>
>> Thanks
>>
>> Joan
>>
>>
>
>



-- 
Harsh J
www.harshj.com
