hadoop-common-user mailing list archives

From Brock Noland <br...@cloudera.com>
Subject Re: Outputformat and RecordWriter in Hadoop Pipes
Date Tue, 20 Sep 2011 22:25:09 GMT

On Tue, Sep 13, 2011 at 12:27 PM, Vivek K <hadoop.viks@gmail.com> wrote:
> Hi all,
> I am trying to build a Hadoop/MR application in c++ using hadoop-pipes. I
> have been able to successfully work with my own mappers and reducers, but
> now I need to generate output (from reducer) in a format different from the
> default TextOutputFormat. I have a few questions:
> (1) Similar to Hadoop streaming, is there an option to set OutputFormat in
> HadoopPipes (in order to use say org.apache.hadoop.io.SequenceFile.Writer) ?
> I am using Hadoop version 0.20.2.
> (2) For a simple test on how to use an in-built non-default writer, I tried
> the following:
>     hadoop pipes -D hadoop.pipes.java.recordreader=true -D
> hadoop.pipes.java.recordwriter=false -input input.seq -output output
> -inputformat org.apache.hadoop.mapred.SequenceFileInputFormat -writer
> org.apache.hadoop.io.SequenceFile.Writer -program my_test_program

-writer expects an OutputFormat class:

      if (results.hasOption("writer")) {
        setIsJavaRecordWriter(job, true);
        job.setOutputFormat(getClass(results, "writer", job,

As such I think you want:

-writer org.apache.hadoop.mapred.SequenceFileOutputFormat
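Putting that together, the full command might look roughly like the sketch below. It is untested: the paths and program name are copied from your command, and I have left out -D hadoop.pipes.java.recordwriter on the assumption that -writer already turns the Java record writer on, as the snippet above suggests. The WRITER and CMD variables are only there for illustration.

```shell
# Sketch only: paths and program name are taken from the quoted command.
# -writer is given an OutputFormat class, not SequenceFile.Writer.
WRITER=org.apache.hadoop.mapred.SequenceFileOutputFormat

CMD="hadoop pipes -D hadoop.pipes.java.recordreader=true \
-input input.seq -output output \
-inputformat org.apache.hadoop.mapred.SequenceFileInputFormat \
-writer $WRITER -program my_test_program"

# Print the command for review before running it.
echo "$CMD"
# eval "$CMD"   # uncomment on a machine with Hadoop installed
```
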

SequenceFile.Writer simply writes sequence files; it has nothing to do with

This is also wrong:


