hadoop-common-user mailing list archives

From Vivek K <hadoop.v...@gmail.com>
Subject Outputformat and RecordWriter in Hadoop Pipes
Date Tue, 13 Sep 2011 16:27:54 GMT
Hi all,

I am trying to build a Hadoop/MR application in C++ using Hadoop Pipes. I
have been able to work successfully with my own mappers and reducers, but
now I need to generate output (from the reducer) in a format different from
the default TextOutputFormat. I have a few questions:

(1) As in Hadoop Streaming, is there an option to set the OutputFormat in
Hadoop Pipes (in order to use, say, org.apache.hadoop.io.SequenceFile.Writer)?
I am using Hadoop version 0.20.2.

(2) As a simple test of how to use a built-in, non-default writer, I tried
the following:

     hadoop pipes -D hadoop.pipes.java.recordreader=true \
         -D hadoop.pipes.java.recordwriter=false \
         -input input.seq -output output \
         -inputformat org.apache.hadoop.mapred.SequenceFileInputFormat \
         -writer org.apache.hadoop.io.SequenceFile.Writer \
         -program my_test_program

     However, this fails with a ClassNotFoundException. If I remove the
-writer flag and use the default writer, it works just fine.

(3) Is there an example or discussion of how to write your own
RecordWriter and run it with Hadoop Pipes?
