Date: Tue, 13 Sep 2011 12:27:54 -0400
Subject: Outputformat and RecordWriter in Hadoop Pipes
From: Vivek K
To: common-user@hadoop.apache.org

Hi all,

I am trying to build a Hadoop MapReduce application in C++ using Hadoop Pipes. My own mappers and reducers work, but I now need the reducer to write its output in a format other than the default TextOutputFormat. I have a few questions:

(1) As in Hadoop Streaming, is there an option to set the OutputFormat in Hadoop Pipes (in order to use, say, org.apache.hadoop.io.SequenceFile.Writer)? I am using Hadoop version 0.20.2.

(2) As a simple test of using a built-in, non-default writer, I tried the following:

    hadoop pipes \
        -D hadoop.pipes.java.recordreader=true \
        -D hadoop.pipes.java.recordwriter=false \
        -input input.seq \
        -output output \
        -inputformat org.apache.hadoop.mapred.SequenceFileInputFormat \
        -writer org.apache.hadoop.io.SequenceFile.Writer \
        -program my_test_program

However, this fails with a ClassNotFound exception. If I remove the -writer flag and use the default writer, it works fine.

(3) Is there an example or discussion of how to write your own RecordWriter and run it with Hadoop Pipes?

Thanks.

Best,
Vivek