From: "Teng, James"
To: "common-user@hadoop.apache.org"
Date: Sun, 17 Jul 2011 23:00:42 -0700
Subject: Multiple Output Format - Unrecognizable Characters in Output File
Message-ID: <8F21C7DCD2154843BA2AAD0480FB0ECA7B42603737@RHV-MEXMS-001.corp.ebay.com>

Hi,

I ran into a problem while trying to define my own MultipleOutputFormat class. Here is the code:

import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UnsupportedEncodingException;

import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.RecordWriter;
import org.apache.hadoop.mapreduce.TaskAttemptContext;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class MultipleOutputFormat extends FileOutputFormat<LongWritable, Text> {

    public class LineWriter extends RecordWriter<LongWritable, Text> {
        private DataOutputStream output;
        private byte separatorBytes[];

        public LineWriter(DataOutputStream output, String separator)
                throws UnsupportedEncodingException {
            this.output = output;
            this.separatorBytes = separator.getBytes("UTF-8");
        }

        @Override
        public synchronized void close(TaskAttemptContext context)
                throws IOException, InterruptedException {
            output.close();
        }

        @Override
        public void write(LongWritable key, Text value)
                throws IOException, InterruptedException {
            System.out.println("key:" + key.get());
            System.out.println("value:" + value.toString());
            //output.writeLong(key.)
            //output.write(separatorBytes);
            //output.write(value.toString().getBytes("UTF-8"));
            //output.write("\n".getBytes("UTF-8"));
            //key.write(output);
            // Serialize the key and value through their own write() methods,
            // then terminate the record with a newline.
            key.write(output);
            value.write(output);
            output.write("\n".getBytes("UTF-8"));
        }
    }

    private Path path;

    protected String generateFileNameForKeyValue(LongWritable key, Text value, String name) {
        return "key" + Math.random();
    }

    @Override
    public RecordWriter<LongWritable, Text> getRecordWriter(TaskAttemptContext context)
            throws IOException, InterruptedException {
        path = getOutputPath(context);
        System.out.println("ddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddd");
        // Create the task's default work file and wrap it in a LineWriter.
        Path file = getDefaultWorkFile(context, "");
        FileSystem fs = file.getFileSystem(context.getConfiguration());

        FSDataOutputStream fileOut = fs.create(file, false);

        return new LineWriter(fileOut, "\t");
    }
}

However, unrecognizable characters appear in the output file.

Has anyone encountered this problem before? Any comments are greatly appreciated. Thanks in advance.
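One possible explanation, offered as a sketch rather than a confirmed diagnosis: LongWritable.write(DataOutput) emits the key through writeLong as eight raw big-endian bytes, and Text.write prefixes the UTF-8 bytes with a variable-length integer, so the file would contain binary data rather than readable text. The stand-alone example below uses only java.io to contrast the two encodings. The class name WritableVsText and its two helper methods are hypothetical, and writeLong stands in for LongWritable.write so that the example runs without Hadoop on the classpath.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration class; not part of Hadoop or of the original post.
class WritableVsText {

    // Binary form: the eight raw big-endian bytes that DataOutput.writeLong produces.
    static byte[] binaryKey(long key) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeLong(key);
        }
        return buffer.toByteArray();
    }

    // Text form: key, separator, value, and newline, each encoded as UTF-8.
    static byte[] textRecord(long key, String separator, String value) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.write(Long.toString(key).getBytes(StandardCharsets.UTF_8));
            out.write(separator.getBytes(StandardCharsets.UTF_8));
            out.write(value.getBytes(StandardCharsets.UTF_8));
            out.write("\n".getBytes(StandardCharsets.UTF_8));
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // Eight bytes, mostly zero: not printable as text.
        System.out.println(binaryKey(42L).length); // prints 8
        // Readable line: the digits "42", a tab, "hello", and a newline.
        System.out.print(new String(textRecord(42L, "\t", "hello"), StandardCharsets.UTF_8));
    }
}
```

If this is indeed the cause, the analogous change in LineWriter.write would be to write key.toString() and value.toString() as UTF-8 bytes with separatorBytes between them, much like the commented-out lines already in that method.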


James, Teng (Teng Linxiao)
eRL, CDC, eBay, Shanghai
Extension: 86-21-28913530
MSN: tenglinxiao@hotmail.com
Skype: James,Teng
Email: xteng@ebay.com
