hadoop-common-user mailing list archives

From Yaozhen Pan <itzhak....@gmail.com>
Subject Re: Multiple Output Format -Unrecognizable Characters in Output File
Date Mon, 18 Jul 2011 17:01:44 GMT
Hi James,

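I'm not sure whether you meant to write both the key and the value as text.
The line
    key.write(output);
writes the long in binary form (eight raw bytes), not as readable digits.
value.write(output) likewise puts a binary length prefix in front of the
string. That is probably why you saw unrecognizable characters in the
output file.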
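
Here is a minimal sketch of the difference, using only java.io so it runs
without Hadoop. It assumes that LongWritable.write(DataOutput) boils down to
DataOutput.writeLong; the class and method names are made up for
illustration and are not part of your code.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of Hadoop.
public class TextVersusBinaryDemo {

    // Stand-in for key.write(output): the long goes out as 8 raw big-endian bytes.
    static byte[] writeKeyAsBinary(long key) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeLong(key);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Writes "key<separator>value\n" as UTF-8 text, so the file stays readable.
    static byte[] writeRecordAsText(long key, String value, String separator) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.write(Long.toString(key).getBytes(StandardCharsets.UTF_8));
            out.write(separator.getBytes(StandardCharsets.UTF_8));
            out.write(value.getBytes(StandardCharsets.UTF_8));
            out.write("\n".getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) {
        // Binary form: always 8 bytes, most of them non-printable.
        System.out.println(writeKeyAsBinary(42L).length + " bytes of binary");
        // Text form: a readable, tab-separated line.
        System.out.print(new String(
                writeRecordAsText(42L, "hello", "\t"), StandardCharsets.UTF_8));
    }
}
```

In your LineWriter, the text variant corresponds to writing
Long.toString(key.get()) and value.toString() as UTF-8 bytes, much like the
lines you have commented out.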

Yaozhen

On Mon, Jul 18, 2011 at 2:00 PM, Teng, James <xteng@ebay.com> wrote:

> Hi,
>
> I ran into a problem while trying to define my own MultipleOutputFormat
> class; here is the code below.
>
> public class MultipleOutputFormat extends FileOutputFormat<LongWritable,Text>{
>
>       public class LineWriter extends RecordWriter<LongWritable,Text>{
>
>             private DataOutputStream output;
>             private byte separatorBytes[];
>
>             public LineWriter(DataOutputStream output, String separator)
>                         throws UnsupportedEncodingException
>             {
>                   this.output=output;
>                   this.separatorBytes=separator.getBytes("UTF-8");
>             }
>
>             @Override
>             public synchronized void close(TaskAttemptContext context)
>                         throws IOException, InterruptedException {
>                   // TODO Auto-generated method stub
>                   output.close();
>             }
>
>             @Override
>             public void write(LongWritable key, Text value)
>                         throws IOException, InterruptedException {
>                   System.out.println("key:"+key.get());
>                   System.out.println("value:"+value.toString());
>                   // TODO Auto-generated method stub
>                   //output.writeLong(key.)
>                   //output.write(separatorBytes);
>                   //output.write(value.toString().getBytes("UTF-8"));
>                   //output.write("\n".getBytes("UTF-8"));
>                   //key.write(output);
>                   key.write(output);
>                   value.write(output);
>
>                   output.write("\n".getBytes("UTF-8"));
>             }
>       }
>
>       private Path path;
>
>       protected String generateFileNameForKeyValue(LongWritable key,Text value,String name)
>       {
>             return "key"+Math.random();
>       }
>
>       @Override
>       public RecordWriter<LongWritable, Text> getRecordWriter(
>                   TaskAttemptContext context) throws IOException, InterruptedException {
>             path=getOutputPath(context);
>             System.out.println(
>                   "ddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddd");
>             // TODO Auto-generated method stub
>             Path file = getDefaultWorkFile(context, "");
>             FileSystem fs = file.getFileSystem(context.getConfiguration());
>
>             FSDataOutputStream fileOut = fs.create(file, false);
>
>             return new LineWriter(fileOut, "\t");
>       }
>
> However, unrecognizable characters appear in the output file.
>
> Has anyone encountered this problem before? Any comment is greatly
> appreciated. Thanks in advance.
>
> James, Teng (Teng Linxiao)
> eRL, CDC, eBay, Shanghai
> Extension: 86-21-28913530
> MSN: tenglinxiao@hotmail.com
> Skype: James,Teng
> Email: xteng@ebay.com
