hadoop-mapreduce-user mailing list archives

From Saliya Ekanayake <esal...@gmail.com>
Subject Re: Writing Bytes from Map and Reduce functions
Date Thu, 01 Apr 2010 17:38:26 GMT
Hi Jiamin,

Thank you for your earlier feedback. I was able to solve the problem by
writing a custom OutputFormat that simply writes the byte values I want.
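
For illustration, the write path of such an OutputFormat can be sketched
without any Hadoop dependency. This is a minimal sketch under stated
assumptions, not the code used for this job: RawBytesWriter and
RawBytesWriterDemo are hypothetical names, and in a real job the same logic
would sit inside the RecordWriter returned by the custom OutputFormat. The
key is dropped, and only the valid prefix of the value array is written,
because the backing array of a BytesWritable can be longer than its length.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.util.Arrays;

// Hypothetical helper that models the write path of a raw-bytes record writer.
class RawBytesWriter implements AutoCloseable {
    private final OutputStream out;

    RawBytesWriter(OutputStream out) {
        this.out = out;
    }

    // The key is ignored on purpose, so no key, header, or record framing
    // reaches the output. Only the first `length` bytes of `value` are written.
    void write(Object key, byte[] value, int length) {
        try {
            out.write(value, 0, length);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    @Override
    public void close() {
        try {
            out.close();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}

class RawBytesWriterDemo {
    public static void main(String[] args) {
        // Three valid bytes followed by two bytes of unused capacity.
        byte[] backing = {10, 20, 30, 0, 0};
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (RawBytesWriter writer = new RawBytesWriter(sink)) {
            writer.write("key", backing, 3);
        }
        System.out.println(Arrays.toString(sink.toByteArray())); // prints [10, 20, 30]
    }
}
```

A file written this way contains exactly the bytes passed to the writer, so
a client can read it back as a plain byte stream.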

Regards,
Saliya

On Wed, Mar 31, 2010 at 12:40 PM, Saliya Ekanayake <esaliya@gmail.com> wrote:

> Hi Jiamin,
>
> Thank you once again. Let me explain my scenario in a bit more detail. I am
> using Amazon Elastic MapReduce, so the output file is written to a folder
> inside S3.
>
> I have only a single reduce task, and inside it I do the following:
>
> byte[] bytes = generateBytes(); // hypothetical placeholder for the code that generates the bytes
> output.collect(new Text("key"), new BytesWritable(bytes));
>
> In the main method I have set the output format of the job configuration to
> SequenceFileOutputFormat.
>
>
> Now when I run this, it creates a file in the given S3 output directory as
> expected. I have a Java client on my local machine that downloads this file
> from S3 and tries to read it. The problem arises when reading the file,
> because I am not sure how to recover the original bytes I wrote from the
> reduce task. I looked into SequenceFileOutputFormat, and it seems that the
> file contains a header and a body. So do I have to read it manually as bytes
> and extract the portion that I need, or is there a built-in API class for
> reading such a file?
>
> Thank you
> Saliya
>
>
> On Wed, Mar 31, 2010 at 9:35 AM, welman Lu <welmanwenzi@gmail.com> wrote:
>
>> Hi, Saliya,
>>
>> If you mean the part files, I think you are talking about the results of
>> the reduce function that are stored in HDFS, right?
>> If so, I think this example from "Hadoop: The Definitive Guide" can help you.
>>
>> -------------
>> Example 3-1. Displaying files from a Hadoop filesystem on standard output
>> using a URLStreamHandler
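>> import java.io.InputStream;
>> import java.net.URL;
>>
>> import org.apache.hadoop.fs.FsUrlStreamHandlerFactory;
>> import org.apache.hadoop.io.IOUtils;
>>
>> public class URLCat {
>>   static {
>>     // Register Hadoop's handler so java.net.URL understands hdfs:// URLs.
>>     // This method can be called only once per JVM.
>>     URL.setURLStreamHandlerFactory(new FsUrlStreamHandlerFactory());
>>   }
>>
>>   public static void main(String[] args) throws Exception {
>>     InputStream in = null;
>>     try {
>>       // args[0] is the URL of the file to display.
>>       in = new URL(args[0]).openStream();
>>       // Copy to stdout with a 4096-byte buffer; false leaves the streams open.
>>       IOUtils.copyBytes(in, System.out, 4096, false);
>>     } finally {
>>       IOUtils.closeStream(in);
>>     }
>>   }
>> }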
>>
>> Give it a try. Good luck!
>>
>>
>> Best Regards
>> Jiamin Lu
>
>
>
>
> --
> Saliya Ekanayake
> http://www.esaliya.blogspot.com
> http://www.esaliya.wordpress.com
>



-- 
Saliya Ekanayake
http://www.esaliya.blogspot.com
http://www.esaliya.wordpress.com
