hadoop-common-user mailing list archives

From sudhakara st <sudhakara...@gmail.com>
Subject Re: running mapreduce on different filesystems as input and output locations
Date Fri, 31 Mar 2017 17:45:09 GMT
It is not possible to write to S3 using context.write(), but it is possible if
you open an S3 file in the reducer and write to it yourself. Create an output
stream to an S3 file in the reducer's *setup()* method, like:

    Path out = new Path(/* s3 output path */);
    FileSystem fs = out.getFileSystem(context.getConfiguration());
    FSDataOutputStream fsStream = fs.create(out);
    PrintWriter writer = new PrintWriter(fsStream);

Write the reducer output to this stream in *reduce()*:
    writer.write("key: " + key + " value: " + value);

and close the stream in the *cleanup()* method:
    writer.close();

This creates one file per reducer; you can use the reducer's task ID as the
file name in S3.
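
Putting the three steps together, a sketch of such a reducer might look like
the following. The bucket/prefix "s3://my-bucket/output/" and the word-count
key/value types are hypothetical placeholders; real code needs the Hadoop jars
and an S3 filesystem connector on the classpath, plus S3 credentials in the
job configuration.

```java
import java.io.IOException;
import java.io.PrintWriter;

import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class S3WritingReducer
        extends Reducer<Text, IntWritable, Text, IntWritable> {

    private PrintWriter writer;

    @Override
    protected void setup(Context context) throws IOException {
        // One file per reducer: use the task ID to keep names unique.
        // "s3://my-bucket/output/" is a placeholder bucket and prefix.
        Path out = new Path("s3://my-bucket/output/part-"
                + context.getTaskAttemptID().getTaskID().getId());
        FileSystem fs = out.getFileSystem(context.getConfiguration());
        FSDataOutputStream fsStream = fs.create(out);
        writer = new PrintWriter(fsStream);
    }

    @Override
    protected void reduce(Text key, Iterable<IntWritable> values,
                          Context context) {
        // Write directly to the S3 stream instead of context.write().
        int sum = 0;
        for (IntWritable v : values) {
            sum += v.get();
        }
        writer.write("key: " + key + " value: " + sum + "\n");
    }

    @Override
    protected void cleanup(Context context) {
        // Closing the stream flushes the file to S3.
        writer.close();
    }
}
```

Note that output written this way bypasses the job's output committer, so
failed or speculative task attempts may leave partial files behind.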

Regards,
Sudhakara

On Tue, Mar 28, 2017 at 6:03 AM, Jae-Hyuck Kwak <jhkwak@kisti.re.kr> wrote:

> Hi,
>
> I want to run mapreduce on different filesystems as input and output
> locations.
>
> # hadoop jar examples.jar wordcount hdfs://input s3://output
>
> Is it possible?
>
> any kinds of comments will be welcome.
>
> Best regards,
> Jae-Hyuck
>
>



