hadoop-common-user mailing list archives

From Tom Melendez <...@supertom.com>
Subject Re: Custom FileOutputFormat / RecordWriter
Date Mon, 25 Jul 2011 20:35:13 GMT
Hi Robert,

In this specific case, that's OK.  I'll never write to the same file
from two different mappers.  Otherwise, do you think it's cool?  I
haven't played with the OutputFormat classes before.
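
For reference, here is a rough sketch of the plan quoted below: the key only names the file, and the value is written to it unmodified. It is plain Java against the local filesystem, not Hadoop code. The class name KeyAsFilenameWriter and its methods are made up for illustration. In a real job this logic would presumably sit in the RecordWriter returned by a custom FileOutputFormat's getRecordWriter, with files opened through Hadoop's FileSystem API under the job's output path. Opening with CREATE_NEW is an assumption that mirrors the constraint above: each file is written exactly once.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical stand-in for a Hadoop RecordWriter; uses local files, not HDFS.
class KeyAsFilenameWriter {
    // Plays the role of the output path given when the job is run.
    private final Path outputDir;

    KeyAsFilenameWriter(Path outputDir) throws IOException {
        this.outputDir = Files.createDirectories(outputDir).toAbsolutePath().normalize();
    }

    // The key is used only as the file name; only the value bytes are written.
    void write(String key, byte[] value) throws IOException {
        Path target = outputDir.resolve(key).normalize();
        if (!target.startsWith(outputDir)) {
            throw new IOException("key escapes the output directory: " + key);
        }
        Files.createDirectories(target.getParent());
        // CREATE_NEW fails if the file already exists, so no file is written twice.
        try (OutputStream out = Files.newOutputStream(target, StandardOpenOption.CREATE_NEW)) {
            out.write(value);
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("key-as-filename");
        KeyAsFilenameWriter writer = new KeyAsFilenameWriter(dir);
        writer.write("image-0001.raw", new byte[] {1, 2, 3});
        System.out.println(Files.size(dir.resolve("image-0001.raw")) + " bytes written");
    }
}
```

A second write to the same key throws FileAlreadyExistsException here, which is one way to surface the "same file from two writers" problem early instead of corrupting output.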

Thanks,

Tom

On Mon, Jul 25, 2011 at 1:30 PM, Robert Evans <evans@yahoo-inc.com> wrote:
> Tom,
>
> That assumes that you will never write to the same file from two different mappers or
> processes.  HDFS currently does not support writing to a single file from multiple processes.
>
> --Bobby
>
> On 7/25/11 3:25 PM, "Tom Melendez" <tom@supertom.com> wrote:
>
> Hi Folks,
>
> Just doing a sanity check here.
>
> I have a map-only job, which produces a filename for a key and data as
> a value.  I want to write the value (data) into the key (filename) in
> the path specified when I run the job.
>
> The value (data) doesn't need any formatting, I can just write it to
> HDFS without modification.
>
> So, looking at this link (the Output Formats section):
>
> http://developer.yahoo.com/hadoop/tutorial/module5.html
>
> Looks like I want to:
> - create a new output format
> - override write, and tell it not to call writekey, as I don't want the key written
> - add a new getRecordWriter method that uses the key as the filename and
> calls my outputformat
>
> Sound reasonable?
>
> Thanks,
>
> Tom
>
> --
> ===================
> Skybox is hiring.
> http://www.skyboximaging.com/careers/jobs
>
>



