hadoop-common-user mailing list archives

From Robert Evans <ev...@yahoo-inc.com>
Subject Re: Custom FileOutputFormat / RecordWriter
Date Mon, 25 Jul 2011 20:30:41 GMT
Tom,

That works only if you never write to the same file from two different mappers or processes.
HDFS currently does not support writing to a single file from multiple processes.

--Bobby

On 7/25/11 3:25 PM, "Tom Melendez" <tom@supertom.com> wrote:

Hi Folks,

Just doing a sanity check here.

I have a map-only job, which produces a filename for a key and data as
a value.  I want to write the value (data) into the key (filename) in
the path specified when I run the job.

The value (data) doesn't need any formatting; I can just write it to
HDFS without modification.

So, looking at this link (the Output Formats section):

http://developer.yahoo.com/hadoop/tutorial/module5.html

It looks like I want to:
- create a new output format
- override write so that it does not call writekey, as I don't want the key written
- add a new getRecordWriter method that uses the key as the filename and
calls my outputformat
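
The plan above can be sketched without Hadoop, using only java.nio so that it runs standalone. This is a rough illustration, not a real output format: the class name KeyAsFilenameWriter and its write method are hypothetical, and local files stand in for HDFS. In an actual job the same logic would presumably sit in a RecordWriter returned by a custom FileOutputFormat's getRecordWriter. The sketch refuses to write a file that already exists, as a stand-in for the constraint that each file has only one writer.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical, Hadoop-free stand-in for a RecordWriter whose key names the output file. */
class KeyAsFilenameWriter {
    private final Path outputDir;

    KeyAsFilenameWriter(Path outputDir) throws IOException {
        // Normalize so the containment check in write() compares like with like.
        this.outputDir = Files.createDirectories(outputDir.toAbsolutePath().normalize());
    }

    /** Writes only the value bytes; the key becomes the file name and is never written. */
    void write(String key, byte[] value) throws IOException {
        Path target = outputDir.resolve(key).normalize();
        if (!target.startsWith(outputDir)) {
            throw new IllegalArgumentException("key escapes the output directory: " + key);
        }
        Files.createDirectories(target.getParent());
        // CREATE_NEW throws FileAlreadyExistsException if the file is already there,
        // so two records with the same key cannot silently share one file.
        try (OutputStream out = Files.newOutputStream(
                target, StandardOpenOption.CREATE_NEW, StandardOpenOption.WRITE)) {
            out.write(value);
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("key-as-filename");
        KeyAsFilenameWriter writer = new KeyAsFilenameWriter(dir);
        writer.write("part-a.dat", "raw payload".getBytes(StandardCharsets.UTF_8));
        System.out.println(Files.readAllLines(dir.resolve("part-a.dat")));
    }
}
```

If two mappers could emit the same key, the file name would need something task-specific added to it, which this sketch does not attempt.
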

Sound reasonable?

Thanks,

Tom


