hadoop-common-user mailing list archives

From Tom Melendez <...@supertom.com>
Subject Re: Custom FileOutputFormat / RecordWriter
Date Mon, 25 Jul 2011 22:06:45 GMT
Hi Harsh,

Thanks for the response.  Unfortunately, I'm not quite following it.  :-)

Could you elaborate a bit?

Thanks,

Tom

On Mon, Jul 25, 2011 at 2:10 PM, Harsh J <harsh@cloudera.com> wrote:
> You can use MultipleOutputs (or MultipleTextOutputFormat for direct
> key-to-file mapping, but I'd still prefer the stable MultipleOutputs).
> Your output key can be of type NullWritable, and you can keep passing
> NullWritable.get() for it on every write. This would write just the
> value, while the filenames are derived from the key inside the mapper
> code.
>
> That is, if you are not comfortable writing and maintaining your own
> code, I suppose. Your approach is correct as well, if that was
> specifically the question.
>
> On Tue, Jul 26, 2011 at 1:55 AM, Tom Melendez <tom@supertom.com> wrote:
>> Hi Folks,
>>
>> Just doing a sanity check here.
>>
>> I have a map-only job, which produces a filename for a key and data as
>> a value.  I want to write the value (data) into the key (filename) in
>> the path specified when I run the job.
>>
>> The value (data) doesn't need any formatting, I can just write it to
>> HDFS without modification.
>>
>> So, looking at this link (the Output Formats section):
>>
>> http://developer.yahoo.com/hadoop/tutorial/module5.html
>>
>> Looks like I want to:
>> - create a new output format
>> - override write, tell it not to call writekey as I don't want that written
>> - new getRecordWriter method that uses the key as the filename and
>> calls my outputformat
>>
>> Sound reasonable?
>>
>> Thanks,
>>
>> Tom
>>
>> --
>> ===================
>> Skybox is hiring.
>> http://www.skyboximaging.com/careers/jobs
>>
>
>
>
> --
> Harsh J
>
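
For readers following the thread, the idea discussed above (use the key only to choose the file, and write just the value) can be modeled without Hadoop at all. The sketch below is an illustration under stated assumptions, not code from either poster: the class name KeyAsFileNameWriter is hypothetical, it writes to the local filesystem rather than HDFS, and in a real job the same logic would sit in a RecordWriter returned by a custom FileOutputFormat, or be replaced by MultipleOutputs with a NullWritable key.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.Map;

/**
 * Hadoop-free model of a "key is the file name, value is the content" writer.
 * Hypothetical class for illustration only; it uses local files, not HDFS.
 */
class KeyAsFileNameWriter implements AutoCloseable {
    private final Path outputDir;
    // One open stream per file name, so repeated keys append to the same file.
    private final Map<String, OutputStream> streams = new HashMap<>();

    KeyAsFileNameWriter(Path outputDir) throws IOException {
        this.outputDir = Files.createDirectories(outputDir.toAbsolutePath().normalize());
    }

    /** Writes only the value; the key is used solely to pick the target file. */
    void write(String key, byte[] value) throws IOException {
        OutputStream out = streams.get(key);
        if (out == null) {
            Path target = outputDir.resolve(key).normalize();
            // Reject keys such as "../x" that would land outside the output path.
            if (!target.startsWith(outputDir)) {
                throw new IOException("Key escapes output directory: " + key);
            }
            out = Files.newOutputStream(target); // creates or truncates the file
            streams.put(key, out);
        }
        out.write(value); // raw bytes: no key, no separator, no newline
    }

    /** Closes every file that was opened by write. */
    @Override
    public void close() throws IOException {
        for (OutputStream out : streams.values()) {
            out.close();
        }
        streams.clear();
    }
}
```

Calling write("part-a.bin", data) twice and then close() leaves one file named part-a.bin under the output directory holding both values back to back, which is the "value only, key as filename" behavior the thread asks about.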



-- 
===================
Skybox is hiring.
http://www.skyboximaging.com/careers/jobs
