hadoop-common-user mailing list archives

From Joey Echeverria <j...@cloudera.com>
Subject Re: questions regarding data storage and inputformat
Date Wed, 27 Jul 2011 20:40:39 GMT
You could either use a custom RecordReader or you could override the
run() method on your Mapper class to do the merging before calling the
map() method.
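
A rough sketch of the second option follows, written in plain Java with no
Hadoop classes so that it runs on its own. It assumes records that share a
key arrive next to each other, as they would in a non-splittable file written
in key order. The class name MergingRunSketch, the String key/value types, and
the "key=v1+v2" output format are illustrative assumptions, not anything from
the thread. In a real job the loop would live in an overridden
Mapper.run(Context) and read records through context.nextKeyValue(),
getCurrentKey(), and getCurrentValue().

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

// Plain-Java stand-in for the control flow of an overridden Mapper.run().
// All names here are hypothetical; no Hadoop classes are used.
class MergingRunSketch {

    // Collects what map() emits, standing in for context.write().
    final List<String> output = new ArrayList<>();

    // Stand-in for map(): called once per key with all of its merged values.
    void map(String key, List<String> mergedValues) {
        output.add(key + "=" + String.join("+", mergedValues));
    }

    // Stand-in for run(): buffers adjacent records with the same key and
    // hands each finished group to map().
    void run(Iterator<Map.Entry<String, String>> records) {
        String currentKey = null;
        List<String> buffer = new ArrayList<>();
        while (records.hasNext()) {
            Map.Entry<String, String> record = records.next();
            // Key changed: flush the finished group before starting a new one.
            if (currentKey != null && !currentKey.equals(record.getKey())) {
                map(currentKey, buffer);
                buffer = new ArrayList<>();
            }
            currentKey = record.getKey();
            // With Hadoop Writables, copy the value before buffering it,
            // because the record reader may reuse the same object.
            buffer.add(record.getValue());
        }
        // Flush the last group, since no key change follows it.
        if (currentKey != null) {
            map(currentKey, buffer);
        }
    }

    // Small usage example: three records, two distinct keys.
    static List<String> demo() {
        List<Map.Entry<String, String>> records = new ArrayList<>();
        records.add(new AbstractMap.SimpleEntry<>("a", "1"));
        records.add(new AbstractMap.SimpleEntry<>("a", "2"));
        records.add(new AbstractMap.SimpleEntry<>("b", "3"));
        MergingRunSketch sketch = new MergingRunSketch();
        sketch.run(records.iterator());
        return sketch.output;
    }
}
```

The same buffering could instead sit inside a custom RecordReader, in which
case the Mapper would stay unchanged and simply receive the merged records.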

-Joey

On Wed, Jul 27, 2011 at 11:09 AM, Tom Melendez <tom@supertom.com> wrote:
>>
>>> 3. Another idea might be to create separate sequence files for each
>>> chunk of records and make them non-splittable, ensuring that each
>>> file goes to a single mapper.  Assuming I can get away with this, do
>>> you see any pros/cons with that approach?
>>
>> Separate sequence files would require the least amount of custom code.
>>
>
> Thanks for the response, Joey.
>
> So, if I were to do the above, I would still need a custom record
> reader to put all the keys and values together, right?
>
> Thanks,
>
> Tom
>
>



-- 
Joseph Echeverria
Cloudera, Inc.
443.305.9434
