hadoop-mapreduce-user mailing list archives

From Harsh J <ha...@cloudera.com>
Subject Re: How to use the reduce result in next code part
Date Mon, 11 Jun 2012 03:22:23 GMT

Can the logic that needs rowkey_list not be implemented within a
single reduce(key, <List> values) call itself? If you need the
whole list before processing, and the lists are small, then
collecting cloned copies of the values in memory is another way out.
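
To illustrate why the copies need to be cloned: Hadoop reuses the same
value object across iterations inside reduce(), so holding on to the
references gives you N pointers to the last value. Below is a minimal,
Hadoop-free sketch of that behaviour. MutableKey, reusingIterable and
the collect* methods are hypothetical names made up for this
illustration; MutableKey only stands in for a reused Writable such as
Text. In a real reducer the equivalent copy would be something like
new Text(value) before adding it to the list.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;

public class CloneValuesSketch {

    /** Hypothetical mutable holder standing in for a reused Writable. */
    public static final class MutableKey {
        private String value = "";
        void set(String v) { value = v; }
        String get() { return value; }
        MutableKey copy() {
            MutableKey c = new MutableKey();
            c.set(value);
            return c;
        }
    }

    /** Hands out the SAME MutableKey instance on every next() call,
     *  mimicking the object reuse seen inside reduce(). */
    public static Iterable<MutableKey> reusingIterable(final List<String> raw) {
        final MutableKey shared = new MutableKey();
        return () -> new Iterator<MutableKey>() {
            private final Iterator<String> it = raw.iterator();
            @Override public boolean hasNext() { return it.hasNext(); }
            @Override public MutableKey next() {
                shared.set(it.next());
                return shared;
            }
        };
    }

    /** Buggy: stores references, all of which point at the shared object. */
    public static List<String> collectWithoutCopy(Iterable<MutableKey> values) {
        List<MutableKey> held = new ArrayList<>();
        for (MutableKey v : values) {
            held.add(v);
        }
        return toSortedStrings(held);
    }

    /** Safe: stores an independent copy of each value before moving on. */
    public static List<String> collectWithCopy(Iterable<MutableKey> values) {
        List<MutableKey> held = new ArrayList<>();
        for (MutableKey v : values) {
            held.add(v.copy());
        }
        return toSortedStrings(held);
    }

    private static List<String> toSortedStrings(List<MutableKey> held) {
        List<String> out = new ArrayList<>();
        for (MutableKey k : held) {
            out.add(k.get());
        }
        Collections.sort(out);
        return out;
    }

    public static void main(String[] args) {
        List<String> raw = Arrays.asList("row3", "row1", "row2");
        System.out.println(collectWithoutCopy(reusingIterable(raw))); // [row2, row2, row2]
        System.out.println(collectWithCopy(reusingIterable(raw)));    // [row1, row2, row3]
    }
}
```

This only makes sense when the full list fits comfortably in the
reducer's heap, as noted above.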

On Mon, Jun 11, 2012 at 8:39 AM, Liu, Keyan (NSN - CN/Beijing)
<keyan.liu@nsn.com> wrote:
> Hi All,
> I am using MapReduce to scan an HBase region to get the rowkey_list
> related to one query.
> In the map phase, each mapper outputs a partial rowkey_list. In the reduce
> phase, the reducer collects and sorts all the rowkeys.
> If I need to use the rowkey_list result of the reduce, how can I transport
> the rowkey_list outside the reducer?
> I have tried writing the reduce output to HDFS “/part-r-00000” and then
> reading the result back from HDFS, but I found this too inefficient.
> How can I use the reduce result in the next part of my code? Is there an
> API or example that I can use?
> Thanks.
> Regards,
> William Liu

Harsh J
