hadoop-common-user mailing list archives

From Joey Echeverria <j...@cloudera.com>
Subject Re: Can I number output results with a Counter?
Date Fri, 20 May 2011 17:01:26 GMT
To make sure I understand you correctly, you need a globally unique,
sequentially increasing number for each output record?

If you have an upper bound on the number of records a single reducer
can output and you can afford gaps in the numbering, you could
multiply the task id by that upper bound and count up by one from
there. Each reducer then numbers its records from its own
non-overlapping range.
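
A minimal sketch of that scheme in plain Java, under stated assumptions:
the class name BlockIdAllocator and the parameter maxRecordsPerTask are
made up for illustration, and the task id is passed in as a plain int.
In a real reducer it would probably come from something like
context.getTaskAttemptID().getTaskID().getId(), but that wiring is not
shown here.

```java
/** Hands out ids from the block of numbers reserved for one reducer task. */
final class BlockIdAllocator {
    private final long end;   // exclusive upper bound of this task's block
    private long nextId;      // next id to hand out

    BlockIdAllocator(int taskId, long maxRecordsPerTask) {
        if (taskId < 0 || maxRecordsPerTask <= 0) {
            throw new IllegalArgumentException("taskId must be >= 0 and maxRecordsPerTask > 0");
        }
        // Block start = taskId * maxRecordsPerTask; fail loudly on overflow.
        this.nextId = Math.multiplyExact((long) taskId, maxRecordsPerTask);
        this.end = Math.addExact(this.nextId, maxRecordsPerTask);
    }

    /** Returns the next id in this task's block. */
    long next() {
        if (nextId >= end) {
            // The assumed upper bound was too small; continuing would
            // collide with the next task's block.
            throw new IllegalStateException("task produced more records than maxRecordsPerTask");
        }
        return nextId++;
    }

    public static void main(String[] args) {
        BlockIdAllocator reducer3 = new BlockIdAllocator(3, 1_000_000L);
        System.out.println(reducer3.next()); // 3000000
        System.out.println(reducer3.next()); // 3000001
    }
}
```

The ids are unique across tasks but not consecutive: any reducer that
emits fewer than maxRecordsPerTask records leaves a gap before the next
block starts.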

If that doesn't work for you, then you'll need to use some kind of
central service for allocating numbers which could become a
bottleneck.
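
To illustrate why a central allocator gives gap-free numbers but can
become a point of contention, here is a rough in-process stand-in. The
class name CentralIdAllocator is hypothetical, and threads stand in for
reducers. A real deployment would need a networked service, which is not
sketched here.

```java
import java.util.concurrent.atomic.AtomicLong;

/** In-process stand-in for a central numbering service. */
final class CentralIdAllocator {
    private final AtomicLong nextId = new AtomicLong(0);

    /** Every caller goes through this one counter, so ids have no gaps. */
    long allocate() {
        return nextId.getAndIncrement();
    }

    public static void main(String[] args) throws InterruptedException {
        CentralIdAllocator allocator = new CentralIdAllocator();
        Thread[] reducers = new Thread[4];
        for (int i = 0; i < reducers.length; i++) {
            // Each simulated reducer asks for one id per output record.
            reducers[i] = new Thread(() -> {
                for (int r = 0; r < 1000; r++) {
                    allocator.allocate();
                }
            });
            reducers[i].start();
        }
        for (Thread t : reducers) {
            t.join();
        }
        // 4 x 1000 ids (0..3999) were handed out, so the next one is 4000.
        System.out.println(allocator.allocate()); // 4000
    }
}
```

Every record costs one request against the shared counter, which is
where the bottleneck would show up once that request has to cross the
network.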

-Joey

On Fri, May 20, 2011 at 9:55 AM, Mark Kerzner <markkerzner@gmail.com> wrote:
> Hi, can I use a Counter to give each record in all reducers a consecutive
> number? Currently I am using a single Reducer, but it is an anti-pattern.
> But I need to assign consecutive numbers to all output records in all
> reducers, and it does not matter how, as long as each gets its own number.
>
> If it IS possible, then how are multiple processes accessing those counters
> without creating race conditions?
>
> Thank you,
>
> Mark
>



-- 
Joseph Echeverria
Cloudera, Inc.
443.305.9434
