hadoop-mapreduce-user mailing list archives

From Shahab Yunus <shahab.yu...@gmail.com>
Subject Re: Static variable in reducer
Date Sun, 28 Jun 2015 15:08:40 GMT
You asked a similar question earlier, so I will point to that thread and
repeat what I replied there:
http://hadoop-common.472056.n3.nabble.com/how-to-assign-unique-ID-Long-Value-in-mapper-td4078062.html
To summarize: you shouldn't rely on state shared among reducers. You need
to rethink your design.

Moving on, if you still want to do this, then in your scenario: if a reducer
fails (it runs out of memory, a disk crashes, etc.), or if speculative
execution kicks in, the task is retried under the same task ID with a new
attempt number. Your custom counter variable (the static one) should be
reinitialized, since the new attempt runs in a new JVM.
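
To make the retry behavior above concrete, here is a minimal sketch in plain
Java with no Hadoop dependency. The class name EdgeIdSketch, the 24/40 bit
split, and the use of an instance field instead of a static are assumptions
made for illustration; this is not the code attached to the original post.
In a real reducer the task ID would come from
context.getTaskAttemptID().getTaskID().getId().

```java
// Hypothetical sketch: IDs built from (task ID, per-attempt counter).
final class EdgeIdSketch {
    // Assumed layout: task ID in the high bits, counter in the low 40 bits.
    static final int COUNTER_BITS = 40;
    static final long COUNTER_MASK = (1L << COUNTER_BITS) - 1;
    static final int MAX_TASK_ID = (1 << 23) - 1; // keeps the result non-negative

    private final int taskId;
    // Instance field: it starts at 0 for every new attempt, which is the
    // same effect a static gets when the retry runs in a fresh JVM.
    private long localCount = 0;

    EdgeIdSketch(int taskId) {
        if (taskId < 0 || taskId > MAX_TASK_ID) {
            throw new IllegalArgumentException("task ID out of range: " + taskId);
        }
        this.taskId = taskId;
    }

    long nextEdgeId() {
        if (localCount > COUNTER_MASK) {
            throw new IllegalStateException("counter overflow for task " + taskId);
        }
        // Cast to long before shifting; shifting an int by 40 would wrap.
        return ((long) taskId << COUNTER_BITS) | localCount++;
    }

    public static void main(String[] args) {
        EdgeIdSketch firstAttempt = new EdgeIdSketch(3);
        firstAttempt.nextEdgeId();
        firstAttempt.nextEdgeId(); // suppose the attempt fails here

        // The retry keeps task ID 3 and starts counting from 0 again.
        EdgeIdSketch retry = new EdgeIdSketch(3);
        System.out.println(Long.toHexString(retry.nextEdgeId()));
    }
}
```

The retry hands out IDs from the same range as the failed attempt, so the
scheme stays unique only as long as the output of a single attempt per task
is kept. Whether a given edge receives the same ID on a retry also depends
on the values arriving in the same order, which this sketch does not
guarantee.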

Regards,
Shahab

On Sun, Jun 28, 2015 at 6:45 AM, Ravikant Dindokar <ravikant.iisc@gmail.com>
wrote:

> Hi Hadoop user,
>
> I have a graph data file in the form of an edge list:
> <Source Vertex_id> <Sink Vertex_id>
>
> I want to assign each edge a unique ID. In the map function I emit
> (key,value) as (<Source Vertex_id>, <Sink Vertex_id>)
>
> In the reducer, for each value, I am using a combination of a static count
> variable and the task id (context.getTaskAttemptID().getTaskID().getId())
> to generate a unique ID.
>
> edgeId=(localcount <<16)|(taskId << 55);
>
> I am able to generate unique IDs.
>
> My question is: if a reducer fails, will this still work?
>
> What exactly happens when a reducer fails and is computed again?
>
> PFA source code for mapper & reducer.
>
> Thanks
> Ravikant
>
