hadoop-mapreduce-user mailing list archives

From Shahab Yunus <shahab.yu...@gmail.com>
Subject Re: Static variable in reducer
Date Sun, 28 Jun 2015 15:08:40 GMT
You asked a similar question earlier, so I will repeat here what I replied
then.
To summarize: you shouldn't share common state among reducers. You need to
rethink your design.

Moving on, if you still want to do this, then in your scenario: if a reducer
fails (it runs out of memory, a disk crashes, etc.) or is re-run under
speculative execution, the new attempt is given a new attempt number but keeps
the old task id. Your custom counter variable (the static one) should be
reinitialized, because the new attempt runs in a new JVM.
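
As a rough sketch of what that means for the generated IDs (this is not taken
from your attached code): the class name EdgeIdSketch and the 15-bit/48-bit
split below are my own assumptions for illustration, not your bit layout. In a
real reducer the task id would come from
context.getTaskAttemptID().getTaskID().getId(). The counter is an instance
field rather than a static, and the task id is widened to long before
shifting.

```java
// Hypothetical helper, plain Java with no Hadoop dependency.
// Assumed layout: [sign bit = 0][15 bits task id][48 bits per-task counter].
class EdgeIdSketch {
    static final int COUNTER_BITS = 48;
    static final long COUNTER_MASK = (1L << COUNTER_BITS) - 1;

    private final long taskId;   // stored as long so the shift below is 64-bit
    private long localCount = 0; // starts at zero for every new attempt

    EdgeIdSketch(int taskId) {
        if (taskId < 0 || taskId >= (1 << 15)) {
            throw new IllegalArgumentException("task id does not fit in 15 bits");
        }
        this.taskId = taskId;
    }

    long nextEdgeId() {
        if (localCount > COUNTER_MASK) {
            throw new IllegalStateException("per-task counter overflow");
        }
        // Task id in the high bits, counter in the low bits.
        return (taskId << COUNTER_BITS) | localCount++;
    }

    public static void main(String[] args) {
        EdgeIdSketch firstAttempt = new EdgeIdSketch(3);
        long first = firstAttempt.nextEdgeId();

        // A retried attempt keeps task id 3 and starts with a fresh counter.
        EdgeIdSketch retriedAttempt = new EdgeIdSketch(3);
        long retried = retriedAttempt.nextEdgeId();

        // A different task never collides because the high bits differ.
        long otherTask = new EdgeIdSketch(4).nextEdgeId();

        System.out.println("retry reuses the same range: " + (first == retried));
        System.out.println("other task is distinct: " + (first != otherTask));
    }
}
```

So a retried attempt hands out IDs from the same range as the attempt it
replaces, which is only safe as long as the output of one attempt per task
ends up in the final result.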


On Sun, Jun 28, 2015 at 6:45 AM, Ravikant Dindokar <ravikant.iisc@gmail.com> wrote:

> Hi Hadoop user,
> I have a graph data file in the form of an edge list
> <Source Vertex_id> <Sink Vertex_id>
> I want to assign each edge a unique ID. In the map function I emit
> (key,value) as (<Source Vertex_id>, <Sink Vertex_id>)
> In the reducer, for each value, I am using a combination of a static count
> variable, and task id (context.getTaskAttemptID().getTaskID().getId()) to
> generate a unique ID.
> edgeId=(localcount <<16)|(taskId << 55);
> I am able to generate unique IDs.
> My question is: if a reducer fails, will this work?
> What exactly happens when a reducer fails and is computed again?
> PFA source code for mapper & reducer.
> Thanks
> Ravikant
