hadoop-common-user mailing list archives

From Harsh J <ha...@cloudera.com>
Subject Re: Running map tasks after all reduces have finished
Date Thu, 23 Aug 2012 10:55:17 GMT
Thanks Jan. I'm moving this to cdh-user@cloudera.org
(http://groups.google.com/a/cloudera.org/forum/?fromgroups#!forum/cdh-user)
since it may be CDH3-specific.

Can you share your JobTracker log and the ID of a job that exhibited this
behavior, so we can track it?

On Thu, Aug 23, 2012 at 4:15 PM, Jan Lukavský
<jan.lukavsky@firma.seznam.cz> wrote:
> Hi,
>
> sorry, I forgot to mention that we are using cdh3u3.
>
> Jan
>
>
> On 23.8.2012 12:08, Harsh J wrote:
>>
>> Hey Jan,
>>
>> What version/distribution of Hadoop are you noticing this on?
>>
>> On Thu, Aug 23, 2012 at 2:55 PM, Jan Lukavský
>> <jan.lukavsky@firma.seznam.cz> wrote:
>>>
>>> Hi all,
>>>
>>> we are seeing strange JobTracker behaviour in the following scenario:
>>>   - the job finishes its map phase and starts the reduce phase
>>>   - after all reducers have completed the shuffle phase, we lose a
>>> tasktracker that isn't running any reducer, so all remaining reducers
>>> are still running in the reduce phase
>>>   - map tasks that were running on the lost tasktracker are rescheduled
>>>   - the reduces may finish earlier than the rescheduled map tasks, so the
>>> job is blocked waiting for the maps to finish, although their output is
>>> simply discarded
>>>
>>> Is this behaviour a bug or a feature? :) I haven't found any JIRA that
>>> describes it; if one exists, can anyone point me to it?
>>>
>>> Thanks,
>>>   Jan
>>>
>>
>>
>
>
> --
>
> Jan Lukavský
> programátor
> Seznam.cz, a.s.
> Radlická 608/2
> 15000, Praha 5
>
> jan.lukavsky@firma.seznam.cz
> http://www.seznam.cz
>



-- 
Harsh J
