hadoop-mapreduce-user mailing list archives

From himanshu chandola <himanshu_cool...@yahoo.com>
Subject Re: Maps getting stuck at 100%
Date Tue, 24 Nov 2009 09:36:43 GMT
The data is still the same.

I will check on logs and see if I can find something.

H

 Morpheus: Do you believe in fate, Neo?
Neo: No.
Morpheus: Why Not?
Neo: Because I don't like the idea that I'm not in control of my life.




________________________________
From: Rekha Joshi <rekhajos@yahoo-inc.com>
To: "mapreduce-user@hadoop.apache.org" <mapreduce-user@hadoop.apache.org>
Sent: Tue, November 24, 2009 4:11:01 AM
Subject: Re: Maps getting stuck at 100%

Even if the code is the same, the behavior can change if the data it processes has changed
(e.g., date-related data) or if the parameters are different (e.g., sort/spill settings on the
map side).
This looks to me like a buffering issue. The detailed logs should show exactly what is happening.

Thanks & Regards,
/R


On 11/24/09 2:18 PM, "himanshu chandola" <himanshu_coolguy@yahoo.com> wrote:


Hi Todd,

It was definitely working fine a week ago and the code hasn't changed much. On my laptop, a
pseudo-distributed installation running the same code finishes successive MapReduce iterations
quickly enough.

As far as I can see, it is probably due to reformatting the fs, but I can't understand why it
would occur this way.

tx

Himanshu

________________________________
From: Todd Lipcon <todd@cloudera.com>
To: mapreduce-user@hadoop.apache.org
Sent: Tue, November 24, 2009 2:52:51 AM
Subject: Re: Maps getting stuck at 100%

Hi Himanshu,

The map progress percentage is calculated from the amount of input read, rather than from the
processing actually done. So, if you're doing a lot of work in your mapper, or reading ahead of
what you've processed, you'll see this behavior reasonably often. It can also show up in
streaming jobs if you are doing a lot of work per row, since there is more buffering going on
between the counters and your actual mapper work.
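
To make the point above concrete, here is a toy model in Python. It is only a sketch and not Hadoop's actual progress code: the function name `simulate` and the `read_ahead` parameter are made up for illustration, and it assumes a reader that buffers a fixed number of records ahead of the mapper. It shows how a progress figure based on records read can reach 100% while some records are still waiting to be processed.

```python
from collections import deque


def simulate(total_records, read_ahead):
    """Toy model: progress is reported from records *read*, not records *processed*.

    Returns one (reported_progress, fraction_actually_processed) pair
    per processed record.
    """
    buffer = deque()
    read = 0
    processed = 0
    samples = []
    while processed < total_records:
        # The reader fills its buffer as far ahead as it is allowed to.
        while read < total_records and len(buffer) < read_ahead:
            buffer.append(read)
            read += 1
        buffer.popleft()  # the mapper handles one buffered record
        processed += 1
        samples.append((read / total_records, processed / total_records))
    return samples


if __name__ == "__main__":
    for reported, done in simulate(total_records=10, read_ahead=4):
        print(f"reported {reported:4.0%}   actually done {done:4.0%}")
```

With 10 records and a read-ahead of 4, the reported figure reaches 100% when only 7 of the 10 records have been processed. If each remaining record is expensive, the task can sit at 100% for a long time.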

The easiest way to see what the tasks are doing is to drill down to the logs for an individual
task that's stuck at 100%. If you add some logging output to your program, that can be helpful.
Another trick, if you have the right access, is to ssh into your tasktracker node and send the
SIGQUIT signal to one of your task pids. This will make the task dump its stack to its stdout
log, which you can then inspect to understand what's going on.
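
As one possible way to add the logging suggested above, here is a minimal sketch of a Hadoop Streaming mapper in Python. The thread does not say whether the job in question is a streaming job, so that is an assumption. The function name `run_mapper`, the `report_every` parameter, and the `Debug`/`RowsProcessed` counter names are hypothetical. The sketch relies on the Hadoop Streaming convention that lines written to stderr in the form `reporter:status:<message>` and `reporter:counter:<group>,<counter>,<amount>` are picked up as status and counter updates.

```python
import sys
import time


def run_mapper(lines, out=sys.stdout, err=sys.stderr, report_every=1000):
    """Streaming-style mapper that logs its own progress to stderr."""
    start = time.time()
    count = 0
    for count, line in enumerate(lines, start=1):
        key = line.rstrip("\n")
        # Placeholder for the real (possibly slow) per-row work.
        out.write(f"{key}\t1\n")
        if count % report_every == 0:
            elapsed = time.time() - start
            # Hadoop Streaming treats these stderr lines as status/counter updates.
            err.write(f"reporter:status:processed {count} rows in {elapsed:.0f}s\n")
            err.write(f"reporter:counter:Debug,RowsProcessed,{report_every}\n")
            err.flush()
    # Plain stderr output ends up in the task's stderr log.
    err.write(f"mapper finished after {count} rows\n")
    return count

# In the actual mapper script, finish with: run_mapper(sys.stdin)
```

The status line and the final message both carry a row count, so the task logs show how far the mapper really got after the progress figure hit 100%.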

Hope that helps
-Todd

On Mon, Nov 23, 2009 at 11:48 PM, himanshu chandola <himanshu_coolguy@yahoo.com> wrote:

Hi,

I use Cloudera's distribution for Hadoop. What I see is that a small fraction of maps get stuck
at 100%. They show up as 100% but continue running. After a long delay they finally succeed,
but it takes a while, around 10 minutes from the time they first show up as 100%.

We recently reformatted our Hadoop fs. Could it be related to that?

Thanks