From: Rekha Joshi
To: "mapreduce-user@hadoop.apache.org"
Date: Tue, 24 Nov 2009 01:11:01 -0800
Subject: Re: Maps getting stuck at 100%
In-Reply-To: <508988.37863.qm@web38404.mail.mud.yahoo.com>
Mailing-List: contact mapreduce-user-help@hadoop.apache.org; run by ezmlm
Reply-To: mapreduce-user@hadoop.apache.org
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary="_000_C731A0FD4865rekhajosyahooinccom_"

--_000_C731A0FD4865rekhajosyahooinccom_
Content-Type: text/html; charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable

Even if the code is the same, the behavior can change if the data it processes has changed (e.g., date-related data) or the parameters are different (e.g., sort/spill on the map side).
Seems to me to be related to a buffering concern. The detailed logs can point out what exactly is happening.

Thanks & Regards,
/R


On 11/24/09 2:18 PM, "himanshu chandola" <himanshu_coolguy@yahoo.com> wrote:

Hi Todd,
It was definitely working fine a week ago, and the code hasn't changed much. On my laptop, a pseudo-distributed installation running the same code finishes successive map reduce iterations quickly enough.

As far as I can see, it is probably due to reformatting the fs. But I can't understand why it occurs this way.

tx

Himanshu
 
Morpheus: Do you believe in fate, Neo?
Neo: No.
Morpheus: Why Not?
Neo: Because I don't like the idea that I'm not in control of my life.



From: Todd Lipcon <todd@cloudera.com>
To: mapreduce-user@hadoop.apache.org
Sent: Tue, November 24, 2009 2:52:51 AM
Subject: Re: Maps getting stuck at 100%

Hi Himanshu,

The map progress percentage is calculated based on the input read, rather than the processing actually done. So, if you're doing a lot of work in your mapper, or reading ahead of what you've processed, you'll see this behavior reasonably often. It can also show up sometimes in streaming jobs if you are doing a lot of work per row, since there is more buffering going on between the counters and your actual mapper work.
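
To make the read-ahead effect concrete, here is a minimal Python simulation. It is only a sketch and not Hadoop's actual progress code: the function name `simulate`, the record list, and the `buffer_size` parameter are all made up for illustration. It models a reader that fills a buffer ahead of the mapper and reports progress as the fraction of input read.

```python
from collections import deque


def simulate(records, buffer_size):
    """Return (reported_progress, fraction_processed) after each record.

    Hypothetical model: 'reported_progress' counts input that has been
    read into the buffer, not input the mapper has finished with.
    buffer_size must be at least 1.
    """
    total = len(records)
    buffered = deque()
    read = 0
    processed = 0
    history = []
    while processed < total:
        # The reader tops up its buffer ahead of the mapper.
        while read < total and len(buffered) < buffer_size:
            buffered.append(records[read])
            read += 1
        # The mapper finishes exactly one record per step.
        buffered.popleft()
        processed += 1
        history.append((read / total, processed / total))
    return history
```

With 10 records and a buffer of 4, the reported progress reaches 1.0 once the seventh record is done, while three records are still waiting in the buffer. With a buffer of 1 the two numbers stay equal, which is why the gap is more visible the more buffering sits between the input counters and the mapper.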

The easiest way to see what the tasks are doing is to drill down to the logs for an individual task that's stuck at 100%. If you add some logging output to your program, that can be helpful. Another trick, if you have the right access, is to ssh into your tasktracker node and send the SIGQUIT signal to one of your task pids - this will make it dump stack to its stdout log, which you can then inspect to understand what's going on.
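
As one way to add such logging, here is a small sketch of a Hadoop Streaming mapper in Python. It assumes a streaming job (the original poster's job may well be a Java mapper), and the function name `run_mapper`, the `report_every` parameter, and the pass-through "work" are hypothetical placeholders. It relies on the streaming convention that a line of the form `reporter:status:<message>` written to stderr updates the task status; other stderr output ends up in the task's stderr log.

```python
import sys
import time


def run_mapper(lines, out=sys.stdout, log=sys.stderr, report_every=1000):
    """Pass key/value rows through and log how many were really processed."""
    start = time.time()
    count = 0
    for count, line in enumerate(lines, start=1):
        key, _, value = line.rstrip("\n").partition("\t")
        # Placeholder for the real (possibly expensive) per-row work.
        out.write(f"{key}\t{value}\n")
        if count % report_every == 0:
            elapsed = time.time() - start
            # Status line in the form Hadoop Streaming reads from stderr.
            log.write(f"reporter:status:processed {count} rows in {elapsed:.0f}s\n")
            log.flush()
    # Plain log line; shows up in the task's stderr log.
    log.write(f"mapper finished after {count} rows\n")
    return count
```

In an actual streaming script you would call `run_mapper(sys.stdin)` at the bottom of the file. If a task sits at 100%, the last status line then tells you how far the mapper really got.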

Hope that helps
-Todd

On Mon, Nov 23, 2009 at 11:48 PM, himanshu chandola <himanshu_coolguy@yahoo.com> wrote:
Hi,
I use Cloudera's distribution for Hadoop. What I see is that a small fraction of maps get stuck at 100%. They show up as 100% but continue running. After a long delay they finally succeed, but it takes a while, like 10 mins from the time when they show up as 100%.

We recently reformatted our Hadoop fs. Could it be related to that?


Thanks










 
--_000_C731A0FD4865rekhajosyahooinccom_--