Mailing-List: contact mapreduce-user-help@hadoop.apache.org; run by ezmlm
Precedence: bulk
Reply-To: mapreduce-user@hadoop.apache.org
Received-SPF: neutral (athena.apache.org: local policy)
DomainKey-Signature: a=rsa-sha1; s=serpent; d=yahoo-inc.com; c=nofws; q=dns;
	h=received:from:to:date:subject:thread-topic:thread-index:
	message-id:in-reply-to:accept-language:content-language:
	x-ms-has-attach:x-ms-tnef-correlator:acceptlanguage:content-type:mime-version;
	b=OTSD+O53zfwIsTcp+PBNNuZ/omQkdCvcMK1Owvr12NZSuqJ7XQXSDcEyn3XZRppY
From: Rekha Joshi <rekhajos@yahoo-inc.com>
To: "mapreduce-user@hadoop.apache.org" <mapreduce-user@hadoop.apache.org>
Date: Tue, 24 Nov 2009 01:11:01 -0800
Subject: Re: Maps getting stuck at 100%
Thread-Topic: Maps getting stuck at 100%
Thread-Index: Acps41yAdJ1Q+tVlTmmcUgu6w3tRxgAAq4fk
Message-ID: <C731A0FD.4865%rekhajos@yahoo-inc.com>
In-Reply-To: <508988.37863.qm@web38404.mail.mud.yahoo.com>
Accept-Language: en-US
Content-Language: en
acceptlanguage: en-US
Content-Type: multipart/alternative;
	boundary="_000_C731A0FD4865rekhajosyahooinccom_"
MIME-Version: 1.0

--_000_C731A0FD4865rekhajosyahooinccom_
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable

Even if the code is the same, behavior can change if the data it processes
has changed (e.g., date-related data) or if the parameters are different
(e.g., the sort/spill settings on the map side).
This looks to me like a buffering issue. The detailed logs should show
exactly what is happening.

Thanks & Regards,
/R


On 11/24/09 2:18 PM, "himanshu chandola" <himanshu_coolguy@yahoo.com> wrote:

Hi Todd,
It was definitely working fine a week ago, and the code hasn't changed
much. On my laptop, a pseudo-distributed installation running the same code
finishes successive MapReduce iterations quickly enough.

As far as I can tell, it is probably due to reformatting the fs, but I
can't understand why it would happen this way.

tx

Himanshu

Morpheus: Do you believe in fate, Neo?
Neo: No.
Morpheus: Why Not?
Neo: Because I don't like the idea that I'm not in control of my life.


________________________________
From: Todd Lipcon <todd@cloudera.com>
To: mapreduce-user@hadoop.apache.org
Sent: Tue, November 24, 2009 2:52:51 AM
Subject: Re: Maps getting stuck at 100%

Hi Himanshu,

The map progress percentage is calculated from the input read, not from
the processing actually done. So if you're doing a lot of work in your
mapper, or reading ahead of what you've processed, you'll see this behavior
reasonably often. It can also show up in streaming jobs if you are doing a
lot of work per row, since there is more buffering between the counters
and your actual mapper work.
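
A rough sketch of the effect described above, as a plain Python simulation
rather than Hadoop code. The function name simulate and the buffer_size
parameter are made up for illustration: the "read" figure stands in for a
progress value based on input consumed, and the buffer stands in for any
read-ahead between the input and the mapper.

```python
from collections import deque


def simulate(records, buffer_size):
    """Yield (read_progress, processed_fraction) after each processed record.

    read_progress mimics a progress figure based on input consumed;
    processed_fraction is the share of records the mapper has finished.
    """
    total_bytes = sum(len(r) for r in records)
    source = iter(records)
    buffered = deque()
    bytes_read = 0
    processed = 0
    exhausted = False
    while True:
        # Read ahead until the buffer is full or the input runs out.
        while not exhausted and len(buffered) < buffer_size:
            try:
                rec = next(source)
            except StopIteration:
                exhausted = True
                break
            buffered.append(rec)
            bytes_read += len(rec)
        if not buffered:
            break
        buffered.popleft()  # stand-in for the real per-record work
        processed += 1
        yield bytes_read / total_bytes, processed / len(records)
```

With ten equal-sized records and buffer_size set to 4, the read-based
figure reaches 100% while only 7 of the 10 records have been processed.
The remaining work is invisible to a progress figure based on input alone.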

The easiest way to see what the tasks are doing is to drill down to the
logs for an individual task that's stuck at 100%. Adding some logging
output to your program can help. Another trick, if you have the right
access, is to ssh into your tasktracker node and send the SIGQUIT signal
to one of your task pids. This makes the task dump its stack to its stdout
log, which you can then inspect to understand what's going on.
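
For the logging suggestion, here is a minimal sketch of what that might
look like in a Python streaming mapper. It is an assumption that the job
is a streaming job at all, and the run_mapper name, the word-count body,
and the every parameter are placeholders. Lines written to stderr end up
in the task's stderr log, and Hadoop Streaming treats stderr lines that
start with reporter:status: as a task status update.

```python
import sys
import time


def run_mapper(lines, out=sys.stdout, log=sys.stderr, every=1000):
    """Word-count style mapper that reports what it is doing as it goes."""
    start = time.time()
    count = 0
    for count, line in enumerate(lines, start=1):
        for word in line.split():
            out.write(f"{word}\t1\n")  # placeholder for the real per-row work
        if count % every == 0:
            elapsed = time.time() - start
            # Plain log line: lands in the task's stderr log.
            log.write(f"processed {count} rows in {elapsed:.1f}s\n")
            # Status line in the form Hadoop Streaming picks up from stderr.
            log.write(f"reporter:status:processed {count} rows\n")
            log.flush()
    log.write(f"mapper done, {count} rows\n")
    return count
```

In a real mapper script you would call run_mapper(sys.stdin) at the
bottom. If the last "processed N rows" line in a stuck task's log keeps
advancing, the task is still working through buffered input rather than
hanging.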

Hope that helps
-Todd

On Mon, Nov 23, 2009 at 11:48 PM, himanshu chandola
<himanshu_coolguy@yahoo.com> wrote:
Hi,
I use Cloudera's distribution for Hadoop. What I see is that a small
fraction of maps get stuck at 100%: they show up as 100% complete but
continue running. They do eventually succeed, but only after a long delay,
around 10 minutes from the time they first show 100%.

We recently reformatted our Hadoop fs. Could it be related to that?


Thanks




--_000_C731A0FD4865rekhajosyahooinccom_--