From: bharath vissapragada
Date: Wed, 21 Nov 2012 20:12:23 +0530
Subject: Re: reducer not starting
To: user@hadoop.apache.org

As Harsh suggested, you might want to check the task logs on the slaves
(you can do it through the web UI by clicking on the map/reduce task
links) and see if there are any exceptions.
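Since the shebang came up below: a streaming mapper normally starts with
"#!/usr/bin/env python" and just reads stdin and writes tab-separated
key/value pairs to stdout. This is only a rough sketch (the file name
mapper.py and the "emit each line with a count of 1" logic are
placeholders, not your actual script):

#!/usr/bin/env python
# mapper.py -- illustrative streaming mapper (placeholder logic)
import sys

for line in sys.stdin:
    line = line.strip()
    if not line:
        continue
    # emit key<TAB>value; swap in your real parsing here
    sys.stdout.write('%s\t%d\n' % (line, 1))

Also make sure the script is executable (chmod +x mapper.py) and is
shipped to the nodes with the -file option of the streaming jar; if the
interpreter line or the permissions are wrong, every map attempt dies
the same way and you see the map percentage bouncing around like in the
log further down.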
On Wed, Nov 21, 2012 at 8:06 PM, jamal sasha wrote:
> Hi,
>   Thanks for the insights.
> I noticed that these restarts of mappers were because in the shebang I had
> /usr/bin/env instead of /usr/bin/env python.
> Any clue what was going on with reducers not starting but mappers being
> executed again and again?
> Probably a very naive question, but I am a newbie you see :)
>
> On Wednesday, November 21, 2012, Jean-Marc Spaggiari
> <jean-marc@spaggiari.org> wrote:
> > Just FYI, you don't need to stop the job, update the host, and retry.
> >
> > Just update the host while the job is running and it should retry and
> > restart.
> >
> > I had a similar issue with one of my nodes where the hosts file was
> > not updated. After the update it automatically resumed the work...
> >
> > JM
> >
> > 2012/11/21, praveenesh kumar:
> >> Sometimes it's a network issue: reducers are not able to find the
> >> hostnames or IPs of the other machines. Make sure your /etc/hosts
> >> entries and hostnames are correct.
> >>
> >> Regards,
> >> Praveenesh
> >>
> >> On Tue, Nov 20, 2012 at 10:46 PM, Harsh J wrote:
> >>
> >>> Your mappers are failing (possibly a user-side error or an
> >>> environmental one) and are being reattempted by the framework
> >>> (default behavior: each task is attempted 4 times to ride over
> >>> transient failures).
> >>>
> >>> Visit your job's logs in the JobTracker web UI to find more
> >>> information on why your tasks fail.
> >>>
> >>> On Tue, Nov 20, 2012 at 10:22 PM, jamal sasha wrote:
> >>> >
> >>> > I am not sure what's happening, but I wrote a simple mapper and
> >>> > reducer script.
> >>> >
> >>> > And I am testing it against a small dataset (just a few lines long).
> >>> >
> >>> > For some reason the reducer is just not starting, and the mapper is
> >>> > executing again and again:
> >>> >
> >>> > 12/11/20 09:21:18 INFO streaming.StreamJob:  map 0%  reduce 0%
> >>> > 12/11/20 09:22:05 INFO streaming.StreamJob:  map 50%  reduce 0%
> >>> > 12/11/20 09:22:10 INFO streaming.StreamJob:  map 100%  reduce 0%
> >>> > 12/11/20 09:32:05 INFO streaming.StreamJob:  map 50%  reduce 0%
> >>> > 12/11/20 09:32:11 INFO streaming.StreamJob:  map 0%  reduce 0%
> >>> > 12/11/20 09:32:20 INFO streaming.StreamJob:  map 50%  reduce 0%
> >>> > 12/11/20 09:32:31 INFO streaming.StreamJob:  map 100%  reduce 0%
> >>> > 12/11/20 09:42:20 INFO streaming.StreamJob:  map 50%  reduce 0%
> >>> > 12/11/20 09:42:31 INFO streaming.StreamJob:  map 0%  reduce 0%
> >>> > 12/11/20 09:42:32 INFO streaming.StreamJob:  map 50%  reduce 0%
> >>> > 12/11/20 09:42:50 INFO streaming.StreamJob:  map 100%  reduce 0%
> >>> >
> >>> > Let me know if you want the code also.
> >>> >
> >>> > Any clues of where I am going wrong?
> >>> >
> >>> > Thanks
> >>>
> >>> --
> >>> Harsh J
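For completeness, the reducer side of the same sketch would look
something like the following. It assumes the usual streaming contract
(lines of key<TAB>value, sorted by key) and simply sums the counts per
key, so treat it as an illustration rather than your actual reducer.py:

#!/usr/bin/env python
# reducer.py -- illustrative streaming reducer (placeholder logic).
# Streaming sorts map output by key, so identical keys arrive grouped together.
import sys

current_key = None
total = 0

for line in sys.stdin:
    key, sep, value = line.rstrip('\n').partition('\t')
    if key != current_key:
        if current_key is not None:
            sys.stdout.write('%s\t%d\n' % (current_key, total))
        current_key = key
        total = 0
    try:
        total += int(value)
    except ValueError:
        # skip malformed lines in this sketch
        continue

if current_key is not None:
    sys.stdout.write('%s\t%d\n' % (current_key, total))

The exact jar location differs between setups, but the job would be
launched with something along the lines of: hadoop jar
$HADOOP_HOME/contrib/streaming/hadoop-streaming-*.jar -input <in>
-output <out> -mapper mapper.py -reducer reducer.py -file mapper.py
-file reducer.py (the contrib path here is an assumption about a 1.x
layout, adjust it to your install).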
--
Regards,
Bharath .V
w:http://researchweb.iiit.ac.in/~bharath.v