From: Sean Mackrory
Date: Thu, 16 May 2013 10:52:48 -0700
Subject: Re: Bigtop: Invalid shuffle port number -1 returned
To: user@bigtop.apache.org

Hi Vaughn,

The issue you're running into has been reported before
(https://issues.apache.org/jira/browse/BIGTOP-764) but has not been solved
yet, so any additional information you can provide about your setup would be
helpful in tracking down the root cause. When I encountered the problem, I
had to restart the services a couple of times, but after a successful
startup I never saw the problem again on that cluster.

On Thu, May 16, 2013 at 7:01 AM, Vaughn E Clinton
<Vaughn.E.Clinton@raytheon.com> wrote:

> In an attempt to build a cluster solution from my Bigtop 0.5 installation,
> I'm running into the following stack dump every time I start the
> nodemanager of a slave node.
> If I stop the nodemanager, the test completes successfully.
>
> Anyway, has anyone seen a really detailed document about clustering with
> Bigtop 0.5, and if so, can you point me to the site?
>
> Stack dump from one of the attempts:
>
> 13/05/16 08:54:03 INFO mapreduce.Job: Task Id : attempt_1368710898922_0007_m_000008_0, Status : FAILED
> Container launch failed for container_1368710898922_0007_01_000010 :
> java.lang.IllegalStateException: Invalid shuffle port number -1 returned for attempt_1368710898922_0007_m_000008_0
>         at org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl$Container.launch(ContainerLauncherImpl.java:168)
>         at org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl$EventProcessor.run(ContainerLauncherImpl.java:390)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>         at java.lang.Thread.run(Thread.java:662)
>
> Vaughn

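For anyone gathering the setup details requested in this thread, here is a minimal sketch (not a fix, and not taken from BIGTOP-764) that lists shuffle-related properties from a Hadoop-style XML configuration file so they can be pasted into a report. The keyword filter and the function names are illustrative assumptions; the thread does not say which settings, if any, are involved.

```python
import xml.etree.ElementTree as ET

# Assumption: property names containing these substrings are worth reporting.
KEYWORDS = ("shuffle", "aux-services")


def shuffle_properties(root):
    """Return {name: value} for <property> entries whose name contains a keyword."""
    found = {}
    for prop in root.iter("property"):
        name = (prop.findtext("name") or "").strip()
        value = (prop.findtext("value") or "").strip()
        if any(keyword in name for keyword in KEYWORDS):
            found[name] = value
    return found


def load_shuffle_properties(path):
    """Parse a Hadoop-style configuration file and filter its properties."""
    return shuffle_properties(ET.parse(path).getroot())
```

Calling `load_shuffle_properties()` with the path to the node's `yarn-site.xml` (the location varies by installation) on both a working and a failing node would give a small dictionary that is easy to compare and attach to the JIRA issue.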