Subject: Remote client shutdown error
From: Frank Austin Nothaft
Date: Mon, 9 Dec 2013 14:01:23 -0800
To: "user@spark.incubator.apache.org"
Cc: "jey@cs.berkeley.edu"

Hi all,

I am getting a remote client shutdown error when running my application using spark-0.8.0-incubating on EC2; Jey and I have been looking at this most of today. My cluster is set up using the spark-ec2 scripts. We are running two applications, one of which binds successfully to the Spark master and runs. The second application does not bind to the master, and throws the following error:

2013-12-09 21:40:09 ERROR Client$ClientActor:64 - Connection to master failed; stopping client
2013-12-09 21:40:09 ERROR SparkDeploySchedulerBackend:64 - Disconnected from Spark cluster!
2013-12-09 21:40:09 ERROR ClusterScheduler:64 - Exiting due to error from cluster scheduler: Disconnected from Spark cluster

Even after this failure, I am still able to run the first command again. In the application, we've set logging to debug and get no additional information on the cause of the error. We also tried setting the Spark master logging level to debug, but do not see any increase in logging.
Through jdb, we traced this to org.apache.spark.deploy.client.Client$ClientActor.receive, where the RemoteClientShutdown case matches, and markDisconnected is called. However, we have not been able to debug further. Any advice would be greatly appreciated.

Regards,

Frank Austin Nothaft
fnothaft@berkeley.edu
fnothaft@eecs.berkeley.edu
202-340-0466