From: Eduardo Alfaia
To: mesos-dev@incubator.apache.org
Date: Mon, 15 Apr 2013 14:25:48 -0400
Subject: Re: Slave Removedo

Hi guys,

Using the private IP instead of the FQDN works; however, I had to change /etc/hosts.

Thanks

2013/4/15 Benjamin Mahler

> Can you try using the private IP instead? You can find it using ifconfig.
>
> On Mon, Apr 15, 2013 at 10:33 AM, Eduardo Alfaia wrote:
>
> > Hi Vinod, thanks for your fast reply.
> >
> > I'm not using EC2, but I'm using the name of the server, for example
> > blockmon1.ing.unibs.it. Could this be the cause?
> >
> > I'm using 3 nodes (1 master and 2 slaves).
> >
> > Regards
> >
> > 2013/4/15 Vinod Kone
> >
> > > Hi Eduardo,
> > >
> > > This looks like a networking issue. What is your cluster setup like?
> > >
> > > Are you running on Amazon EC2? We have seen similar behavior before when
> > > users were running Mesos on EC2. If I remember correctly, the fix was to
> > > use private IP addresses for master and slaves, instead of "localhost" or
> > > "public ip".
> > >
> > > @vinodkone
> > >
> > > On Mon, Apr 15, 2013 at 10:13 AM, Eduardo Alfaia <eduardocalfaia@gmail.com> wrote:
> > >
> > > > Hi guys,
> > > > I am new to Mesos and I am having some problems when running the
> > > > Mesos launch scripts below. Why does the master remove the slave?
> > > > I have seen something about checkpointing.
> > > >
> > > > MASTER
> > > > root@blockmon1:/opt/mesos-trunk/build/bin# ./mesos-master.sh
> > > > I0415 18:00:47.543422 17720 main.cpp:116] Build: 2013-04-14 23:48:51 by root
> > > > I0415 18:00:47.543926 17720 main.cpp:117] Starting Mesos master
> > > > I0415 18:00:47.545109 17720 master.cpp:309] Master started on 127.0.1.1:5050
> > > > I0415 18:00:47.545351 17720 master.cpp:324] Master ID: 201304151800-16842879-5050-17720
> > > > I0415 18:00:47.545819 17720 master.cpp:603] Elected as master!
> > > > W0415 18:00:47.546039 17737 master.cpp:81] No whitelist given. Advertising offers for all slaves
> > > > W0415 18:00:52.547684 17736 master.cpp:81] No whitelist given. Advertising offers for all slaves
> > > > W0415 18:00:57.550519 17736 master.cpp:81] No whitelist given. Advertising offers for all slaves
> > > >
> > > > se it is not checkpointing!
> > > > I0415 18:01:59.379076 17735 hierarchical_allocator_process.hpp:423] Removed slave 201304151800-16842879-5050-17720-28
> > > > I0415 18:02:00.379822 17737 master.cpp:968] Attempting to register slave on blockmon2 at slave(1)@127.0.1.1:36820
> > > > I0415 18:02:00.380177 17737 master.cpp:1224] Master now considering a slave at blockmon2:36820 as active
> > > > I0415 18:02:00.380561 17737 master.cpp:1862] Adding slave 201304151800-16842879-5050-17720-29 at blockmon2 with cpus=1; mem=979; ports=[31000-32000]; disk=2801
> > > > I0415 18:02:00.380813 17737 hierarchical_allocator_process.hpp:395] Added slave 201304151800-16842879-5050-17720-29 (blockmon2) with cpus=1; mem=979; ports=[31000-32000]; disk=2801 (and cpus=1; mem=979; ports=[31000-32000]; disk=2801 available)
> > > > I0415 18:02:00.381255 17734 master.cpp:537] Slave 201304151800-16842879-5050-17720-29(blockmon2) disconnected
> > > > I0415 18:02:00.381474 17734 master.cpp:542] Removing disconnected slave 201304151800-16842879-5050-17720-29(blockmon2) because it is not checkpointing!
> > > > I0415 18:02:00.381882 17735 hierarchical_allocator_process.hpp:423] Removed slave 201304151800-16842879-5050-17720-29
> > > >
> > > > Thanks guys
> > > >
> > > > --
> > > > MSc Eduardo Costa Alfaia
> > > > PhD Student
> > > > Università degli Studi di Brescia
> > >
> > > --
> > > Vinod
> >
> > --
> > MSc Eduardo Costa Alfaia
> > PhD Student
> > Università degli Studi di Brescia

--
MSc Eduardo Costa Alfaia
PhD Student
Università degli Studi di Brescia
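
A note on the 127.0.1.1 addresses in the log above: the master reports "Master started on 127.0.1.1:5050" and the slave registers as slave(1)@127.0.1.1:36820. That suggests each node's hostname resolves locally to a loopback address, which other machines cannot reach. Many Debian-style /etc/hosts files map the hostname to 127.0.1.1 by default; whether that was the case on these nodes is an assumption. The Python sketch below is one way to check what a hostname resolves to on a given node. The helper names (is_loopback, resolve_hostname, check) are illustrative and are not part of Mesos.

```python
import ipaddress
import socket
from typing import List


def is_loopback(addr: str) -> bool:
    """Return True if addr is a loopback address (e.g. anything in 127.0.0.0/8)."""
    return ipaddress.ip_address(addr).is_loopback


def resolve_hostname(name: str) -> List[str]:
    """Return the distinct IPv4 addresses that `name` resolves to on this host."""
    infos = socket.getaddrinfo(name, None, family=socket.AF_INET)
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})


def check(name: str) -> None:
    """Print each address `name` resolves to and flag loopback results."""
    try:
        addrs = resolve_hostname(name)
    except socket.gaierror as err:
        print(f"{name}: resolution failed ({err})")
        return
    for addr in addrs:
        if is_loopback(addr):
            note = "loopback, not reachable from other nodes"
        else:
            note = "non-loopback"
        print(f"{name} -> {addr} ({note})")
```

Calling check(socket.gethostname()) on the master and on each slave shows what the local hostname resolves to. If the result is a loopback address, editing /etc/hosts so the hostname maps to the node's private IP, as described at the top of this message, is one possible fix.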