Subject: Re: Review Request: Slave Restart (Part Five): Implemented non-child process monitoring in reaper
From: "Vinod Kone"
To: "Benjamin Hindman", "Ben Mahler"
Cc: "mesos", "Vinod Kone"
Date: Wed, 20 Feb 2013 01:45:27 -0000
Message-ID: <20130220014527.18987.99791@reviews.apache.org>
References: <20130215020227.21521.65655@reviews.apache.org>
In-Reply-To: <20130215020227.21521.65655@reviews.apache.org>
X-ReviewRequest-URL: https://reviews.apache.org/r/8570/


> On Feb. 15, 2013, 2:02 a.m., Ben Mahler wrote:
> > src/slave/reaper.cpp, line 103
> >
> > You're passing -1 to the listeners here! Can you propagate an error instead when the call fails?
> >
> > Right now you log the warning, but then notify anyway. This should either not notify them, or notify them of the error, I lack the context to know which is more appropriate atm.

We definitely should notify them, because the process exited. We send a status of -1 when we can't determine the status. The slave appropriately logs "unknown signal" when it gets this executorTerminated().


> On Feb. 15, 2013, 2:02 a.m., Ben Mahler wrote:
> > third_party/libprocess/include/stout/proc.hpp, line 33
> >
> > What else do you see going in here?
> >
> > I'd imagine linux/proc.hpp moving to stout, at which point do you expect to consolidate the two? If not, then we may want to move this function into os.hpp.

I can imagine things like checking whether a process is orphaned, sending signals (with appropriate error checking), etc. going here. Basically, any helpers that make it easy to deal with processes.

linux/proc.hpp is Linux-specific, so when we move it into stout, I would imagine it living under a linux directory in stout.

I'd rather not put this in os.hpp because 1) it's already bloated and 2) proc is a useful enough abstraction to deserve its own file.


> On Feb. 15, 2013, 2:02 a.m., Ben Mahler wrote:
> > src/tests/process_spawn.cpp, line 28
> >
> > This file is outside our coding conventions, can you add a TODO to clean it up?

Cleaned up.


- Vinod


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/8570/#review16624
-----------------------------------------------------------


On Feb. 14, 2013, 3:30 a.m., Vinod Kone wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/8570/
> -----------------------------------------------------------
>
> (Updated Feb. 14, 2013, 3:30 a.m.)
>
>
> Review request for mesos, Benjamin Hindman and Ben Mahler.
>
>
> Description
> -------
>
> Needed this to properly monitor the exit status of re-connected executors, as they will be parented by INIT.
>
>
> Diffs
> -----
>
>   src/Makefile.am c94736df660a25b58dc47c07d9c56c3c26152a66
>   src/linux/proc.hpp e0825a4a9f9e2763e0c25d7319f220bfe7c7c29c
>   src/linux/proc.cpp 8a0fc48dc9769df35d682ece477246b2df2fc0d4
>   src/sched/sched.cpp f1eeab6f12ee300d77013c6a4ba62ccd7fdb0d1d
>   src/slave/cgroups_isolation_module.cpp 14f549edaf1b37a6bca8f75309864333ae775e7c
>   src/slave/lxc_isolation_module.cpp 30cff2a49339bb07030727d30352536a0a22d58c
>   src/slave/process_based_isolation_module.cpp 12a579cba56cd3dac384bc7919b0d5537b0e429d
>   src/slave/reaper.hpp b9aa62daa42bdaa736ade43884982529ba3d4bb1
>   src/slave/reaper.cpp c0ee4b4c07fd792bcb39455b666808b712eb32c2
>   src/slave/slave.cpp d4721c3eb51db87278d05f6fbe2eadb8a3a9b4dd
>   src/slave/solaris_project_isolation_module.cpp f3b6a68926af34c46873d8de1c9858480f42ef98
>   src/tests/cgroups_tests.cpp b219906374764e91f1a5268469ae92dd0fe08e53
>   src/tests/master_tests.cpp 948ab5dff34eeba1f3ce593a864ddf282c8b19ed
>   src/tests/process_spawn.cpp 04e836f8eed7312dbee27e20399e7d0e59df0bc2
>   src/tests/reaper_tests.cpp PRE-CREATION
>   src/tests/script.cpp ebd2ab52e4de2dac744712b5adb1107a33ed29df
>   src/tests/utils.hpp be457117515ee727af101370b26bf9188afb8f45
>   third_party/libprocess/Makefile.am dad1b65c3fdb7dbdad4e7c3d9c241cc4e89c3325
>   third_party/libprocess/include/stout/proc.hpp PRE-CREATION
>
> Diff: https://reviews.apache.org/r/8570/diff/
>
>
> Testing
> -------
>
> make check
>
>
> Thanks,
>
> Vinod Kone
>
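
A note for readers of the reaper.cpp exchange in this message: the sketch below illustrates why a process that is not a child of the monitor can end up reported with a status of -1. It is not the code under review. The names processExists and waitForExit are hypothetical, and polling with kill(pid, 0) is an assumed approach chosen only for illustration. The underlying fact is that waitpid() returns an exit status only for the caller's own children, so for an executor that has been re-parented to init the exit can be detected but the status cannot be recovered.

```cpp
#include <cassert>
#include <cerrno>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper (not the stout/proc.hpp API): returns true if a
// process with this pid currently exists. Signal 0 sends nothing and only
// performs the existence and permission checks. EPERM means the process
// exists but we are not allowed to signal it.
inline bool processExists(pid_t pid)
{
  assert(pid > 0);
  return ::kill(pid, 0) == 0 || errno == EPERM;
}

// Hypothetical helper: blocks until `pid` is gone and returns the status
// to hand to listeners. For a child this is the raw waitpid() status. For
// any other process waitpid() fails (ECHILD), so we poll until the process
// disappears and return -1, meaning "exited, status unknown".
inline int waitForExit(pid_t pid)
{
  int status = 0;
  while (true) {
    pid_t result = ::waitpid(pid, &status, 0);
    if (result == pid) {
      return status; // Our own child: the real status is available.
    }
    if (result == -1 && errno == EINTR) {
      continue; // Interrupted by a signal: retry.
    }
    break; // Not our child (or another error): fall back to polling.
  }

  while (processExists(pid)) {
    ::usleep(10000); // Assumed polling interval of 10 ms.
  }

  return -1;
}
```

For a direct child, the returned value can be inspected with WIFEXITED and WEXITSTATUS. For any other pid the caller sees -1 once the process is gone. The sketch does not handle pid reuse, and a zombie that its parent never reaps still counts as existing for kill(pid, 0).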