From: Denis Magda
Date: Mon, 12 Mar 2018 17:12:19 -0700
Subject: Re: IEP-14: Ignite failures handling (Discussion)
To: dev@ignite.apache.org

Dmitriy,

An Ignite client node is usually used in embedded mode. By killing the
whole process the node is running in, we're going to kill the entire
application. That doesn't sound like a good plan. That's why my suggestion
is to try to kill just the node rather than the whole process.

As for server nodes, which usually own the whole process, it's totally
fine to kill the process right away.

--
Denis

On Mon, Mar 12, 2018 at 4:12 PM, Dmitriy Setrakyan wrote:

> Denis, what is the difference between killing the process and killing the
> node and the process?
>
> D.
>
> On Mon, Mar 12, 2018 at 12:03 PM, Denis Magda wrote:
>
> > Guys,
> >
> > I would make a decision depending on the type of the problematic node:
> >
> >    - If it's a *server node*, then let's kill the process, simply
> >    because the node usually owns the whole process. I don't see a
> >    practical reason why a user would want to run 2 server nodes in a
> >    single process.
> >    - If it's a *client node*, then the best approach is to kill the
> >    node and not the process.
> >
> > --
> > Denis
> >
> > On Mon, Mar 12, 2018 at 3:04 AM, Dmitry Pavlov wrote:
> >
> > > Hi Andrey, Igniters,
> > >
> > > Thank you for starting this topic, because this is a really important
> > > decision.
> > >
> > > If Ignite is started within an application server alongside other
> > > applications, JVM termination will kill all services started there.
> > >
> > > So I suggest this option is not the default. We can add this option
> > > (action="JVM termination") as pre-configured for ignite.sh/bat, since
> > > we know it is a separate JVM. But I would not vote for this option
> > > being the default in code.
> > >
> > > Sincerely,
> > > Dmitriy Pavlov
> > >
> > > Mon, Mar 12, 2018 at 12:57, Andrey Kuznetsov:
> > >
> > > > To my mind, the default action should be as severe as possible,
> > > > since we deal with critical errors, that is, termination of the
> > > > entire JVM. In the case of some custom setup (e.g. different
> > > > cluster nodes in one JVM), the failure response action should be
> > > > configured explicitly.
> > > >
> > > > 2018-03-12 12:32 GMT+03:00 Andrey Gura:
> > > >
> > > > > Igniters!
> > > > >
> > > > > We are working on the proposal described in IEP-14 Ignite failures
> > > > > handling [1] and it's time to discuss it with the community
> > > > > (although it was necessary to do this before).
> > > > >
> > > > > The most important question: what should be the default behaviour
> > > > > in case of failure? There are 4 actions:
> > > > >
> > > > > 1. Restart JVM process (it's possible only if the process was
> > > > > started from the ignite.(sh|bat) script);
> > > > > 2. Terminate JVM;
> > > > > 3. Stop node (if there is only one node in the process then the
> > > > > process will also be terminated);
> > > > > 4. No operation.
> > > > >
> > > > > I believe that the node should be stopped by default. But there
> > > > > is a chance that the node will not be stopped correctly.
> > > > >
> > > > > Maybe we should terminate the JVM process by default. But it will
> > > > > kill all nodes in the JVM process. That's especially bad behaviour
> > > > > when the nodes belong to different Ignite clusters (a real use
> > > > > case).
> > > > >
> > > > > Maybe we should restart the JVM process by default. This approach
> > > > > has the same problems as the previous one. Additionally, it could
> > > > > lead to continuous restarts and, therefore, continuous exchanges
> > > > > and rebalancing.
> > > > >
> > > > > Difficult choice. Could you please share your thoughts.
> > > > >
> > > > > [1] https://cwiki.apache.org/confluence/display/IGNITE/IEP-14+Ignite+failures+handling
> > > >
> > > > --
> > > > Best regards,
> > > > Andrey Kuznetsov.
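
To make the options in this thread concrete, here is a minimal sketch of
one way the suggestions could be combined: an explicitly configured action
always wins, a client node is only stopped, and a server node terminates
the JVM only when it is known to own the process (for example, when started
from ignite.sh/bat). All names below (FailurePolicySketch, FailureAction,
NodeKind, chooseAction) are hypothetical and are not taken from IEP-14 or
from the Ignite API.

```java
public class FailurePolicySketch {
    /** The four actions listed in the original proposal (names are hypothetical). */
    public enum FailureAction { RESTART_JVM, TERMINATE_JVM, STOP_NODE, NO_OP }

    /** Kind of the node that hit a critical failure (hypothetical type). */
    public enum NodeKind { SERVER, CLIENT }

    /**
     * Picks a failure action.
     *
     * @param kind          kind of the failed node
     * @param ownsProcess   true if the node is known to run in its own JVM
     * @param configured    explicitly configured action, or null if none
     */
    public static FailureAction chooseAction(NodeKind kind,
                                             boolean ownsProcess,
                                             FailureAction configured) {
        // Custom setups (e.g. several nodes in one JVM) configure the action explicitly.
        if (configured != null) {
            return configured;
        }
        // An embedded client node should not take the host application down with it.
        if (kind == NodeKind.CLIENT) {
            return FailureAction.STOP_NODE;
        }
        // A server node that owns the process can terminate the JVM right away;
        // otherwise fall back to stopping only the node.
        return ownsProcess ? FailureAction.TERMINATE_JVM : FailureAction.STOP_NODE;
    }

    public static void main(String[] args) {
        System.out.println(chooseAction(NodeKind.SERVER, true, null));
        System.out.println(chooseAction(NodeKind.SERVER, false, null));
        System.out.println(chooseAction(NodeKind.CLIENT, true, null));
        System.out.println(chooseAction(NodeKind.SERVER, true, FailureAction.NO_OP));
    }
}
```

The sketch deliberately leaves out the restart option as a default, since
the thread notes it can lead to continuous restarts and rebalancing; it
remains available only through explicit configuration.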