From: Anton Vinogradov
Date: Tue, 14 Nov 2017 18:05:49 +0000
Subject: Re: Add
 emergency node closing handler to public Ignite API
To: dev@ignite.apache.org
archived-at: Tue, 14 Nov 2017 18:06:06 -0000

Vova,

We should give the user the ability to be notified when a node decides to
stop itself. Only the user knows how they want to be notified, so we should
provide the ability to register a custom callback (e.g. send an SMS or call
a REST service).

This will cover cases when a node stops gracefully. Please see Semen's
comment at https://issues.apache.org/jira/browse/IGNITE-5811 for details.

P.S. Cases when a node stops without being able to do anything should be
covered by an external watchdog.

Tue, 14 Nov 2017 at 20:08, Vladimir Ozerov wrote:

> Can you explain what kind of logic could be placed there? And why do we
> need another configuration property and/or interface? We already have
> LifecycleBean, where the Ignite instance can be injected, so the user is
> already able to perform anything there.
>
> On Tue, Nov 14, 2017 at 7:46 PM, Anton Vinogradov <avinogradov@gridgain.com>
> wrote:
>
> > Vova,
> >
> > That's not about "kill -9" or OOM, that's about the case when a node
> > detected something and decided to stop itself (e.g. persistence errors,
> > IgniteOutOfMemoryException, ExchangeWorker died).
> > Sure, we can't handle OOM or 100% CPU utilization by GC that way, but we
> > can handle some logical problems.
> >
> > Andrey,
> >
> > I propose to refactor the method to ignite.onClose(SomeClosure).
> > In this case the user will be able to register a callback on all graceful
> > stops and detect the reason.
> >
> > On Tue, Nov 14, 2017 at 7:29 PM, Vladimir Ozerov
> > wrote:
> >
> > > I am not sure this makes sense. First, in the general case we do not
> > > have access to Java. E.g. in case of a very long GC pause all Java
> > > threads are stuck and it is impossible to invoke anything. Second, some
> > > other conditions may be unrecoverable, such as OOME, where there is no
> > > guarantee that any operation will succeed. So this is not a graceful
> > > shutdown. We should kill the node forcefully IMO.
> > >
> > > On Tue, Nov 14, 2017 at 7:23 PM, Andrey Kuznetsov
> > > wrote:
> > >
> > > > Hi Igniters!
> > > >
> > > > When some node detects a critical error, e.g. OOME, deadlock, etc.,
> > > > it should invoke some user-defined callback and then attempt to close
> > > > itself gracefully. In order to make this possible we need to enhance
> > > > the Ignite interface by adding something like
> > > > Ignite.onEmergencyClose(SomeClosure).
> > > >
> > > > First, I'd like to get your feedback on this potential change. Then
> > > > we can refine the SomeClosure structure.
> > > >
> > > > --
> > > > Best regards,
> > > > Andrey Kuznetsov.
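
For readers of the archive, here is a rough sketch of what the callback registration discussed in this thread (ignite.onClose(SomeClosure)) might look like. Everything in it is hypothetical: NodeStopNotifier, StopReason, and notifyStopping are made-up names for illustration and are not part of the Ignite API, and the thread does not settle on any of these shapes. The reason values simply mirror the examples mentioned above (persistence errors, IgniteOutOfMemoryException, ExchangeWorker died). It only models the graceful case; a stuck JVM or a real OOME would still need an external watchdog, as noted in the thread.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;

/** Hypothetical sketch of a stop-callback registry; not an Ignite class. */
final class NodeStopNotifier {
    /** Assumed reasons for a self-initiated stop, taken from the thread's examples. */
    enum StopReason { PERSISTENCE_ERROR, OUT_OF_MEMORY, EXCHANGE_WORKER_DIED }

    /** Thread-safe list so callbacks can be registered while the node is running. */
    private final List<Consumer<StopReason>> callbacks = new CopyOnWriteArrayList<>();

    /** Registers a user callback, standing in for the proposed onClose(SomeClosure). */
    void onClose(Consumer<StopReason> callback) {
        callbacks.add(callback);
    }

    /** Would be called by the node just before it stops itself gracefully. */
    void notifyStopping(StopReason reason) {
        for (Consumer<StopReason> cb : callbacks) {
            try {
                cb.accept(reason);
            } catch (RuntimeException e) {
                // A failing user callback should not block the shutdown itself.
                System.err.println("Stop callback failed: " + e);
            }
        }
    }
}
```

A user would register something like `notifier.onClose(reason -> sendSms("node stopping: " + reason))`, where sendSms is whatever notification channel they prefer. Passing the reason to the callback follows the suggestion that the user should be able to detect why the node stopped.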