Date: Wed, 30 Nov 2016 13:20:35 -0700 (MST)
From: "javastuff.sam@gmail.com"
To: user@ignite.apache.org
Subject: Re: Cluster hung after a node killed
Message-ID: <1480537235215-9309.post@n6.nabble.com>
In-Reply-To: <1480383831367-9247.post@n6.nabble.com>

Hi Val,

Killing the node that had acquired the lock did not release the lock automatically, and it left the whole cluster in a hung state: every operation on every cache, including caches unrelated to the lock, is stuck waiting. The cluster is not able to recover seamlessly. This looks like a bug to me.

I understand that lock timeouts can be error-prone, but if configured correctly, a lock timeout would provide a second path to automatic recovery in failover cases like this one. Is there any way to configure a timeout on a lock?

Explicit locking is one of our use cases, but automatic cluster recovery after any change to the cluster is the most important requirement. If these two cannot work together, it is a showstopper for us.

-Sam

--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Cluster-hung-after-a-node-killed-tp8965p9309.html