From: Michi Mutsuzaki
Reply-To: michi@cs.stanford.edu
To: "dev@zookeeper.apache.org"
Cc: "user@zookeeper.apache.org"
Subject: Re: Thread handling
Date: Tue, 1 Apr 2014 02:21:53 -0700

+1 for shutting down on a critical thread death. Does 'shutdown' mean calling System.exit or throwing some kind of exception? Some applications use ZooKeeper embedded in their JVM, and they might not like ZooKeeper calling System.exit.

--Michi

On Mon, Mar 31, 2014 at 9:03 PM, Rakesh R wrote:
>>>> This is how I handle the critical threads in my client apps that use Zookeeper.
>>>> Keep a reference to the thread and periodically make sure it's still alive and well - respawn it if it is not.
>
> Thanks Greg for the inputs. Please see ZK-1907, I've included an initial proposal patch to kick start the discussions.
> Another approach is simply to shut down if a critical thread dies, so the monitoring tool can easily detect this and take the necessary actions. The proposed patch is based on this approach.
>
> -Rakesh
>
> -----Original Message-----
> From: Asta, Greg [mailto:greg.asta@omnigon.com]
> Sent: 31 March 2014 23:24
> To: user@zookeeper.apache.org; dev@zookeeper.apache.org
> Subject: RE: Thread handling
>
> "If we have a 'DeathWatcher' or some other mechanism in place to monitor all the critical threads, it can take a decision like - bring down the process if required, or shut down the quorumpeer and go for LE again etc.
> Now the monitoring or management tool will know about the situation and can act upon it.
>
> Appreciate any thoughts?"
>
> This is how I handle the critical threads in my client apps that use Zookeeper. Keep a reference to the thread and periodically make sure it's still alive and well - respawn it if it is not.
>
> Thanks,
> Greg
>
>
> -----Original Message-----
> From: Rakesh R [mailto:rakeshr@huawei.com]
> Sent: Thursday, March 27, 2014 10:39 AM
> To: dev@zookeeper.apache.org; user@zookeeper.apache.org
> Subject: Thread handling
>
> Hi All,
>
> The server has many critical threads running and coordinating with each other, like the RequestProcessor chains etc. Going through the threads, most of them have a similar structure:
>
>     public void run() {
>         try {
>             while (running) {
>                 // processing logic
>             }
>         } catch (InterruptedException e) {
>             LOG.error("Unexpected interruption", e);
>         } catch (RequestProcessorException e) {
>             LOG.error("Unexpected exception", e);
>         } catch (Exception e) {
>             LOG.error("Unexpected exception", e);
>         }
>         LOG.info("...exited loop!");
>     }
>
> I feel we could improve the threads in our system. From the design, I can see there is a chance of a thread exiting silently on any exception (abnormal or functional). If this happens in production, the server would hang forever and would not be able to perform its role.
>
> If we have a 'DeathWatcher' or some other mechanism in place to monitor all the critical threads, it can take a decision like - bring down the process if required, or shut down the quorumpeer and go for LE again etc.
> Now the monitoring or management tool will know about the situation and can act upon it.
>
> Appreciate any thoughts?
>
> Thanks in advance,
> Rakesh R
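
One way to reconcile the two points raised in this thread (shut down when a critical thread dies, but do not force System.exit on applications that embed ZooKeeper) might look like the sketch below. This is only an illustration and not the ZK-1907 patch: `CriticalThread`, `DeathListener`, `doWork` and `runDemo` are hypothetical names that do not exist in ZooKeeper. The idea is that the thread reports its own death to a callback, and whoever owns the server decides what "shutdown" means.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

final class CriticalThreadSketch {

    /** Hypothetical callback: the owner of the server decides what "shutdown" means. */
    interface DeathListener {
        void onCriticalThreadDeath(String threadName, Throwable cause);
    }

    /** Hypothetical base class for critical threads; not an existing ZooKeeper class. */
    abstract static class CriticalThread extends Thread {
        private final DeathListener listener;

        CriticalThread(String name, DeathListener listener) {
            super(name);
            this.listener = listener;
        }

        /** The processing loop; expected to return normally only on a requested stop. */
        protected abstract void doWork() throws Exception;

        @Override
        public final void run() {
            try {
                doWork();
            } catch (Throwable t) {
                // Report instead of only logging, so the thread cannot die silently.
                listener.onCriticalThreadDeath(getName(), t);
            }
        }
    }

    /** Starts one failing thread and returns the thread name reported to the listener. */
    static String runDemo() {
        AtomicReference<String> reported = new AtomicReference<>();
        CountDownLatch done = new CountDownLatch(1);

        // An embedding application could stop the server here; a standalone
        // launcher could choose to call System.exit instead.
        DeathListener listener = (name, cause) -> {
            reported.set(name);
            done.countDown();
        };

        CriticalThread worker = new CriticalThread("critical-worker", listener) {
            @Override
            protected void doWork() {
                throw new IllegalStateException("simulated failure in processing logic");
            }
        };
        worker.start();

        try {
            if (!done.await(5, TimeUnit.SECONDS)) {
                throw new IllegalStateException("listener was not called");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return reported.get();
    }

    public static void main(String[] args) {
        System.out.println("died: " + runDemo());
    }
}
```

With this shape the System.exit question becomes a policy choice in the listener rather than something hard-coded in each thread's run() method.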