From: "Sergey Edunov (JIRA)"
To: giraph-dev@incubator.apache.org
Reply-To: dev@giraph.apache.org
Date: Wed, 28 May 2014 20:07:01 +0000 (UTC)
Subject: [jira] [Commented] (GIRAPH-903) Detect crashes of Netty threads


    [ https://issues.apache.org/jira/browse/GIRAPH-903?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14011514#comment-14011514 ]

Sergey Edunov commented on GIRAPH-903:
--------------------------------------

CR: https://reviews.apache.org/r/21987/

> Detect crashes of Netty threads
> -------------------------------
>
>                 Key: GIRAPH-903
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-903
>             Project: Giraph
>          Issue Type: Bug
>            Reporter: Sergey Edunov
>            Priority: Minor
>         Attachments: GIRAPH-903.patch
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> When one of the request processing threads fails, the worker gets stuck, but the job doesn't fail and has to be killed manually. We should detect Netty thread crashes and fail the job automatically.
> You can easily reproduce this by introducing a bug into message deserialization, for example.
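
For illustration only, here is a minimal Java sketch of one general way to notice that a worker thread has died, using the standard Thread.UncaughtExceptionHandler. It is not the attached GIRAPH-903 patch: the class CrashDetectingThreadFactory and the helper runAndCapture are hypothetical names, and the sketch assumes the failure surfaces as an uncaught exception on the thread, which may not hold for errors that Netty routes through its own pipeline.

```java
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicReference;

/**
 * Hypothetical sketch, not a Giraph or Netty class: a ThreadFactory that
 * attaches an uncaught-exception handler to every thread it creates, so a
 * crash is reported instead of silently killing the thread.
 */
class CrashDetectingThreadFactory implements ThreadFactory {
    private final Thread.UncaughtExceptionHandler handler;
    private final AtomicInteger counter = new AtomicInteger();

    CrashDetectingThreadFactory(Thread.UncaughtExceptionHandler handler) {
        this.handler = handler;
    }

    @Override
    public Thread newThread(Runnable task) {
        Thread thread = new Thread(task, "request-worker-" + counter.getAndIncrement());
        // Called on the dying thread if the task throws and nothing catches it.
        thread.setUncaughtExceptionHandler(handler);
        return thread;
    }

    /**
     * Runs the task on a factory-created thread, waits for it to finish,
     * and returns the exception that killed it, or null if it ended normally.
     */
    static Throwable runAndCapture(Runnable task) {
        AtomicReference<Throwable> failure = new AtomicReference<>();
        CrashDetectingThreadFactory factory =
            new CrashDetectingThreadFactory((thread, error) -> failure.set(error));
        Thread worker = factory.newThread(task);
        worker.start();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return failure.get();
    }

    public static void main(String[] args) {
        // Simulate the kind of bug described in the issue.
        Throwable failure = runAndCapture(() -> {
            throw new IllegalStateException("simulated deserialization bug");
        });
        if (failure != null) {
            // A real job would be failed here rather than left hanging.
            System.err.println("Worker thread crashed: " + failure);
        }
    }
}
```

In a real job the handler would presumably trigger job failure rather than just record the exception; how that is wired up in Giraph is left to the patch under review.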