From: "Raghu Angadi (JIRA)"
To: core-dev@hadoop.apache.org
Reply-To: core-dev@hadoop.apache.org
Date: Tue, 2 Dec 2008 11:22:44 -0800 (PST)
Subject: [jira] Commented: (HADOOP-4679) Datanode prints tons of log messages: Waiting for threadgroup to exit, active theads is XX
Message-ID: <1629619245.1228245764470.JavaMail.jira@brutus>
In-Reply-To: <574912164.1227036464317.JavaMail.jira@brutus>
Mailing-List: contact core-dev-help@hadoop.apache.org; run by ezmlm

    [ https://issues.apache.org/jira/browse/HADOOP-4679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12652467#action_12652467 ]

Raghu Angadi commented on
HADOOP-4679:
--------------------------------------

# writeToBlock() creates files in two places. The patch catches only one of them.
# There is an inherent requirement that shutdown() be called only from the offerService thread. It would be better if the Javadoc for shutdown() said this explicitly. Otherwise, this deadlock, and the logging in a tight infinite loop, could occur again with future changes.

> Datanode prints tons of log messages: Waiting for threadgroup to exit, active theads is XX
> ------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-4679
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4679
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>            Reporter: Hairong Kuang
>            Assignee: Hairong Kuang
>         Attachments: diskError.patch, diskError1.patch, diskError2.patch
>
>
> When a data receiver thread sees a disk error, it immediately calls shutdown to shut down the DataNode. But the shutdown method does not return until all data receiver threads have exited, which will never happen because the calling thread is itself one of them. The DataNode therefore gets into a dead/live lock state, emitting tons of log messages: Waiting for threadgroup to exit, active threads is XX.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
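
To illustrate the requirement on shutdown() discussed above, here is a minimal sketch. It is not the DataNode code: the class ShutdownSketch, the receivers thread group, the calledFromReceiver() guard and the maxPolls bound are all hypothetical names made up for this example. It shows why a shutdown that waits for a thread group to drain cannot safely be called from a thread in that group, and one way a guard might refuse such a call instead of spinning.

```java
import java.util.concurrent.atomic.AtomicBoolean;

class ShutdownSketch {
    // Hypothetical stand-in for the group that holds data receiver threads.
    private final ThreadGroup receivers = new ThreadGroup("receivers");

    /** Returns true if the calling thread belongs to the receiver group. */
    boolean calledFromReceiver() {
        return receivers.parentOf(Thread.currentThread().getThreadGroup());
    }

    /**
     * Interrupts the receivers and waits for them to exit.
     * Must not be called from a receiver thread: the caller would be
     * waiting for itself, so the active count could never reach zero.
     * Returns false if the call is refused or the bounded wait runs out.
     */
    boolean shutdown(int maxPolls) throws InterruptedException {
        if (calledFromReceiver()) {
            return false; // refuse rather than loop and log forever
        }
        receivers.interrupt();
        for (int i = 0; i < maxPolls; i++) {
            if (receivers.activeCount() == 0) {
                return true;
            }
            Thread.sleep(50);
        }
        return false;
    }

    /** One receiver tries to shut down from inside; then an outside thread does. */
    static boolean[] demo() {
        ShutdownSketch node = new ShutdownSketch();
        AtomicBoolean inside = new AtomicBoolean(true);
        Thread receiver = new Thread(node.receivers, () -> {
            try {
                inside.set(node.shutdown(5)); // simulated disk-error path
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        try {
            receiver.start();
            receiver.join();
            return new boolean[] { inside.get(), node.shutdown(20) };
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        boolean[] r = demo();
        System.out.println("inside=" + r[0] + " outside=" + r[1]);
    }
}
```

A guard like this only makes the constraint visible at run time. Documenting it in the Javadoc, as suggested in the comment, or handing the shutdown request to a thread outside the group would be alternative designs.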