From: "Amit Nithian (JIRA)"
To: hdfs-issues@hadoop.apache.org
Date: Mon, 20 Dec 2010 16:49:02 -0500 (EST)
Subject: [jira] Updated: (HDFS-1542) Deadlock in Configuration.writeXml when serialized form is larger than one DFS block
Message-ID: <19760173.223841292881742119.JavaMail.jira@thor>
In-Reply-To: <17263032.161191292522345510.JavaMail.jira@thor>

     [ https://issues.apache.org/jira/browse/HDFS-1542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amit Nithian updated HDFS-1542:
-------------------------------

    Attachment: deadlock.txt

Attached is an example thread dump of the JobTracker, taken while submitting a job that causes the tracker to deadlock. It is consistent with the one posted on the mailing list that kicked off this issue. I tried increasing my block size to 128M, to no avail.

BTW, how can I use your test program to reproduce the problem on my cluster? I ran it successfully locally, but I am not sure how to run it in distributed mode to test.

> Deadlock in Configuration.writeXml when serialized form is larger than one DFS block
> ------------------------------------------------------------------------------------
>
>                 Key: HDFS-1542
>                 URL: https://issues.apache.org/jira/browse/HDFS-1542
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: hdfs client
>    Affects Versions: 0.20.2, 0.22.0, 0.23.0
>            Reporter: Todd Lipcon
>            Priority: Critical
>         Attachments: deadlock.txt, Test.java
>
>
> Configuration.writeXml holds a lock on itself and then writes the XML to an output stream, during which DFSOutputStream will try to get a lock on ackQueue/dataQueue. Meanwhile, the DataStreamer thread will call functions like conf.getInt() and deadlock against the other thread, since it could be the same conf object.
> This causes a deterministic deadlock whenever the serialized form is larger than the block size.
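
The lock-order inversion described in the issue can be sketched in isolation. This is a minimal illustration, not Hadoop code. `conf` and `dataQueue` are plain objects standing in for the Configuration instance and the DFSOutputStream queue. The class and method names are hypothetical. It also assumes that the DataStreamer holds the queue lock while it reads from the configuration, which the description implies but does not state outright. The two threads take the same pair of monitors in opposite order, and the JVM's own deadlock detector is polled to confirm that they are stuck.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;

public class LockInversionDemo {

    // Count down, then wait until the other thread also holds its first lock.
    private static void arriveAndWait(CountDownLatch latch) {
        latch.countDown();
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Returns true if the JVM reports two threads deadlocked on monitors.
    static boolean reproducesDeadlock() {
        final Object conf = new Object();      // stand-in for the Configuration monitor
        final Object dataQueue = new Object(); // stand-in for the stream's queue lock
        final CountDownLatch bothHoldFirstLock = new CountDownLatch(2);

        // Plays the role of writeXml: holds conf, then needs the queue lock.
        Thread writer = new Thread(() -> {
            synchronized (conf) {
                arriveAndWait(bothHoldFirstLock);
                synchronized (dataQueue) { /* would enqueue a packet here */ }
            }
        }, "writer");

        // Plays the role of the streamer: holds the queue lock, then reads conf.
        Thread streamer = new Thread(() -> {
            synchronized (dataQueue) {
                arriveAndWait(bothHoldFirstLock);
                synchronized (conf) { /* would call conf.getInt() here */ }
            }
        }, "streamer");

        // Daemon threads, so the stuck pair does not keep the JVM alive.
        writer.setDaemon(true);
        streamer.setDaemon(true);
        writer.start();
        streamer.start();

        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        for (int i = 0; i < 100; i++) {
            long[] ids = mx.findMonitorDeadlockedThreads();
            if (ids != null && ids.length >= 2) {
                return true;
            }
            try {
                Thread.sleep(50);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println("deadlock detected: " + reproducesDeadlock());
    }
}
```

The latch only makes the interleaving certain for the demo. In the reported bug, the equivalent condition is that the write is large enough for the writer to need the queue lock while the streamer is already running.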