From: Owen O'Malley
To: core-dev@hadoop.apache.org
Subject: Re: Removing JMM issues
Date: Sun, 16 Nov 2008 15:34:30 -0800
In-Reply-To: <1466c1d60811161309w196d7f88of2c8c9cdfb98d6c4@mail.gmail.com>

On Nov 16, 2008, at 1:09 PM, Peter Veentjer wrote:

> I have had a quick look at the source code of Hadoop and it appears
> there are some issues with the JMM. In some places it is done
> correctly, in some places partially, and in some places it is incorrect.

That is believable. Clearly some of the problems have been fixed, but
Hadoop is moving fast and new code is being added every week. There
have been bug reports on concurrency stuff that turned out to be false
positives. *smile* It is amazing how much testing code gets when you
run it on 20,000 nodes 24x7, and even rare cases have been hit in
production.

> There also are some design issues with concurrency as well and I think
> the Hadoop project could benefit from an overall solution instead of just
> putting out small fires.

Yeah, we've talked about it for a long time. (See HADOOP-869.) The
particularly problematic parts of the call graph in the JobTracker are
where the lower levels call back into the higher layers. We've had to
be careful to preserve lock order in those cases to avoid deadlock.

> So who are the guys to get in touch with?

This list is exactly the right guys.

> Together with the Hadoop developers I want to further improve the
> quality of this very interesting project.

Jump right in. *Smile*

-- Owen