Message-ID: <498A2582.605@yahoo-inc.com>
Date: Wed, 04 Feb 2009 15:32:18 -0800
From: Raghu Angadi
To: core-user@hadoop.apache.org
Subject: Re: problem with completion notification from block movement
References:
 <1233334799.5164.80.camel@awol.kleinpaste.org>
 <314098690902011758u2ac19fbev8fd851757eb6bcff@mail.gmail.com>
 <1233599004.16154.130.camel@awol.kleinpaste.org>
In-Reply-To: <1233599004.16154.130.camel@awol.kleinpaste.org>

Karl Kleinpaste wrote:
> On Sun, 2009-02-01 at 17:58 -0800, jason hadoop wrote:
>> The Datanode's use multiple threads with locking and one of the
>> assumptions is that the block report (1ce per hour by default) takes
>> little time. The datanode will pause while the block report is running
>> and if it happens to take a while weird things start to happen.
>
> Thank you for responding, this is very informative for us.
>
> Having looked through the source code with a co-worker regarding
> periodic scan and then checking the logs once again, we find that we are
> finding reports of this sort:
>
> BlockReport of 1158499 blocks got processed in 308860 msecs
> BlockReport of 1159840 blocks got processed in 237925 msecs
> BlockReport of 1161274 blocks got processed in 177853 msecs
> BlockReport of 1162408 blocks got processed in 285094 msecs
> BlockReport of 1164194 blocks got processed in 184478 msecs
> BlockReport of 1165673 blocks got processed in 226401 msecs
>
> The 3rd of these exactly straddles the particular example timeline I
> discussed in my original email about this question. I suspect I'll find
> more of the same as I look through other related errors.

You could ask for a "complete fix" in
https://issues.apache.org/jira/browse/HADOOP-4584 . I don't think the
current patch there fixes your problem.

Raghu.

> --karl
>
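
For anyone reading this in the archive: a minimal sketch of how the BlockReport lines quoted above could be turned into per-report durations, so they can be compared against the timeline of a failing block move. It assumes the log lines have exactly the format shown in Karl's message. The names parse_block_reports and summarize are hypothetical helpers for illustration, not part of Hadoop.

```python
import re

# Assumed format: taken from the log lines quoted in this thread.
PATTERN = re.compile(r"BlockReport of (\d+) blocks got processed in (\d+) msecs")

# The six lines quoted in the message above.
SAMPLE = [
    "BlockReport of 1158499 blocks got processed in 308860 msecs",
    "BlockReport of 1159840 blocks got processed in 237925 msecs",
    "BlockReport of 1161274 blocks got processed in 177853 msecs",
    "BlockReport of 1162408 blocks got processed in 285094 msecs",
    "BlockReport of 1164194 blocks got processed in 184478 msecs",
    "BlockReport of 1165673 blocks got processed in 226401 msecs",
]


def parse_block_reports(lines):
    """Return a (blocks, msecs) tuple for every line matching PATTERN."""
    reports = []
    for line in lines:
        match = PATTERN.search(line)
        if match:
            reports.append((int(match.group(1)), int(match.group(2))))
    return reports


def summarize(reports):
    """Return (blocks, seconds, blocks_per_second) for each report."""
    summary = []
    for blocks, msecs in reports:
        seconds = msecs / 1000.0
        # Guard against a zero duration to avoid dividing by zero.
        rate = blocks / seconds if seconds > 0 else float("inf")
        summary.append((blocks, seconds, rate))
    return summary


if __name__ == "__main__":
    for blocks, seconds, rate in summarize(parse_block_reports(SAMPLE)):
        print(f"{blocks} blocks: {seconds:.1f} s ({rate:.0f} blocks/s)")
```

On the six quoted lines, the reports take between about 178 and 309 seconds each, that is, roughly three to five minutes per report, against the once-per-hour default interval mentioned earlier in the thread.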