From: stack
To: hdfs-dev@hadoop.apache.org
Date: Sun, 20 Dec 2009 16:49:01 -0800
Subject: Re: [VOTE CANCELLED] Commit hdfs-630 to 0.21?
Message-ID: <7c962aed0912201649l40d1e1arb712f99860dba94@mail.gmail.com>
In-Reply-To: <7c962aed0912142156q5253bfcfj42897bcb875e9c15@mail.gmail.com>

Nicholas reviewed the hdfs-630 patch and made some suggestions for
improvements. Cosmin, the patch writer, obliged. After chatting with
Nicholas and Cosmin, I will reverse the hdfs-630 patch that is in TRUNK and,
if the new patch passes hudson, will apply it instead. I will then put up a
new vote to have the improved patch applied to 0.21.

Thanks to all who voted.
St.Ack

On Mon, Dec 14, 2009 at 9:56 PM, stack wrote:

> I'd like to propose a vote on having hdfs-630 committed to 0.21 (it's
> already been committed to TRUNK).
>
> hdfs-630 has the dfsclient pass the namenode the names of datanodes
> it has determined dead because it got a failed connection when it tried to
> contact them, etc. This is useful in the interval between a datanode dying
> and the namenode timing out its lease. Without this fix, the namenode can
> often give out the dead datanode as a host for a block. If the cluster is
> small, fewer than 5 or 6 nodes, then it's very likely the namenode will give
> out the dead datanode as a block host.
>
> Small clusters are common in hbase, especially when folks are starting out
> or evaluating hbase. They'll start with three or four nodes carrying both
> datanodes and hbase regionservers. They'll experiment with killing one of
> the slaves (datanode and regionserver) and watch what happens. What follows
> is a struggling dfsclient trying to create replicas where one of the
> datanodes passed to it by the namenode is dead. The DFSClient will fail and
> then go back to the namenode again, etc. (See
> https://issues.apache.org/jira/browse/HBASE-1876 for a more detailed
> blow-by-blow.) HBase operation will be held up during this time and
> eventually a regionserver will shut itself down to protect itself against
> data loss if it can't successfully write to HDFS.
>
> Thanks all,
> St.Ack