Date: Tue, 23 Jun 2009 20:18:39 -0500
Message-ID: <4af5cd780906231818s3675a317r64187ad67cc33b32@mail.gmail.com>
Subject: Does balancer ensure a file's replication is satisfied?
From: Stuart White
To: core-user@hadoop.apache.org

In my Hadoop cluster, I've had several drives fail lately (and they've been replaced). Each time a new empty drive is placed in the cluster, I run the balancer.

I understand that the balancer will redistribute the load of file blocks across the nodes. My question is: will the balancer also look at the desired replication of a file, and if the actual replication of a file is less than the desired replication (because the file had blocks stored on the lost drive), will the balancer re-replicate those lost blocks?

If not, is there another tool that will ensure the desired replication factor of files is satisfied?

If this functionality doesn't exist, I'm concerned that I'm slowly, silently losing my files as I replace drives, and I may not even realize it.

Thoughts?
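
One way to see whether replicas are actually going missing after a drive swap is to look at the summary printed by `hadoop fsck /`, which reports under-replicated blocks. The sketch below is only illustrative: the sample text is made up, the field labels ("Under-replicated blocks", "Missing replicas") are assumed to match what your Hadoop version prints, and the function names are hypothetical helpers, not part of Hadoop.

```python
import re
import subprocess

# Made-up fsck summary used for illustration only; the real layout and
# field labels may differ between Hadoop versions (assumption).
SAMPLE_FSCK_SUMMARY = """\
Status: HEALTHY
 Total blocks (validated):      1200 (avg. block size 60000000 B)
 Under-replicated blocks:       37 (3.08 %)
 Default replication factor:    3
 Missing replicas:              41 (1.14 %)
"""


def parse_fsck_summary(text):
    """Extract the replication counters from fsck summary text.

    Returns a dict of counts; a value is None when the field is absent.
    """
    fields = {
        "under_replicated_blocks": r"Under-replicated blocks:\s*(\d+)",
        "missing_replicas": r"Missing replicas:\s*(\d+)",
    }
    result = {}
    for name, pattern in fields.items():
        match = re.search(pattern, text)
        result[name] = int(match.group(1)) if match else None
    return result


def needs_attention(summary):
    """True if any parsed counter is present and non-zero."""
    return any(value for value in summary.values())


def run_fsck(path="/"):
    """Run `hadoop fsck <path>` and return its stdout.

    Assumes the `hadoop` launcher is on PATH; not called automatically.
    """
    completed = subprocess.run(
        ["hadoop", "fsck", path], capture_output=True, text=True
    )
    return completed.stdout
```

On a real cluster, something like `needs_attention(parse_fsck_summary(run_fsck("/")))` could be checked before and after running the balancer to see whether the under-replicated count changes over time.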