Date: Mon, 10 Dec 2007 11:58:49 -0800 (PST)
From: C G
Subject: HDFS tool and replication questions...
To: hadoop-user@lucene.apache.org

Hi All:

Is there a tool available that will provide information about how a file is replicated within HDFS? I'm looking for something that will "prove" that a file is replicated across multiple nodes, and let me see how many nodes participated, etc. This is a point of interest technically, but more importantly a point of due diligence around data security and integrity accountability.

Also, are there any metrics or best practices around what the replication factor should be based on the number of nodes in the grid? Does HDFS attempt to involve all nodes in the grid in replication? In other words, if I have 100 nodes in my grid and a replication factor of 6, will all 100 nodes wind up storing data for a given file, assuming the file is large enough?

Thanks,
C G