hadoop-common-user mailing list archives

From C G <parallel...@yahoo.com>
Subject HDFS tool and replication questions...
Date Mon, 10 Dec 2007 19:58:49 GMT
Hi All:
  Is there a tool available that will provide information about how a file is replicated within
HDFS?  I'm looking for something that will "prove" that a file is replicated across multiple
nodes, and let me see how many nodes participated, etc.  This is a point of interest technically,
but more importantly a point of due diligence around data security and integrity accountability.

  Also, are there any metrics or best practices around what the replication factor should
be based on the number of nodes in the grid?  Does HDFS attempt to involve all nodes in the
grid in replication?  In other words, if I have 100 nodes in my grid, and a replication factor
of 6, will all 100 nodes wind up storing data for a given file, assuming the file is large enough?
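
  To make that last question concrete, here is a rough Python sketch of the arithmetic only, not of HDFS itself. It assumes a file is split into fixed-size blocks and that each block gets one copy per unit of replication factor, each on a different node. The 64 MB block size and both function names are assumptions made up for illustration. The result is only an upper bound on how many nodes could hold part of a file; how HDFS actually picks nodes is exactly what I am asking about.

```python
MB = 1024 * 1024


def max_nodes_touched(file_bytes, block_bytes, replication, cluster_nodes):
    """Upper bound on distinct nodes that could store some part of the file.

    Hypothetical helper: assumes fixed-size blocks and `replication`
    copies of every block, each copy on a different node.
    """
    # Integer ceiling division: number of blocks the file occupies.
    blocks = -(-file_bytes // block_bytes)
    replicas = blocks * replication
    # A file cannot touch more nodes than exist or than it has replicas.
    return min(cluster_nodes, replicas)


def min_blocks_to_cover(cluster_nodes, replication):
    """Fewest blocks needed before every node could hold one replica."""
    return -(-cluster_nodes // replication)


if __name__ == "__main__":
    block = 64 * MB  # assumed block size, not taken from any real config
    print(min_blocks_to_cover(100, 6))
    print(max_nodes_touched(1 * MB, block, 6, 100))
    print(max_nodes_touched(2048 * MB, block, 6, 100))
```

  Under these assumptions, a replication factor of 6 on 100 nodes needs at least 17 blocks before every node could hold a replica, so a file smaller than one block could touch at most 6 nodes.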
  C G
