hadoop-common-user mailing list archives

From Stu Hood <stuh...@webmail.us>
Subject RE: Replication problem of HDFS
Date Thu, 06 Sep 2007 04:45:28 GMT
ChaoChun,

Since you set 'replication = 1' for the file, only 1 copy of each of the file's blocks will be
stored in Hadoop. If you want all 5 machines to have a copy of each block, then you would
set 'replication = 5' for the file.

The default for replication is 3.
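
To make the arithmetic concrete, here is a minimal sketch of how many block replicas the cluster stores for the 2GB data set at different replication settings. The 64 MB block size is an assumption (it is not stated in this thread), and `block_replicas` is a hypothetical helper, not part of Hadoop.

```python
def block_replicas(file_bytes, block_bytes, replication):
    """Return (number of blocks, total block replicas stored cluster-wide)."""
    # Ceiling division: a final partial block still counts as one block.
    blocks = -(-file_bytes // block_bytes)
    return blocks, blocks * replication

GB = 1024 ** 3
MB = 1024 ** 2

# Assumed values: a 2 GB file and a 64 MB block size.
for replication in (1, 3, 5):
    blocks, replicas = block_replicas(2 * GB, 64 * MB, replication)
    print(f"replication={replication}: {blocks} blocks, {replicas} replicas stored")
```

Under these assumptions, 'replication = 1' stores each block exactly once, so nothing requires the blocks to appear on more than one machine.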

Thanks,
Stu



-----Original Message-----
From: ChaoChun Liang 
Sent: Wednesday, September 5, 2007 9:26pm
To: hadoop-user@lucene.apache.org
Subject: RE: Replication problem of HDFS


Yes, you are right. In my environment the namenode and a datanode run on the
same machine, and I upload data into HDFS from that machine. I expected HDFS
to distribute the blocks across all the other datanodes (according to the
HDFS reference), but it does not.

>> In this case, the only replica of the file will reside on the Datanode that
>> is local to the client.
So, does this conflict with the HDFS reference? (A file in HDFS will be split
into one or more blocks, and these blocks are stored in a set of Datanodes.)

What kind of upload would store the data/files across all the datanodes (not
just a single one)?
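
The behavior discussed in this thread can be illustrated with a small simulation. This is only a simplified sketch, not Hadoop's actual placement code: it assumes the first replica goes to the datanode local to the client when there is one, and that all other replicas go to randomly chosen datanodes (the real policy also considers racks). The function `place_blocks` and the node names are hypothetical.

```python
import random

def place_blocks(num_blocks, datanodes, replication, client_host=None, seed=0):
    """Count how many block replicas each datanode receives (simplified model)."""
    rng = random.Random(seed)
    placement = {node: 0 for node in datanodes}
    for _ in range(num_blocks):
        targets = []
        # Assumption: a client running on a datanode writes the first replica locally.
        if client_host in placement:
            targets.append(client_host)
        # Remaining replicas go to distinct, randomly chosen other datanodes.
        others = [node for node in datanodes if node not in targets]
        targets += rng.sample(others, replication - len(targets))
        for node in targets:
            placement[node] += 1
    return placement

nodes = ["node1", "node2", "node3", "node4", "node5"]

# Client on node1 with replication=1: all 32 blocks land on node1.
print(place_blocks(32, nodes, 1, client_host="node1"))

# Client outside the cluster with replication=1: blocks spread over the datanodes.
print(place_blocks(32, nodes, 1))

# Client on node1 with replication=5: every datanode holds every block.
print(place_blocks(32, nodes, 5, client_host="node1"))
```

In this model, the two ways to get data onto more than one datanode are to upload from a machine that is not itself a datanode, or to raise the replication factor.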

ChaoChun



Dhruba Borthakur wrote:
> 
> Hi ChaoChun,
> 
> I do not fully understand your problem. I am guessing that you are running
> a Datanode on the same machine as the Namenode. I am also guessing that you
> are using the Namenode machine as a client to upload a file into HDFS. In
> this case, the only replica of the file will reside on the Datanode that
> is local to the client.
> 
> Thanks,
> dhruba
> 
> -----Original Message-----
> From: ChaoChun Liang [mailto:ccliangnn@gmail.com] 
> Sent: Wednesday, September 05, 2007 1:58 AM
> To: hadoop-user@lucene.apache.org
> Subject: Replication problem of HDFS
> 
> 
> According to the HDFS reference
> (http://lucene.apache.org/hadoop/hdfs_design.html),
> a file in HDFS will be split into one or more blocks, and these blocks
> are stored in a set of Datanodes.
> 
> I put a 2GB data set (with replication=1) onto a 5-node cluster, but found
> that only the namenode's block count increased; the other nodes kept the
> same value. It means all blocks were copied to the namenode and none to the
> other datanodes. Is this correct?
> 
> ChaoChun
> -- 
> View this message in context:
> http://www.nabble.com/Replication-problem-of-HDFS-tf4382878.html#a12494269
> Sent from the Hadoop Users mailing list archive at Nabble.com.
> 
> 
> 
> 

-- 
View this message in context: http://www.nabble.com/Replication-problem-of-HDFS-tf4382878.html#a12514090
Sent from the Hadoop Users mailing list archive at Nabble.com.

