hadoop-common-user mailing list archives

From Stu Hood <stuh...@webmail.us>
Subject RE: Replication problem of HDFS
Date Fri, 07 Sep 2007 23:03:22 GMT


> So, the upload process (from the local file system to HDFS) will store
> all blocks (split from the dataset, say M blocks) on a single node
> (depending on which client you upload from), not on all datanodes.

It will store the blocks on 'replication' datanodes. If replication == 2,
then it will make sure that 2 copies of each of the M blocks exist on
datanodes.

> And "replication" means replicating to N clients (if replication = N),
> with each client owning a complete copy of all M blocks.

No, it means replicating the file to N datanodes. The client is only used
to transfer files to/from Hadoop: it doesn't do any long-term storage.
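To make the distinction concrete, here is a toy sketch (in Python, and emphatically not HDFS's real block placement policy, which considers racks, load, and locality): a file of M blocks with a given replication factor ends up with `replication` copies of each block spread across datanodes, rather than whole copies of the file on N clients.

```python
# Toy model: each of M blocks gets `replication` copies on distinct
# datanodes, assigned round-robin. This is an illustration only; the
# actual HDFS placement policy is more sophisticated.
import itertools

def place_blocks(num_blocks, replication, datanodes):
    """Assign each block to `replication` distinct datanodes, round-robin."""
    if replication > len(datanodes):
        raise ValueError("replication cannot exceed the number of datanodes")
    placement = {}
    ring = itertools.cycle(range(len(datanodes)))
    for block in range(num_blocks):
        start = next(ring)
        chosen = [(start + i) % len(datanodes) for i in range(replication)]
        placement[block] = [datanodes[i] for i in chosen]
    return placement

# A file of M = 4 blocks with replication = 2 on 5 datanodes:
layout = place_blocks(4, 2, ["dn1", "dn2", "dn3", "dn4", "dn5"])
for block, nodes in layout.items():
    print(block, nodes)
```

Note that with replication = 2, every block exists on exactly 2 datanodes, and no single node needs to hold a complete copy of the file.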

Thanks,
Stu


-----Original Message-----
From: ChaoChun Liang 
Sent: Thursday, September 6, 2007 10:23pm
To: hadoop-user@lucene.apache.org
Subject: RE: Replication problem of HDFS


So, the upload process (from the local file system to HDFS) will store all
blocks (split from the dataset, say M blocks) on a single node (depending
on which client you upload from), not on all datanodes.
And "replication" means replicating to N clients (if replication = N),
with each client owning a complete copy of all M blocks. If I am wrong,
please correct me. Thanks.

ChaoChun


Stu Hood-2 wrote:
> 
> ChaoChun,
> 
> Since you set the 'replication = 1' for the file, only 1 copy of the
> file's blocks will be stored in Hadoop. If you want all 5 machines to have
> copies of each block, then you would set 'replication = 5' for the file.
> 
> The default for replication is 3.
> 
> Thanks,
> Stu
> 
> 
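[As an aside, the per-file replication factor discussed above can be set from the command line. This is a sketch assuming a `hadoop` CLI on the PATH and a hypothetical file `/user/demo/data.txt`; it requires a running cluster, so treat it as illustrative:]

```shell
# Upload a file with replication = 5 instead of the default 3
# (dfs.replication is applied per file, at write time).
hadoop fs -D dfs.replication=5 -put data.txt /user/demo/data.txt

# Change the replication factor of an existing file to 2;
# -w waits until the change has propagated to the datanodes.
hadoop fs -setrep -w 2 /user/demo/data.txt

# Verify: the replication factor appears in the ls listing.
hadoop fs -ls /user/demo/data.txt
```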

-- 
View this message in context: http://www.nabble.com/Replication-problem-of-HDFS-tf4382878.html#a12534839
Sent from the Hadoop Users mailing list archive at Nabble.com.

