hadoop-hdfs-user mailing list archives

From Harsh J <ha...@cloudera.com>
Subject Re: Replication
Date Wed, 31 Oct 2012 05:20:32 GMT
Hi,

Yes. If you are purely a regular client (not running on a DN box) writing
to HDFS, the NameNode selects the target DNs at random, subject to the
cross-rack write policy if rack awareness applies to your environment.
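
To make the behaviour discussed in this thread concrete, here is a minimal Python simulation of the placement decision. It is only a sketch and not HDFS code: the function name `choose_targets`, the `topology` dict, and the host names are all hypothetical. The rule for the first replica comes from this thread; the rules for the second and third replicas (a different rack, then the same rack as the second) are my assumption about the default rack-aware policy and may differ between Hadoop versions.

```python
import random


def choose_targets(writer, topology, replication=3, rng=random):
    """Pick replica targets. `topology` maps datanode name -> rack name.

    Simplified illustration only; not the real HDFS placement code.
    """
    targets = []

    def pick(candidates):
        # Random choice among candidates that do not already hold a replica.
        free = [n for n in candidates if n not in targets]
        return rng.choice(free) if free else None

    nodes = sorted(topology)

    # Replica 1: the writer itself if it is a datanode, otherwise a random node.
    first = writer if writer in topology else pick(nodes)
    if first is None:
        return targets
    targets.append(first)

    while len(targets) < min(replication, len(nodes)):
        if len(targets) == 1:
            # Replica 2 (assumed rule): prefer a node on a different rack.
            preferred = [n for n in nodes if topology[n] != topology[first]]
        elif len(targets) == 2:
            # Replica 3 (assumed rule): prefer the same rack as replica 2.
            preferred = [n for n in nodes if topology[n] == topology[targets[1]]]
        else:
            preferred = nodes
        # Fall back to any free node when the preferred set is exhausted.
        targets.append(pick(preferred) or pick(nodes))
    return targets


if __name__ == "__main__":
    topology = {"dn1": "rack1", "dn2": "rack1", "dn3": "rack2", "dn4": "rack2"}
    print(choose_targets("dn1", topology))        # writer is a datanode
    print(choose_targets("edge-host", topology))  # writer is a plain client
```

With the writer on `dn1`, the first target is always `dn1`; with `edge-host`, which is not in the topology, the first target varies from run to run.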

On Wed, Oct 31, 2012 at 6:43 AM, Mohit Anchlia <mohitanchlia@gmail.com> wrote:
> Thanks. If the client is not a datanode, then I am guessing the namenode
> decides the nodes in the replication pipeline?
>
>
> On Tue, Oct 30, 2012 at 5:36 PM, ranjith raghunath
> <ranjith.raghunath1@gmail.com> wrote:
>>
>> If your client node is a datanode within your cluster, then the first copy
>> does get written to that datanode.
>>
>> Experts please feel free to correct me here.
>>
>> On Oct 30, 2012 7:11 PM, "Mohit Anchlia" <mohitanchlia@gmail.com> wrote:
>>>
>>> With respect to replication: if I run a Pig job from one of the nodes
>>> within the Hadoop cluster, do I always end up writing 1 replica copy to
>>> that client node and the remaining 2 replica copies to other nodes?
>>>
>
>



-- 
Harsh J
