hadoop-mapreduce-user mailing list archives

From "Naganarasimha G R (Naga)" <garlanaganarasi...@huawei.com>
Subject RE: How do I customize data placement on DataNodes (DN) of Hadoop cluster?
Date Wed, 28 Oct 2015 10:34:02 GMT
Hi Praveen and Salil,

If the data is written from one of the cluster nodes, preference is given to the
local node, irrespective of the rack configuration.
If it is written remotely (not from one of the cluster nodes), the blocks may get
distributed across the DataNodes.
Further, you could write a custom BlockPlacementPolicy by extending BlockPlacementPolicyDefault
and setting "dfs.block.replicator.classname" to that class, if required.
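
To make the custom-policy idea concrete, here is a minimal sketch of the decision such a policy might encode: fill each DataNode up to a per-node block quota. This is plain Java and does not use the Hadoop API. The class name QuotaPlacementSketch, the place method and the quota map are all hypothetical names chosen for illustration; a real policy would have to override the methods of BlockPlacementPolicyDefault for the Hadoop version in use.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical simulation only: models the placement decision, not the HDFS API.
public class QuotaPlacementSketch {

    /**
     * Assigns block indices 0..numBlocks-1 to nodes, filling each node up to
     * its quota in the order the nodes are listed in the map.
     * Throws if the quotas cannot hold all blocks.
     */
    static Map<String, List<Integer>> place(int numBlocks, Map<String, Integer> quotas) {
        Map<String, List<Integer>> layout = new LinkedHashMap<>();
        for (String node : quotas.keySet()) {
            layout.put(node, new ArrayList<>());
        }
        int block = 0;
        for (Map.Entry<String, Integer> entry : quotas.entrySet()) {
            for (int i = 0; i < entry.getValue() && block < numBlocks; i++) {
                layout.get(entry.getKey()).add(block++);
            }
        }
        if (block < numBlocks) {
            throw new IllegalArgumentException(
                "quotas hold fewer than " + numBlocks + " blocks");
        }
        return layout;
    }

    public static void main(String[] args) {
        // Quotas taken from the layout requested in the original question.
        Map<String, Integer> quotas = new LinkedHashMap<>();
        quotas.put("DN1", 2);
        quotas.put("DN2", 2);
        quotas.put("DN3", 1);
        quotas.put("DN4", 1);
        System.out.println(place(6, quotas));
    }
}
```

With replication factor 1 each block has exactly one target, so a quota table like this fully determines the layout. With a higher replication factor the policy would also have to choose the additional replicas.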

+ Naga

________________________________
From: praveen S [mylogin13@gmail.com]
Sent: Tuesday, October 27, 2015 17:44
To: user@hadoop.apache.org
Subject: Re: How do I customize data placement on DataNodes (DN) of Hadoop cluster?


Maybe using the rack concept would work.

On 27 Oct 2015 17:32, "Norah Jones" <nh.jones01@gmail.com> wrote:
Hi,

Suppose we change the default block size to 32 MB and the replication factor to 1, and
the Hadoop cluster consists of 4 DNs. Let the input data size be 192 MB. Now I want to
place the data on the DNs as follows: DN1 and DN2 contain 2 blocks (32+32 = 64 MB) each,
and DN3 and DN4 contain 1 block (32 MB) each.

Is this possible? How can I accomplish it?
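
As a quick check of the numbers above, the following sketch computes how many blocks a 192 MB input yields at a 32 MB block size and confirms that the requested layout (2, 2, 1, 1) accounts for all of them. The class and method names are hypothetical and nothing here touches HDFS.

```java
// Hypothetical helper for checking the block arithmetic in the question.
public class BlockCountSketch {

    /** Number of blocks needed for a file, rounding the last partial block up. */
    static long blockCount(long fileBytes, long blockBytes) {
        return (fileBytes + blockBytes - 1) / blockBytes;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        long blocks = blockCount(192 * mb, 32 * mb);

        // Requested blocks per DataNode: DN1, DN2, DN3, DN4.
        int[] requested = {2, 2, 1, 1};
        int total = 0;
        for (int n : requested) {
            total += n;
        }

        System.out.println("blocks in file: " + blocks);
        System.out.println("blocks in requested layout: " + total);
    }
}
```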

Thanks,
Salil

