hadoop-common-dev mailing list archives

From "Vaibhav J" <vaibh...@rediff.co.in>
Subject Problem : data distribution is non uniform between two different disks on datanode.
Date Mon, 16 Mar 2009 12:19:42 GMT
From: Vaibhav J [mailto:vaibhavj@rediff.co.in] 
Sent: Monday, March 16, 2009 5:46 PM
To: 'nutch-dev@lucene.apache.org'; 'nutch-user@lucene.apache.org'
Subject: Problem : data distribution is non uniform between two different
disks on datanode.

We have 27 datanodes and the replication factor is 1 (data size is ~6.75 TB).

We have specified two different disks for the DFS data directory on each
datanode by using the property dfs.data.dir in the hadoop-site.xml file of
the conf directory
(value of property dfs.data.dir: /mnt/hadoop-dfs/data, /mnt2/hadoop-dfs/data).

When we set the replication factor to 2, the data distribution is biased
toward the first disk: more data is copied to /mnt/hadoop-dfs/data, and after
some data has been copied the first disk becomes full and reports no
available space, even though we still have enough space on the second disk
(/mnt2/hadoop-dfs/data). So it is difficult to achieve replication factor 2.
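
For a sense of scale, the figures quoted in this message imply the following
average load per datanode. This is only a back-of-the-envelope sketch: it
assumes blocks are spread evenly across the 27 datanodes, which the message
does not state, and the function name per_node_tb is made up for illustration.

```python
# Rough per-datanode storage estimate.
# Assumption (not stated in the message): data is spread evenly across nodes.

def per_node_tb(total_data_tb, num_datanodes, replication):
    """Return the average raw storage each datanode must hold, in TB."""
    return total_data_tb * replication / num_datanodes


if __name__ == "__main__":
    # 6.75 TB of data on 27 datanodes, at replication factors 1 and 2.
    for replication in (1, 2):
        load = per_node_tb(6.75, 27, replication)
        print(f"replication {replication}: {load:.2f} TB per datanode")
```

Under that assumption each datanode holds about 0.25 TB at replication
factor 1 and about 0.5 TB at replication factor 2, summed over both disks.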

Data traffic is reaching the second disk (/mnt2/hadoop-dfs/data) as well, but
it looks like more data is copied to the first disk (/mnt/hadoop-dfs/data).
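
One way to put numbers on the imbalance is to compare how full the two
filesystems are. A minimal sketch, assuming it is run on a datanode where both
directories from dfs.data.dir are mounted; the helper name usage_report is
hypothetical, and it measures whole-filesystem usage, not only the HDFS block
files.

```python
import os
import shutil

# Directories taken from the dfs.data.dir value quoted in this message.
DATA_DIRS = ["/mnt/hadoop-dfs/data", "/mnt2/hadoop-dfs/data"]


def usage_report(paths):
    """Map each path to (used_fraction, free_bytes), or None if it is missing."""
    report = {}
    for path in paths:
        if not os.path.exists(path):
            report[path] = None
            continue
        usage = shutil.disk_usage(path)  # named tuple: total, used, free
        used_fraction = usage.used / usage.total if usage.total else 0.0
        report[path] = (used_fraction, usage.free)
    return report


if __name__ == "__main__":
    for path, entry in usage_report(DATA_DIRS).items():
        if entry is None:
            print(f"{path}: not found on this host")
        else:
            used_fraction, free_bytes = entry
            print(f"{path}: {used_fraction:.1%} used, "
                  f"{free_bytes / 1e9:.1f} GB free")
```

Running this on several datanodes would show whether the first disk is
consistently fuller than the second, or whether the two disks simply differ
in capacity.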

What should we do to get uniform data distribution between the two disks on
each datanode, so that we can achieve replication factor 2?

Regards

Vaibhav J.