hadoop-common-user mailing list archives

From Raj V <rajv...@yahoo.com>
Subject hdfs space problem.
Date Thu, 05 Aug 2010 15:33:26 GMT


I run a 512-node Hadoop cluster. Yesterday I moved 30 GB of compressed data from
an NFS-mounted partition by running the following on the namenode:

hadoop fs -copyFromLocal /mnt/data/data1 /mnt/data/data2 /mnt/data/data3 hdfs:/data

When the job completed, the local disk on the namenode was 40% full (most of it
used by the dfs directories), while the other nodes had 1% disk utilization.

Just to see if there was an issue, I deleted the hdfs:/data directory and 
restarted the move from a datanode. 

Once again, the disk on that datanode was substantially overutilized.

I would have assumed that the disk space would be more or less uniformly 
consumed on all the data nodes.

Is there a reason why one disk would be overutilized?

Do I have to run the balancer every time I copy data?

Am I missing something?

Raj