hadoop-hdfs-user mailing list archives

From "bit1129@163.com" <bit1...@163.com>
Subject Question about the behavior of HDFS.
Date Fri, 19 Dec 2014 03:48:43 GMT
Hi Hadoopers,

I have a question about the behavior of HDFS.

Say there is 1 namenode and 10 data nodes.

On the namenode machine, I upload a 1 GB file to HDFS. Will this file be distributed evenly
across the data nodes, with no data stored on the namenode?
If I instead upload the data from one of the data nodes, will the file still be distributed evenly
across all the data nodes? I think that if most of the data resides on the node I upload from, it
will save network traffic, but this leads to another problem: when a MapReduce job runs on this file,
most of the time will be spent on that node, because it has to process most of the data.
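
To make the scenario concrete, here is a rough sketch of the arithmetic behind the question. It assumes the Hadoop 2.x defaults of a 128 MB block size (dfs.blocksize) and a replication factor of 3 (dfs.replication). Neither value is stated above, so both are assumptions, and the helper name block_count is only illustrative. It shows how many blocks and block replicas there are to place. It says nothing about which nodes actually receive them.

```python
import math

# Assumptions (not stated in the message): Hadoop 2.x default block size
# and default replication factor.
BLOCK_SIZE = 128 * 1024 * 1024   # dfs.blocksize default, in bytes
REPLICATION = 3                  # dfs.replication default

# Taken from the scenario in the message.
DATANODES = 10
FILE_SIZE = 1024 * 1024 * 1024   # the 1 GB file


def block_count(file_size, block_size):
    """Number of HDFS blocks a file of file_size bytes is split into."""
    return math.ceil(file_size / block_size)


blocks = block_count(FILE_SIZE, BLOCK_SIZE)
replicas = blocks * REPLICATION

print(f"blocks: {blocks}")
print(f"block replicas to place: {replicas}")
print(f"average replicas per data node if spread evenly: {replicas / DATANODES}")
```

Under these assumptions the file is split into fewer blocks than there are data nodes, so "evenly" can only be judged at the level of block replicas, not of the file as a whole.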
