hadoop-general mailing list archives

From Yi Zhao <yi.z...@alibaba-inc.com>
Subject how to distribute the data to all the datanodes?
Date Mon, 14 Jul 2008 10:12:12 GMT
Hi all,
I have a Hadoop cluster with one master and three datanodes.

I want to put a local file of about 128M into HDFS, and I have set the
block size to 10M.
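
For reference, I set both values in conf/hadoop-site.xml, roughly like
this (10M = 10485760 bytes; these are the property names as I
understand them for 0.15.x):

  <property>
    <name>dfs.block.size</name>
    <value>10485760</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>3</value> <!-- 0 or 3 in my two tests below -->
  </property>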

When I set the replication to 0,
I found that all the data went to the node where I executed the
command 'bin/hadoop dfs -put file.gz input', so that node's disk usage
is about 128M, while the other nodes show no disk usage at all.
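
For completeness, the sequence I run is roughly the following; the -D
form for overriding replication per command is only my guess, I'm not
sure FsShell in 0.15.2 supports it:

  bin/hadoop dfs -put file.gz input
  # my guess at a per-copy override, if -D is supported here:
  bin/hadoop dfs -D dfs.replication=1 -put file.gz input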

When I set the replication to 3,
I found that every node holds the same data, so every node uses about
128M of disk space.
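
If it helps to diagnose, I understand fsck can show where each block
landed (assuming these options exist in 0.15.2; the path is just an
example, since the -put above puts the file under my HDFS home
directory):

  bin/hadoop fsck /user/<username>/input -files -blocks -locations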

What should I do? I'm using hadoop-0.15.2.

Can anyone help me?

Thanks.

