hadoop-common-user mailing list archives

From Null Ecksor <nulleck...@gmail.com>
Subject Hadoop file distribution
Date Sat, 10 Apr 2010 03:39:52 GMT
Hey guys,

I am a new Hadoop user.
I am running a MapReduce query on a fairly large file (3 GB). I first ran it
on a single-node Hadoop installation, where it took approximately 200 seconds.
I have now set up a Hadoop cluster on 10 machines and ran the same
query. This time it took nearly 230 seconds.

The command I'm using to load the data into HDFS is:

hadoop dfs -put *.dat /data/

How can I check whether the file is distributed among the 10 machines? And how
can I distribute the file among the datanodes to make the query faster?

