hadoop-common-user mailing list archives

From Eli Finkelshteyn <iefin...@gmail.com>
Subject HDFS Files Seem to be Stored in the Wrong Location?
Date Mon, 06 Feb 2012 16:21:45 GMT
I have a pseudo-distributed Hadoop cluster set up, and I'm currently 
hoping to put about 100 GB of files on it to play around with. I got a 
Unix box at work that no one else is using for this, and running df -h, I get:
Filesystem            Size  Used Avail Use% Mounted on
/dev/sda1             7.9G  2.4G  5.2G  31% /
none                  3.8G     0  3.8G   0% /dev/shm
/dev/sdb              414G  210M  393G   1% /mnt

Alright, so /mnt looks quite big and seems like a good place to store my 
HDFS files. I go ahead and create a folder named hadoop-data there and 
set the following in hdfs-site.xml:

<!-- where hadoop stores its files (datanodes only) -->

After a bit of troubleshooting, I restart the cluster and try to put a 
couple of test files onto HDFS. Doing an ls of hadoop-data, I see:

$ ls
current  image  in_use.lock  previous.checkpoint

OK, things look good. Time to try uploading some real data. Now, here's 
where the problem arises. If I add a 10 MB dummy file to hadoop-data 
through regular Unix tools and run df -h, I see that the used space of 
/mnt goes up by exactly 10 MB. But when I start running a big dump of 
data through:

hadoop fs -put ~/hadoop_playground/data2/data2/ /data/

I notice, running df -h, that the data seems to be going to completely 
the wrong location! Note that below, only the usage of /dev/sda1 has 
increased. /mnt has not moved.

Filesystem            Size  Used Avail Use% Mounted on
/dev/sda1             7.9G  3.4G  4.2G  45% /
none                  3.8G     0  3.8G   0% /dev/shm
/dev/sdb              414G  210M  393G   1% /mnt

So, what gives? Anyone have any clue how my files can seemingly be put 
in the hadoop-data folder and yet take up space elsewhere? I could see 
this being a Unix issue, but I figured I'd ask here just in case it's 
not, since I'm pretty stumped.
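
For anyone trying to reproduce this, here is a minimal sketch of a check 
that might narrow it down: it prints which mount point a directory lives 
on and how much space it uses. The mount_of helper is a made-up name for 
illustration, /mnt/hadoop-data is the folder described above, and /tmp 
is only a guess at another place where the data could be landing.

```shell
#!/bin/sh
# Hypothetical helper: print the mount point of the filesystem holding a path.
# df -P forces the one-line-per-filesystem POSIX format, where column 6 is
# "Mounted on". Mount points containing spaces are not handled.
mount_of() {
    df -P "$1" | awk 'NR == 2 { print $6 }'
}

# /mnt/hadoop-data comes from the post above; /tmp is an assumed second
# location to compare against, not something confirmed in the post.
for dir in /mnt/hadoop-data /tmp; do
    if [ -e "$dir" ]; then
        size=$(du -sh "$dir" 2>/dev/null | cut -f1)
        printf '%s is on %s and uses %s\n' "$dir" "$(mount_of "$dir")" "$size"
    else
        printf '%s does not exist\n' "$dir"
    fi
done
```

Running it before and after a hadoop fs -put should show which of the 
directories actually grows, and whether that directory sits on / or on 
/mnt.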

