hadoop-common-user mailing list archives

From "Agarwal, Nikhil" <Nikhil.Agar...@netapp.com>
Subject RE: How to add another file system in Hadoop
Date Fri, 22 Feb 2013 05:05:17 GMT
Hi All,

Thanks a lot for taking the time to answer my question.

Ling, thank you for pointing me to GlusterFS. I can certainly learn a lot from it, but what I wanted to ask about is this note in its README.txt:

>> # ./bin/start-mapred.sh
  If the map/reduce job/task trackers are up, all I/O will be done to GlusterFS.

So, suppose my input files are scattered across different nodes (GlusterFS servers). How do I (a Hadoop client with GlusterFS plugged in) issue a MapReduce job?
Moreover, after I submit a MapReduce job, would my Hadoop client fetch all the data from the different servers to my local machine and then run MapReduce there, or would it start the TaskTracker daemons on the machine(s) where the input files are located and perform the MapReduce there?
Please correct me if I am wrong, but I believe the locations of the input files are reported to MapReduce by the function getFileBlockLocations(FileStatus file, long start, long len).
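
To make my question concrete, here is how I currently picture the locality decision. This is a self-contained sketch, not Hadoop code: apart from the idea behind getFileBlockLocations, the class and helper names below (BlockInfo, hostsFor, pickLocalHost, the server names) are my own inventions for illustration.

```java
import java.util.Arrays;
import java.util.List;

// Sketch of locality-aware task placement. In Hadoop, a FileSystem
// subclass overrides getFileBlockLocations(FileStatus, long, long) to
// report which hosts hold each byte range; here a block is modeled
// directly as a byte range plus its hosts.
public class LocalitySketch {
    // One "block": bytes [offset, offset + length) stored on the given hosts.
    static class BlockInfo {
        final long offset, length;
        final List<String> hosts;
        BlockInfo(long offset, long length, List<String> hosts) {
            this.offset = offset; this.length = length; this.hosts = hosts;
        }
    }

    // Hosts holding the block that covers the split's start offset,
    // mimicking what getFileBlockLocations would report for that range.
    static List<String> hostsFor(List<BlockInfo> blocks, long splitStart) {
        for (BlockInfo b : blocks)
            if (splitStart >= b.offset && splitStart < b.offset + b.length)
                return b.hosts;
        return Arrays.asList(); // unknown: scheduler falls back to any node
    }

    // The scheduler prefers a free TaskTracker on a host that already has
    // the data; otherwise the task runs elsewhere and reads over the network.
    static String pickLocalHost(List<String> candidates, List<String> freeTrackers) {
        for (String h : candidates)
            if (freeTrackers.contains(h)) return h;
        return freeTrackers.get(0); // non-local fallback
    }

    public static void main(String[] args) {
        List<BlockInfo> blocks = Arrays.asList(
            new BlockInfo(0, 64, Arrays.asList("server1", "server2")),
            new BlockInfo(64, 64, Arrays.asList("server3")));
        String chosen = pickLocalHost(hostsFor(blocks, 64),
                                      Arrays.asList("server3", "server4"));
        System.out.println(chosen); // prints server3: data-local assignment
    }
}
```

If this picture is right, then the computation moves to the data rather than the other way around, which is what I am hoping to confirm.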

Thank you very much for your time and help.


From: Agarwal, Nikhil
Sent: Thursday, February 21, 2013 4:19 PM
To: 'user@hadoop.apache.org'
Subject: How to add another file system in Hadoop


I am planning to add a file system called CDMI under org.apache.hadoop.fs in Hadoop, similar to KFS or S3, which are already there under org.apache.hadoop.fs. Say I write my file system for CDMI and add the package under fs; how do I then tell core-site.xml (or the other configuration files) to use the CDMI file system? Where else do I need to make changes so that the CDMI file system becomes part of Hadoop?
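
From looking at how the existing file systems are wired in, my current guess is that the mapping from a URI scheme to an implementation class is a property of the form fs.<scheme>.impl in core-site.xml. Assuming my class ends up at org.apache.hadoop.fs.cdmi.CDMIFileSystem (a hypothetical name; wherever I actually place it), I would expect something like:

```xml
<!-- Map the cdmi:// URI scheme to my FileSystem subclass.
     The class name is my own placeholder. -->
<property>
  <name>fs.cdmi.impl</name>
  <value>org.apache.hadoop.fs.cdmi.CDMIFileSystem</value>
</property>

<!-- Optionally make it the default file system.
     "cdmi-server" is a placeholder host name. -->
<property>
  <name>fs.default.name</name>
  <value>cdmi://cdmi-server/</value>
</property>
```

Beyond the configuration, I assume the compiled class just has to be on Hadoop's classpath. Is that all, or are there other places I need to touch?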

Thanks a lot in advance.

