hadoop-common-user mailing list archives

From "Agarwal, Nikhil" <Nikhil.Agar...@netapp.com>
Subject RE: Can I perform an MR on my local filesystem
Date Sun, 17 Feb 2013 11:18:38 GMT

Thank you, Niels and Nitin, for your replies.

Actually, I want to run MR on an open-source cloud store. So I thought of implementing
a Hadoop file system for it and plugging it into Hadoop, just as has been done for S3 and KFS. This
would enable a Hadoop client to talk to "My cloud store". But I am still not clear on
how to run MR on the cloud store using Hadoop's JobTracker/TaskTracker framework.
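
For context on the plug-in point mentioned above: Hadoop 1.x chooses the file system implementation for a path from the URI scheme, by reading the configuration property `fs.<scheme>.impl` (for example `fs.s3.impl`). The sketch below only imitates that lookup in plain Java and does not use the Hadoop API. The scheme `mycloud` and the class name `com.example.MyCloudFileSystem` are hypothetical, chosen for illustration.

```java
import java.net.URI;
import java.util.Properties;

// Plain-Java imitation of Hadoop 1.x's scheme-to-implementation lookup.
// It does not depend on Hadoop; Properties stands in for the Hadoop Configuration.
class FsSchemeLookup {

    // Build the configuration key consulted for a given path URI,
    // e.g. "mycloud://store/input" -> "fs.mycloud.impl".
    static String implKey(URI path) {
        return "fs." + path.getScheme() + ".impl";
    }

    // Return the class name configured for the URI's scheme, or null if none is set.
    static String resolve(Properties conf, URI path) {
        return conf.getProperty(implKey(path));
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        // Entries of this shape ship in Hadoop 1.x's core-default.xml.
        conf.setProperty("fs.file.impl", "org.apache.hadoop.fs.LocalFileSystem");
        conf.setProperty("fs.s3.impl", "org.apache.hadoop.fs.s3.S3FileSystem");
        // Hypothetical entry for the custom cloud store (an assumption, not a real class).
        conf.setProperty("fs.mycloud.impl", "com.example.MyCloudFileSystem");

        System.out.println(resolve(conf, URI.create("mycloud://store/input")));
        System.out.println(resolve(conf, URI.create("file:///tmp/input")));
    }
}
```

In a real deployment the equivalent entry would go into core-site.xml, and the named class would need to extend Hadoop's `FileSystem` and be on the classpath of the client and the daemons.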

The link Niels gave shows that I can run MR on the local file system. So is there
any way of telling the JobTracker to read data from a set of nodes, deploy TaskTracker
daemons on those nodes (which would be "My cloud store" in this case), and fetch the result
of the MR job?

Note: I do not want to fetch the data to my local computer, as is the case with S3. Fetching
the data would defeat the purpose of using Hadoop (which is moving compute to data).
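
"Moving compute to data" depends on the file system reporting which hosts hold each piece of input (in Hadoop, through `FileSystem.getFileBlockLocations`), so that the scheduler can prefer a TaskTracker running on one of those hosts. The following is a minimal plain-Java sketch of that preference, not Hadoop's actual scheduler. The split names and host names are made up for illustration.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Simplified illustration of locality-aware task assignment (not Hadoop code).
class LocalityPicker {

    // Choose a pending split for the tracker running on trackerHost.
    // A split stored on that host is preferred; otherwise the first pending
    // split is returned, which would imply a remote read. Returns null if
    // nothing is pending.
    static String pickSplit(Map<String, List<String>> splitHosts, String trackerHost) {
        for (Map.Entry<String, List<String>> e : splitHosts.entrySet()) {
            if (e.getValue().contains(trackerHost)) {
                return e.getKey(); // data-local assignment
            }
        }
        for (String split : splitHosts.keySet()) {
            return split; // no local data: fall back to any pending split
        }
        return null;
    }

    public static void main(String[] args) {
        // Hypothetical splits and the hosts that store them, in insertion order.
        Map<String, List<String>> splitHosts = new LinkedHashMap<>();
        splitHosts.put("split-0", Arrays.asList("node1", "node2"));
        splitHosts.put("split-1", Arrays.asList("node3"));

        System.out.println(pickSplit(splitHosts, "node3")); // local match
        System.out.println(pickSplit(splitHosts, "node9")); // no local data
    }
}
```

Under this model, a custom file system that reports no host information leaves every assignment in the fallback branch, so all reads would be remote even if TaskTrackers ran on the storage nodes.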


From: Agarwal, Nikhil
Sent: Sunday, February 17, 2013 11:53 AM
To: 'user@hadoop.apache.org'
Subject: Can I perform an MR on my local filesystem


Recently I followed a blog to run Hadoop on a single node cluster.

In a single-node set-up of Hadoop, is it necessary to copy the data
into HDFS before running an MR job on it? Or can I run MR on my local file system without
copying the data to HDFS?

In the Hadoop source code I saw that there are implementations of other file systems too, such as S3,
KFS, FTP, etc. So how exactly does an MR job run on an S3 data store? How does the JobTracker or TaskTracker
run on S3?

I would be very thankful to get a reply to this.

Thanks & Regards,

