hadoop-common-user mailing list archives

From "Poole, Samuel [USA]" <poole_sam...@bah.com>
Subject Hadoop for Independent Tasks not using Map/Reduce?
Date Wed, 19 Aug 2009 14:05:07 GMT
I am new to Hadoop (I have not yet installed/configured it), and I want to make sure that I have
the correct tool for the job.  I do not "currently" have a need for the Map/Reduce functionality,
but I am interested in using Hadoop for task orchestration, task monitoring, etc. over numerous
nodes in a computing cluster.  Our primary programs (written in C++ and launched via shell
scripts) each run independently on a single node, but are deployed to different nodes for
load balancing.  I want to launch these processes on different nodes from a Java
program located on a central server.  I was hoping to use Hadoop as a foundation for this.

I read the following in the FAQ section:

"How do I use Hadoop Streaming to run an arbitrary set of (semi-)independent tasks?

Often you do not need the full power of Map Reduce, but only need to run multiple instances
of the same program - either on different parts of the data, or on the same data, but with
different parameters. You can use Hadoop Streaming to do this."
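For what it's worth, a minimal sketch of what that FAQ approach might look like (file names, paths, and the wrapper script here are hypothetical, and the streaming jar location varies by Hadoop version): put one task's arguments on each input line and run a maps-only streaming job, so each "map task" is really just one independent run of the program.

```shell
# tasks.txt: one line per independent task, each line holding the
# arguments for one run of the C++ program.
hadoop fs -put tasks.txt /user/sam/tasks.txt

# Run a maps-only streaming job: each mapper receives a slice of the
# input lines on stdin and invokes the program once per line. With
# zero reducers, the job is just N independent tasks that Hadoop
# schedules and monitors across the cluster.
hadoop jar $HADOOP_HOME/contrib/streaming/hadoop-streaming.jar \
    -input  /user/sam/tasks.txt \
    -output /user/sam/tasks-out \
    -mapper run_task.sh \
    -file   run_task.sh \
    -numReduceTasks 0
```

where run_task.sh is a small wrapper around the existing binary:

```shell
#!/bin/sh
# run_task.sh (hypothetical): read one parameter line at a time from
# stdin and launch the existing C++ binary with those arguments.
while read args; do
    ./my_cpp_program $args
done
```

Anything the program writes to stdout ends up in the job's output directory, and the JobTracker web UI gives you the per-task monitoring.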

So, two questions I guess.

1.  Can I use Hadoop for this purpose without using Map/Reduce functionality?

2.  Are there any examples available on how to implement this sort of configuration?

Any help would be greatly appreciated.

