From: Kirk True
Date: Thu, 20 Dec 2007 18:46:58 -0800 (PST)
Subject: Appropriate use of Hadoop for non-map/reduce tasks?
To: hadoop-user@lucene.apache.org

Hi all,

A lot of the ideas I have for incorporating Hadoop into internal projects revolve around distributing long-running tasks over multiple machines. I've been able to get a quick prototype up in Hadoop for one of those projects, and it seems to work pretty well.

However, in this project and the others, I'm not processing a lot of text, or mapping or reducing anything. I'm basically processing a lot of work asynchronously over many machines in a master/worker paradigm rather than map/reduce.

I have shown that I can achieve what I'm looking for with Hadoop. I just can't get over the "feeling" that I'm shoe-horning it into a use it wasn't really meant for. We've done a similar project with Gigaspaces, but Hadoop seems to alleviate a lot of the burden of what we're doing moving forward.

Thoughts?

Thanks,
Kirk
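For readers unfamiliar with the distinction being drawn: the master/worker pattern described above can be sketched as follows. This is a minimal single-process analogue in plain Java (not the Hadoop API — class and method names here are hypothetical, invented for illustration): a master submits independent work items to a pool of workers and collects results asynchronously, with no map or reduce phase involved.

```java
import java.util.*;
import java.util.concurrent.*;

/**
 * Minimal single-process sketch of the master/worker pattern
 * (hypothetical example; in the scenario above, the workers would
 * be tasks running on separate machines rather than threads).
 */
public class MasterWorkerSketch {

    // Worker: processes one independent, long-running work item.
    // Here it just squares the input as a stand-in for real work.
    static int process(int task) {
        return task * task;
    }

    // Master: distributes tasks to a worker pool and gathers results
    // in submission order.
    public static List<Integer> run(List<Integer> tasks, int workers)
            throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        List<Future<Integer>> futures = new ArrayList<>();
        for (int t : tasks) {
            futures.add(pool.submit(() -> process(t)));
        }
        List<Integer> results = new ArrayList<>();
        for (Future<Integer> f : futures) {
            results.add(f.get()); // block until each worker finishes
        }
        pool.shutdown();
        return results;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(run(Arrays.asList(1, 2, 3, 4), 2));
    }
}
```

The key structural point is that the workers never exchange intermediate keyed data: there is no shuffle and no reduce, which is exactly why running such workloads as map-only Hadoop jobs can feel like a shoe-horn.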