hadoop-common-user mailing list archives

From jason hadoop <jason.had...@gmail.com>
Subject Re: Complex workflows in Hadoop
Date Thu, 16 Apr 2009 15:33:23 GMT
Chaining described in chapter 8 of my book provides this to a limited
degree.

Cascading, http://www.cascading.org/, also supports complex flows. I do not
know how Cascading works under the covers.
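
To make the chaining idea concrete, here is a minimal sketch in plain Python.
It is not Hadoop's API and not the code from the book; every name in it
(run_chain, apply_mapper, tokenize, drop_short, total, keep_frequent) is
hypothetical. It assumes that chaining means running several map steps before
and after a single reduce, handing records from one step to the next in memory
instead of writing them to HDFS in between.

```python
from collections import defaultdict


def apply_mapper(mapper, pairs):
    """Lazily feed each (key, value) pair through one map step."""
    for key, value in pairs:
        yield from mapper(key, value)


def run_chain(pairs, mappers, reducer, post_mappers=()):
    """Run map steps, one in-memory group-by-key, one reduce, then more map steps.

    Nothing is written to disk between steps (assumption: the data fits in memory).
    """
    for mapper in mappers:
        pairs = apply_mapper(mapper, pairs)

    # Stand-in for the shuffle: collect all values under their key.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)

    reduced = (out for key in sorted(groups) for out in reducer(key, groups[key]))
    for mapper in post_mappers:
        reduced = apply_mapper(mapper, reduced)
    return list(reduced)


# Hypothetical steps: tokenize lines, drop short words, count, keep repeated words.
def tokenize(_offset, line):
    for word in line.split():
        yield word.lower(), 1


def drop_short(word, count):
    if len(word) > 2:
        yield word, count


def total(word, counts):
    yield word, sum(counts)


def keep_frequent(word, n):
    if n >= 2:
        yield word, n


lines = [(0, "Hadoop runs map tasks"), (1, "map then reduce in Hadoop")]
result = run_chain(lines, [tokenize, drop_short], total, [keep_frequent])
print(result)
```

The sketch has only one group-by-key step, which is one way to read "to a
limited degree": a workflow that needs a second grouping or join on the output
of the first would still need a second pass over the data.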

On Thu, Apr 16, 2009 at 8:23 AM, Shevek <hadoop@anarres.org> wrote:

> On Tue, 2009-04-14 at 07:59 -0500, Pankil Doshi wrote:
> > Hey,
> >
> > I am trying to run complex queries on Hadoop that require more than one
> > job to get the final result. The results of job 1 capture a few joins of
> > the query, and I want to pass those results as input to job 2 for further
> > processing to get the final results. The queries are such that I can't do
> > all types of joins and filtering in job 1, so I require two jobs.
> >
> > Right now I write the results of job 1 to HDFS and read them for job 2, but
> > that takes unnecessary I/O time. So I was looking for a way to store the
> > results of job 1 in memory and use them as input for job 2.
>
> Hi,
>
> I am a programming language and compiler designer. We have a workflow
> engine which is capable of taking a description of a complex workflow
> and analysing it as a multi-stage map-reduce system to generate an
> optimal resource allocation. I'm hunting around for people who have
> problems like this, since I'm considering whether to port the whole
> thing to hadoop as a high-level language.
>
> Do you, or any other users have descriptions of workflows more complex
> than "one map, maybe one reduce" which you would like to be able to
> express easily?
>
> S.
>
>


-- 
Alpha Chapters of my book on Hadoop are available
http://www.apress.com/book/view/9781430219422
