hadoop-common-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From James Seigel Tynt <ja...@tynt.com>
Subject Re: Estimating Time required to compute M/Rjob
Date Mon, 18 Apr 2011 00:25:46 GMT
Yup. I'm boring 

On 2011-04-17, at 6:07 PM, Ted Dunning <tdunning@maprtech.com> wrote:

> Turing completion isn't the central question here, really.  The truth
> is, map-reduce programs have considerably pressure to be written in a
> scalable fashion which limits them to fairly simple behaviors that
> result in pretty linear dependence of run-time on input size for a
> given program.
> The cool thing about the paper that I linked to the other day is that
> there are enough cues about the expected runtime of the program
> available to make good predictions *without* looking at the details.
> No doubt the estimation facility could make good use of something as
> simple as the hash of the jar in question, but even without that it is
> possible to produce good estimates.
> I suppose that this means that all of us Hadoop programmers are really
> just kind of boring folk.  On average, anyway.
> On Sun, Apr 17, 2011 at 12:07 PM, Matthew Foley <mattf@yahoo-inc.com> wrote:
>> Since general M/R jobs vary over a huge (Turing problem equivalent!) range of behaviors,
a more tractable problem might be to characterize the descriptive parameters needed to answer
the question: "If the following problem P runs in T0 amount of time on a certain benchmark
platform B0, how long T1 will it take to run on a differently configured real-world platform
B1 ?"

View raw message