hadoop-common-dev mailing list archives

From "Bryan A. P. Pendleton" ...@geekdom.net>
Subject What's wrong with speculative execution, again?
Date Thu, 11 Jan 2007 19:57:30 GMT
I know the default was changed to "off" because of some bug. What's the
nature of the problem?

I ran a job last night that hung for a long time because a task somehow got
assigned to a tasktracker that wasn't taking tasks - the task stayed as
"UNASSIGNED" in status indefinitely. I eventually killed the tasktracker,
which let the whole job finish. Had speculative execution been on,
there'd have been no problem here. Not sure if this is a new bug, or somehow
related to the core speculative execution bug, but it'd also be nice to
have speculative execution turned back on, as it really does cut the
turnaround time on jobs.

I'm now regularly running jobs that occupy ~100 CPUs for a half day or so,
and the lack of speculative execution plus the occasional wacky machine
causes the turnaround on these jobs to go up by large fractions of the total
job time, so I'd love to see this problem go (back) away.

-- 
Bryan A. P. Pendleton
Ph: (877) geek-1-bp
