hadoop-common-dev mailing list archives

From "Paul Sutter" <sut...@gmail.com>
Subject Re: Job scheduling (Re: Unable to run more than one job concurrently)
Date Fri, 19 May 2006 19:25:52 GMT
Priority vs. pretend dates: whatever is most trivial, I'm in favor. We could
certainly use it today.

Delaying reducers: regardless of the bottleneck in the system, Andrzej's
problem was that reducers were sitting idle while the mappers of another
job were running.

paul


On 5/19/06, Doug Cutting <cutting@apache.org> wrote:
>
> Paul Sutter wrote:
> > (1) Allow submission times in the future, enabling the creation of
> > "background" jobs. My understanding is that job submission times are
> > used to prioritize scheduling. All tasks from a job submitted early
> > run to completion before those of a job submitted later. If we could
> > submit any days-long jobs with a submission time in the future, say
> > the year 2010, and any short hours-long jobs with the current time,
> > the short jobs would be able to interrupt the long ones. Hack? Yes.
> > Useful? I think so.
>
> I think this is equivalent to adding a job priority, where tasks with
> the highest priority job are run first.  If jobs are at the same
> priority, then the first submitted would run.  Adding priority would add
> a bit more complexity, but would also be less of a hack.
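
A rough sketch of what priority-then-submission ordering could look like
(illustrative Java only; the class and field names are invented, not the
actual JobTracker code):

  import java.util.Comparator;

  // Hypothetical job descriptor -- names are assumptions for illustration.
  class QueuedJob {
      int priority;      // higher value = more urgent
      long submitTime;   // submission time in millis, used as tie-breaker
  }

  class JobOrder implements Comparator<QueuedJob> {
      public int compare(QueuedJob a, QueuedJob b) {
          if (a.priority != b.priority) {
              return a.priority > b.priority ? -1 : 1;  // highest priority first
          }
          // equal priority: the job submitted earliest runs first
          if (a.submitTime != b.submitTime) {
              return a.submitTime < b.submitTime ? -1 : 1;
          }
          return 0;
      }
  }

The "pretend dates" hack falls out as the special case where every job has
the same priority and the submission time is simply faked.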
>
> > (2) Have a per-job total task count limit. Currently, we establish
> > the number of tasks each node runs, and how many map or reduce tasks
> > we have in total in a given job. But it would be great if we could
> > set a ceiling on the number of tasks that run concurrently for a
> > given job. This may help with Andrzej's fetcher (since it is
> > bandwidth constrained, maybe fewer concurrent tasks would be fine?).
>
> I like this idea.  So if the highest-priority job is already running at
> its task limit, then tasks can be run from the next highest-priority
> job.  Should there be separate limits for maps and reduces?
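
A sketch of how a per-job concurrency cap with separate map and reduce
limits might be checked (illustrative Java; the names are made up, not
existing JobTracker fields):

  // Hypothetical per-job ceilings; the defaults mean "no limit".
  class JobLimits {
      int maxRunningMaps    = Integer.MAX_VALUE;
      int maxRunningReduces = Integer.MAX_VALUE;
  }

  class RunningJob {
      JobLimits limits = new JobLimits();
      int runningMaps;
      int runningReduces;

      // May this job be handed another task of the given kind right now?
      boolean canScheduleMap()    { return runningMaps    < limits.maxRunningMaps; }
      boolean canScheduleReduce() { return runningReduces < limits.maxRunningReduces; }
  }

The scheduler would then walk the jobs in priority order and take a task
from the first job that still has pending work and is below its cap, so a
capped high-priority job leaves its unused slots to the next job in line.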
>
> > (3) Don't start the reducers until a certain fraction of the mappers
> > have completed (25%? 75%? 90%?). This optimization of starting early
> > will be less important when we've solved the map output copy
> > problems.
>
> Would this only be done if there were other jobs competing for reduce
> slots?  Otherwise this could have an impact on performance.
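
A minimal sketch of such a gate, assuming a configurable completion
fraction (the threshold and names here are invented, not an existing
Hadoop property):

  // Hold back reduce tasks until enough maps have finished.
  class ReduceGate {
      private final double mapCompletionThreshold;   // e.g. 0.25, 0.75, 0.90

      ReduceGate(double threshold) { this.mapCompletionThreshold = threshold; }

      boolean mayStartReduces(int finishedMaps, int totalMaps) {
          if (totalMaps == 0) return true;            // nothing to wait for
          return (double) finishedMaps / totalMaps >= mapCompletionThreshold;
      }
  }

As noted above, it would probably make sense to apply the gate only when
other jobs are actually waiting for the reduce slots, since otherwise it
just delays the shuffle.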
>
> In theory, network bandwidth is the primary MapReduce bottleneck.  Map
> input usually doesn't cross the network, but map output must generally
> cross the network to reduce nodes.  Sorting can be done in parallel with
> mapping and copying of map output to reduce nodes, but reduce output
> cannot begin until all map output has arrived.  Finally, in general,
> reduce output must cross the network, for replication.  So the data must
> cross the network twice: once between map and reduce and once on reduce
> output, and these two transfers are serialized.  Since the network is
> the bottleneck, we will get the best performance when we saturate it
> continuously.  We should thus begin transferring map outputs as soon
> as they are available.  That's the theory.
>
> Doug
>
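
To make the "crosses the network twice" point concrete, a toy
back-of-envelope calculation (all numbers are invented, and the
serialization of the two transfers is the assumption from the paragraph
above):

  public class NetworkBound {
      public static void main(String[] args) {
          double mapOutputGB    = 1000;  // shuffled map output that crosses the network
          double reduceOutputGB = 200;   // reduce output written to the DFS
          int    extraReplicas  = 2;     // copies beyond the local one (replication = 3)
          double clusterGBps    = 10;    // aggregate usable network bandwidth

          double shuffleSec   = mapOutputGB / clusterGBps;
          double replicateSec = reduceOutputGB * extraReplicas / clusterGBps;

          // The shuffle and the output replication cannot overlap, so the
          // network keeps the job busy for at least their sum.
          System.out.println("network lower bound: " + (shuffleSec + replicateSec) + " s");
      }
  }

With these made-up numbers the job cannot finish in under 140 seconds no
matter how many map and reduce slots are available, which is why keeping
the network continuously saturated matters.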
