hadoop-mapreduce-issues mailing list archives

From "Hong Tang (JIRA)" <j...@apache.org>
Subject [jira] Commented: (MAPREDUCE-1229) [Mumak] Allow customization of job submission policy
Date Tue, 01 Dec 2009 04:20:20 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-1229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12784046#action_12784046 ]

Hong Tang commented on MAPREDUCE-1229:

Attached new patch that addresses the comments by Dick.

bq. 1: Should TestSimulator*JobSubmission check to see whether the total "runtime" was reasonable for the Policy?
Currently, each policy is tested as a separate test case. It may be hard to combine them and
compare the virtual runtime, which is only present as console output. I did run some basic
sanity checks manually after the run.

bq. 2: minor nit: Should SimulatorJobSubmissionPolicy/getPolicy(Configuration) use valueOf(policy.toUpper()) instead of looping through the types?
Updated in the patch based on the suggestion.
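
For readers following along, the valueOf-based lookup suggested in point 2 might look roughly like the sketch below. This is not the code from the patch: the enum name, its constants (only STRESS is named in this thread), the REPLAY default, and the fromName helper are assumptions made for illustration. Note that the Java method is String.toUpperCase, not toUpper.

```java
import java.util.Locale;

// Hypothetical stand-in for SimulatorJobSubmissionPolicy; the constant names
// and the default are assumptions, not taken from the patch.
enum SubmissionPolicy {
  REPLAY, SERIAL, STRESS;

  // Resolve a configured policy name case-insensitively via Enum.valueOf
  // instead of looping over values(). An unknown name makes valueOf throw
  // IllegalArgumentException.
  static SubmissionPolicy fromName(String name) {
    if (name == null) {
      return REPLAY; // assumed default when nothing is configured
    }
    // Locale.ROOT keeps the upper-casing independent of the JVM's default locale.
    return valueOf(name.trim().toUpperCase(Locale.ROOT));
  }
}
```

With this shape, a misspelled policy name fails fast with an exception rather than silently falling back to a default.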

bq. 3: medium sized nit: in SimulatorJobClient.isOverloaded() there are two literals, 0.9 and 2.0F, that ought to be static private named values.
Added final variables to represent the magic constants, and added comments.
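
As an illustration of point 3 only, a check with named constants could be shaped like the sketch below. The two values 0.9 and 2.0F come from the review comment; what each threshold measures, the constant names, the method signature, and the way the two conditions are combined are all assumptions, since the thread does not describe the body of isOverloaded().

```java
// Hypothetical sketch; not the SimulatorJobClient code from the patch.
final class LoadCheck {
  // Assumed meaning: fraction of map slots that must be occupied before the
  // cluster can be considered overloaded.
  private static final double OVERLOAD_SLOT_OCCUPANCY = 0.9;

  // Assumed meaning: ratio of pending map tasks to total map slots above
  // which the backlog counts as large.
  private static final float OVERLOAD_PENDING_TO_SLOT_RATIO = 2.0F;

  private LoadCheck() {}

  static boolean isOverloaded(int occupiedMapSlots, int totalMapSlots, int pendingMapTasks) {
    if (totalMapSlots <= 0) {
      return true; // no capacity at all: treat as overloaded
    }
    boolean slotsNearlyFull = occupiedMapSlots >= OVERLOAD_SLOT_OCCUPANCY * totalMapSlots;
    boolean backlogLarge = pendingMapTasks >= OVERLOAD_PENDING_TO_SLOT_RATIO * totalMapSlots;
    return slotsNearlyFull && backlogLarge;
  }
}
```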

bq. 4: Here is my biggest point. The existing code cannot submit a job more often than once every five seconds when the jobs were spaced further apart than that and the policy is STRESS
bq. Please consider adding code to call the processLoadProbingEvent core code when we processJobCompleteEvent or a processJobSubmitEvent. That includes potentially adding a new LoadProbingEvent. This can lead to an accumulation because each LoadProbingEvent replaces itself, so we should track the ones that are in flight in a PriorityQueue and only add a new LoadProbingEvent whenever the new event has a time stamp strictly earlier than the earliest one already in flight. This will limit us to two events in flight with the current adjustLoadProbingInterval.
bq. If you don't do that, then if a real dreadnaught of a job gets dropped into the system and the probing interval gets long it could take us a while to notice that we're okay to submit jobs, in the case where the job has many tasks finishing at about the same time, and we could submit tiny jobs as onsies every five seconds when the cluster is clear enough to accommodate lots of jobs. When the cluster can handle N jobs in less than 5N seconds for some N, we won't overload it with the existing code.
I changed the minimum load probing interval to 1 second (from 5 seconds). Note that when
a job is submitted, it could take a few seconds before the JobTracker (JT) assigns the map
tasks to TaskTrackers (TTs) with free map slots, so reducing this interval further could
lead to artificial load spikes.
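
For reference, the in-flight tracking that Dick describes in point 4 (which this patch does not adopt) could be sketched as follows. The class and method names are hypothetical; only the rule itself, admitting a new probe only when its time stamp is strictly earlier than the earliest one in flight, comes from the review comment.

```java
import java.util.PriorityQueue;

// Hypothetical helper; not part of the patch.
final class ProbeTracker {
  // Time stamps (virtual milliseconds) of probing events not yet fired,
  // ordered so that peek() returns the earliest.
  private final PriorityQueue<Long> inFlight = new PriorityQueue<>();

  // Returns true and records the probe if nothing is in flight or the new
  // time stamp is strictly earlier than the earliest one already queued.
  boolean offer(long timestamp) {
    Long earliest = inFlight.peek();
    if (earliest != null && timestamp >= earliest) {
      return false; // an equally early or earlier probe is already pending
    }
    inFlight.add(timestamp);
    return true;
  }

  // Call when a probing event fires so it no longer counts as in flight.
  void fired(long timestamp) {
    inFlight.remove(timestamp);
  }

  int pending() {
    return inFlight.size();
  }
}
```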

I also added a load check after each job completion: if the cluster is underloaded, we
submit another job (and reset the load checking interval to the minimum value). This does
introduce a potential danger: when many jobs happen to complete at the same time, a lot of
jobs could be injected into the system at once. But I think that risk is fairly low, so I
would not worry much about it.
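
A rough sketch of the behavior described above, for illustration only. The 1-second minimum and the reset on submission come from this comment; the class and method names, the doubling back-off, and the maximum interval are assumptions, since the thread does not say how adjustLoadProbingInterval grows the interval.

```java
// Hypothetical sketch of the STRESS submission logic; not the patch code.
final class StressSubmitter {
  static final long MIN_PROBE_INTERVAL_MS = 1_000L;   // minimum stated in this comment
  static final long MAX_PROBE_INTERVAL_MS = 320_000L; // assumed upper bound

  private long probeIntervalMs = MIN_PROBE_INTERVAL_MS;
  private int submittedJobs = 0;

  // Periodic load probe: back off while overloaded (doubling is an assumed
  // policy), otherwise submit the next job.
  void onLoadProbe(boolean overloaded) {
    if (overloaded) {
      probeIntervalMs = Math.min(probeIntervalMs * 2, MAX_PROBE_INTERVAL_MS);
    } else {
      submitNextJob();
    }
  }

  // Extra check after each job completion: if the cluster is underloaded,
  // submit another job right away instead of waiting for the next probe.
  void onJobComplete(boolean overloaded) {
    if (!overloaded) {
      submitNextJob();
    }
  }

  private void submitNextJob() {
    submittedJobs++;
    probeIntervalMs = MIN_PROBE_INTERVAL_MS; // reset to the minimum on submission
  }

  long probeIntervalMs() {
    return probeIntervalMs;
  }

  int submittedJobs() {
    return submittedJobs;
  }
}
```

In this shape, a burst of simultaneous completions triggers one submission per completion for as long as the cluster still reports itself as underloaded, which is the risk noted above.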

> [Mumak] Allow customization of job submission policy
> ----------------------------------------------------
>                 Key: MAPREDUCE-1229
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1229
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: contrib/mumak
>    Affects Versions: 0.21.0, 0.22.0
>            Reporter: Hong Tang
>            Assignee: Hong Tang
>             Fix For: 0.21.0, 0.22.0
>         Attachments: mapreduce-1229-20091121.patch, mapreduce-1229-20091123.patch, mapreduce-1229-20091130.patch
> Currently, mumak replays job submissions faithfully. To make mumak useful for evaluation
> purposes, it would be great if we can support other job submission policies such as sequential
> job submission, or stress job submission.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
