hadoop-mapreduce-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Hong Tang (JIRA)" <j...@apache.org>
Subject [jira] Commented: (MAPREDUCE-1594) Support for Sleep Jobs in gridmix
Date Mon, 05 Apr 2010 05:21:27 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-1594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12853309#action_12853309
] 

Hong Tang commented on MAPREDUCE-1594:
--------------------------------------

- GenerateData should extend from GridmixJob instead of LoadJob. I think we can have a default
implementation of buildSplits (as an empty function) in GridmixJob and remove the "abstract"
keyword.
- The indentation of JobCreator looks weird.
- Replace the following with Configuration.getEnum:
{noformat}
+    Configuration conf, JobCreator defaultPolicy) {
+    String policy = conf.get(GRIDMIX_JOB_TYPE, defaultPolicy.name());
+    return valueOf(policy.toUpperCase());
{noformat}
- unused and extra imports (CommonConfigurationKeys, FsPermission) in TestGridmixSubmission.java.
- There does not seem be any unit tests covering the added feature.
- Avoiding directly setting "gridmix.job.seq" in both LoadJob and SleepJob. Instead, refactor
the statement to a common method in GridmixJob called setSeqId(Job job). Similarly, adding
a method getSeqId(Job job) in GridmixJob and avoid directly calling conf.get("girdmix.job.seq",
-1) in {GridmixInputFormat, SleepInputFormat}.getSplits(...).
- Should we rename Gridmix{Mapper, Reducer, InputFormat, RecordReader, Split} to Load{Mapper,
Reducer, InputFormat, RecordReader, Split}?
- Also suggest to refactor the statement of setting the original job id to a method in GridmixJob.
- There are some comments in LoadJob that are hard to understand.
{noformat}
  // TODO replace with ThreadLocal submitter?

  // not nesc when TL
{noformat}
- reduces in SleepMapper is not used.
- SleepJob hacks GridmixKey to pass along the sleep duration from map tasks to reduce tasks.
We should document that in the code and file a jira to fix it.
- The following code does not seem to do what the comments claim to:
{noformat}
      //This is to stop accumulating deviation from expected sleep time
      //over a period of time.
      long now = System.nanoTime() / 1000000;
      if (now < duration) {
        duration = duration - now;
      }
      long slept = 0L;
      long sleep = 0L;
      while (slept < duration) {
        final long rem = duration - slept;
        sleep = Math.min(rem, RINTERVAL);
        context.setStatus("Sleeping... " + rem + " ms left");
        slept += sleep;
        TimeUnit.MILLISECONDS.sleep(sleep);
      }
{noformat}

Should it be more like the following?
{noformat}
      //This is to stop accumulating deviation from expected sleep time
      //over a period of time.
      long start = System.nanoTime() / 1000000;
      long slept = 0L;
      long sleep = 0L;
      while (slept < duration) {
        final long rem = duration - slept;
        sleep = Math.min(rem, RINTERVAL);
        context.setStatus("Sleeping... " + rem + " ms left");
        TimeUnit.MILLISECONDS.sleep(sleep);
        slept = System.nanoTime() / 1000000 - start;
      }
{noformat}

- The following seems incorrect:
{noformat}
      final long RINTERVAL = TimeUnit.MILLISECONDS.convert(
        context.getConfiguration().getLong(
          GRIDMIX_SLEEP_INTERVAL, Math.min(
            1, duration / 20)), TimeUnit.SECONDS);
{noformat}
"duration" is in ms not in seconds. It should be changed to
{noformat}
      final long RINTERVAL = TimeUnit.MILLISECONDS.convert(
        context.getConfiguration().getLong(
          GRIDMIX_SLEEP_INTERVAL, Math.min(
            1, Math.max(1, duration/1000/20))), TimeUnit.SECONDS);
{noformat}
- I cannot find anywhere outdir is used by SleepJob. Did you encounter an error if FOF.setOutputPath
is commented out in SleepJob.call()? 
- Both SleepJob and GridmixJob calls FileInputFormat.addInputPath(job, new path("ignored")),
but one is surrounded with a try-catch block and the other is not. Not sure why. I am also
curious to know what would be the error if FIF.addInputPath is not called in both classes.


> Support for Sleep Jobs in gridmix
> ---------------------------------
>
>                 Key: MAPREDUCE-1594
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1594
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>          Components: contrib/gridmix
>            Reporter: rahul k singh
>         Attachments: 1376-5-yhadoop20-100-3.patch, 1594-yhadoop-20-1xx-1-2.patch, 1594-yhadoop-20-1xx-1-3.patch,
1594-yhadoop-20-1xx-1.patch, 1594-yhadoop-20-1xx.patch
>
>
> Support for Sleep jobs in gridmix

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


Mime
View raw message