hadoop-mapreduce-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Chris Douglas (JIRA)" <j...@apache.org>
Subject [jira] Commented: (MAPREDUCE-1408) Allow customization of job submission policies
Date Sat, 13 Feb 2010 10:01:28 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-1408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12833354#action_12833354
] 

Chris Douglas commented on MAPREDUCE-1408:
------------------------------------------

* All the fields in {{StatsCollector}} can be final
* Jobs pushed in {{StatsCollector::add}} do not need to be retained in a list, since now only
its size is used. For the current purpose, it is sufficient to keep only a counter
* Instead of fixing {{MAX_JOBS_COMPLETED_IN_POLL_INTERVAL}} to 1, it would make sense to let
this parameter be configurable. {{SERIAL}} submission should set it to 1 but {{STRESS}} can
be less aggressive by default. It would make sense to let this be defined by the enum, as
the polling interval is in the current patch
* Excluding {{SERIAL}} where the max jobs per interval is implicit, letting the {{STRESS}}
and {{REPLAY}} define defaults for configurable parameters rather than fixed values may make
sense (for both max jobs/interval and the interval period)
* The default of 1000ms for {{STRESS}} is too low
* Setting the interrupt flag is unnecessary if that is the condition for shutting down the
thread- as in {{StatsCollector.Collector::run}} and {{JobFactory::run}}. The only reason to
reset the flag is if the current context does not know how to handle the interrupt, but its
caller might and it cannot throw {{InterruptedException}}. Returning from {{run}} is correct
in this case.
* The {{StatListener}} interface, the new {{JobFactory}} subclasses, and {{StatsCollector}}
should be package-private. If {{JobFactory}} is made an abstract base, it may make sense to
make it public
* {{StatsCollector::join}} should respect the timeout provided, not wait forever
* This block in {{StressReaderThread::run}}:
{noformat}
+          synchronized (lock) {
+            while (loadStatus.isOverloaded) {
+              //Wait while JT is overloaded.
+              try {
+                lock.wait();
+              } catch (InterruptedException ie) {
+                Thread.currentThread().interrupt();
+              }
+            }
+          }
{noformat}
Does not seem correct. If interrupted while waiting, {{InterruptedException}} will be thrown,
caught, the flag reset, causing another exception to be thrown. If interrupted, the thread
should exit. A similar pattern is in {{StatsCollector.Collector::run}}
* As with {{SerialJobFactory}}, {{StressJobFactory}} should be submitting jobs to be run immediately,
not at the interval recorded in the trace. The submission time in the trace can be disregarded
entirely.
* {{StatsCollector::addClusterStatsObservers}} has no callers; how does one subscribe to the
statistics updates from the JT?
* {{LoadCalculator::checkLoadAndGetSlotsAvailable}} should use commons logging at debug level,
rather than {{System.out}}. Similarly, many of these statements are non-trivial to construct
and should be guarded by {{LOG.isDebugEnabled()}}
* Log message such as: {{LOG.error(ie.getCause(), ie)}} Should instead include a message and
the exception (which will also include its cause)
* The {{LoadCalculator}} methods can probably be part of {{StressJobFactory}} rather than
static methods in a separate class

> Allow customization of job submission policies
> ----------------------------------------------
>
>                 Key: MAPREDUCE-1408
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1408
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: contrib/gridmix
>            Reporter: rahul k singh
>         Attachments: 1408-1.patch
>
>
> Currently, gridmix3 replay job submission faithfully. For evaluation purposes, it would
be great if we can support other job submission policies such as sequential job submission,
or stress job submission.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


Mime
View raw message