hadoop-mapreduce-issues mailing list archives

From "Ranjit Mathew (JIRA)" <j...@apache.org>
Subject [jira] Commented: (MAPREDUCE-2138) Gridmix tests with different time interval mr traces (1min, 3min and 5min).
Date Tue, 23 Nov 2010 08:19:14 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-2138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12934774#action_12934774 ]

Ranjit Mathew commented on MAPREDUCE-2138:
------------------------------------------

In this code-fragment from {{GridmixJobStory.java}}:
{code}
private Map<JobID,JobStory> jobstories;
private Map<JobID,ZombieJob> zombieJobs;

[...]

zombieJobs = buildJobStories();
jobstories = new HashMap<JobID,JobStory>();
Set<JobID> keys = zombieJobs.keySet();
Iterator <JobID> ite = keys.iterator();
while (ite.hasNext()) {
  JobID jobId = ite.next();
  jobstories.put(jobId, zombieJobs.get(jobId));
}
{code}
{{jobstories}} looks like the _same_ map as {{zombieJobs}}, as far as I can tell, and is therefore
redundant. (I had the same comment for the previous version of this patch.)
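
To illustrate the point, here is a minimal sketch of how the loop could collapse to a single copy (or disappear entirely, if callers can read {{zombieJobs}} directly). It assumes {{ZombieJob}} implements {{JobStory}}, which the {{put}} call in the fragment implies. The types below are hypothetical stand-ins, not the real Hadoop classes, and {{String}} keys stand in for {{JobID}}.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-ins for the Hadoop types named in the fragment.
interface JobStory {
    String getName();
}

final class ZombieJob implements JobStory {
    private final String name;

    ZombieJob(String name) {
        this.name = name;
    }

    @Override
    public String getName() {
        return name;
    }
}

final class JobStoryMaps {
    // Does what the keySet()/iterator() loop does: every key is mapped
    // to the very same value object, so the result equals the input map.
    static Map<String, JobStory> copyOf(Map<String, ? extends JobStory> zombieJobs) {
        return new HashMap<String, JobStory>(zombieJobs);
    }

    public static void main(String[] args) {
        Map<String, ZombieJob> zombieJobs = new HashMap<String, ZombieJob>();
        zombieJobs.put("job_1", new ZombieJob("first"));

        Map<String, JobStory> jobstories = copyOf(zombieJobs);
        System.out.println(jobstories.equals(zombieJobs));
    }
}
```

If a separate {{Map<JobID,JobStory>}} view is genuinely needed, the copy constructor keeps it to one line; otherwise the second field can probably go.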

Other comments:
# {{TestGridmixWith1minTrace}}, {{TestGridmixWith3minTrace}} and {{TestGridmixWith5minTrace}}
share a _lot_ of code and should be combined into a single class with different test-cases.
# {{GridmixJobVerification.convertToSecs()}} seems to have a bug - it should divide by 10^9
_not_ 10^10 when converting from nano-seconds to seconds.
# For a neat version of {{GridmixJobVerification.convertBytes()}}, check out [aioobe's answer
on Stack Overflow|http://stackoverflow.com/questions/3758606/how-to-convert-byte-size-into-human-readable-format-in-java/3758880#3758880].
# What does ??OVERALL?? mean as a job-status in {{GridmixJobVerification.convertJobStatus()}}?
# In {{GridmixJobVerification.getCounterValue()}}, shouldn't you be using the actual name
of the counter rather than the display-name? The display-name is liable to change according
to the whim of the developers. (Of course, the callers will also have to change accordingly.)
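
For the two conversion comments above, a rough sketch of what I have in mind. The class and method names here are hypothetical and not taken from the patch. The seconds conversion divides by 10^9 (or lets {{TimeUnit}} do it), and the byte formatter uses a log-based approach that is common for this task; it is not a copy of the linked answer.

```java
import java.util.Locale;
import java.util.concurrent.TimeUnit;

// Hypothetical helper class; names are illustrative only.
final class UnitConversions {
    // One second is 10^9 nanoseconds, not 10^10.
    static final long NANOS_PER_SECOND = 1000000000L;

    // Explicit division; truncates toward zero.
    static long nanosToSecs(long nanos) {
        return nanos / NANOS_PER_SECOND;
    }

    // Same result via the JDK, which avoids a hand-typed constant.
    static long nanosToSecsViaTimeUnit(long nanos) {
        return TimeUnit.NANOSECONDS.toSeconds(nanos);
    }

    // Binary (1024-based) human-readable size for non-negative byte counts.
    static String humanReadableBytes(long bytes) {
        final int unit = 1024;
        if (bytes < unit) {
            return bytes + " B";
        }
        // Index of the largest power of 1024 that fits into 'bytes'.
        int exp = (int) (Math.log(bytes) / Math.log(unit));
        char prefix = "KMGTPE".charAt(exp - 1);
        return String.format(Locale.ROOT, "%.1f %ciB", bytes / Math.pow(unit, exp), prefix);
    }

    public static void main(String[] args) {
        System.out.println(nanosToSecs(5000000000L));
        System.out.println(nanosToSecsViaTimeUnit(5000000000L));
        System.out.println(humanReadableBytes(1536L));
    }
}
```

Using {{TimeUnit}} would have made the 10^10 slip impossible, which is why I would lean towards it.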

> Gridmix tests with different time interval mr traces (1min, 3min and 5min).
> ---------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-2138
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2138
>             Project: Hadoop Map/Reduce
>          Issue Type: Task
>          Components: test
>            Reporter: Vinay Kumar Thota
>            Assignee: Vinay Kumar Thota
>         Attachments: MAPREDUCE-2138.patch, MAPREDUCE-2138.patch
>
>
> 1. Generate input data based on cluster size and create the synthetic jobs by using the 1 min folded MR trace and
> submit the jobs with below arguments.
> GRIDMIX_JOB_TYPE = LoadJob
> GRIDMIX_USER_RESOLVER = SubmitterUserResolver
> GRIDMIX_SUBMISSION_POLICY = STRESS
> Input Size = 400 MB * No. of nodes in cluster.
> TRACE_FILE = 1 min folded trace.
> Verify each job status and summary (QueueName, UserName, StartTime, FinishTime, maps, reducers and counters etc.) after
> completion of execution.
> 2. Generate input data based on cluster size and create the synthetic jobs by using the 3 min folded MR trace and
> submit the jobs with below arguments.
> GRIDMIX_JOB_TYPE = LoadJob
> GRIDMIX_USER_RESOLVER = RoundRobinUserResolver
> GRIDMIX_SUBMISSION_POLICY = Replay
> Input Size = 200 MB * No. of nodes in cluster.
> TRACE_FILE = 3 min folded trace.
> PROXY_USERS = proxy users file path.
> Verify each job status, submitted user and summary (QueueName, UserName, StartTime, FinishTime, maps, reducers and
> counters etc.) after completion of execution.
> 3. Generate input data based on cluster size and create the synthetic jobs by using the 5 min folded MR trace and
> submit the jobs with below arguments.
> GRIDMIX_JOB_TYPE = SleepJob
> GRIDMIX_USER_RESOLVER = EchoUserResolver
> GRIDMIX_MIN_FILE = 100 MB
> GRIDMIX_SUBMISSION_POLICY = Serial
> Input Size = 300 MB * No. of nodes in cluster.
> TRACE_FILE = 5 min folded trace.
> Verify each job status, file size and summary (QueueName, UserName, StartTime, FinishTime, maps, reducers and counters
> etc.) after completion of execution.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.