hadoop-mapreduce-issues mailing list archives

From "Tom White (JIRA)" <j...@apache.org>
Subject [jira] Commented: (MAPREDUCE-777) A method for finding and tracking jobs from the new API
Date Wed, 16 Sep 2009 10:34:58 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12755973#action_12755973 ]

Tom White commented on MAPREDUCE-777:
-------------------------------------

* The number of classes in the old org.apache.hadoop.mapred package is very large and daunting
for users of MapReduce. We should only add classes to the new org.apache.hadoop.mapreduce
if they are a part of the core public API for MapReduce. Internal classes with public visibility
belong in another package. On this basis I would suggest moving (by analogy with HDFS packaging):
** CLI to org.apache.hadoop.mapreduce.tools
** ClientProtocol to org.apache.hadoop.mapreduce.protocol
* The Job constructors should be changed to be static factory methods to make Job submission
more flexible in future (see https://issues.apache.org/jira/browse/MAPREDUCE-777?focusedCommentId=12746014&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12746014)
* Some of the value object classes have setters even though their state should only be read
by user code. These are: JobStatus, QueueAclsInfo, QueueInfo, TaskCompletionEvent, TaskReport.
These should be made immutable, or have package-private or protected setters.
* Cluster has a mixture of methods that return arrays and those that return collections. Can
we change them all to be consistent (preferably collections)?
* Rename Cluster#getTTExpiryInterval() to the more readable getTaskTrackerExpiryInterval().
* ClusterMetrics#getDeccommisionTrackers() is misspelled and should be getDecommissionedTaskTrackers().
Similarly change the instance variable numDecommisionedTrackers to numDecommissionedTrackers
(double 's'). In fact, the get*TaskTrackers() methods would be better called get*TaskTrackerCount()
since they don't return tasktracker objects, but a count of those objects.
* CLI's usage string refers to JobClient.
* JobStatus's javadoc refers to JobProfile, which is in the mapred package so we probably
don't want to refer to it.
* All public classes need javadoc to explain their role. 
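
As a rough illustration of three of the points above (static factory methods instead of public constructors, value objects whose setters are not visible to user code, and collection-returning query methods), here is a minimal sketch. All class and method names in it (SketchJob, SketchJobStatus, SketchCluster, getInstance, getAllJobStatuses) are hypothetical stand-ins invented for this example and are not the classes in the patch.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for a value object such as JobStatus.
// User code can read it; only code in the same package can change it.
final class SketchJobStatus {
    private final String jobId;
    private float mapProgress;

    SketchJobStatus(String jobId) {
        this.jobId = jobId;
    }

    public String getJobId() {
        return jobId;
    }

    public float getMapProgress() {
        return mapProgress;
    }

    // Package-private setter: not part of the public API.
    void setMapProgress(float mapProgress) {
        this.mapProgress = mapProgress;
    }
}

// Hypothetical stand-in for Job, created through a static factory.
final class SketchJob {
    private final String name;

    // Private constructor: callers must go through the factory, so the
    // way instances are created can change later without breaking them.
    private SketchJob(String name) {
        this.name = name;
    }

    public static SketchJob getInstance(String name) {
        return new SketchJob(name);
    }

    public String getName() {
        return name;
    }
}

// Hypothetical stand-in for Cluster, returning a collection, not an array.
final class SketchCluster {
    private final List<SketchJobStatus> statuses = new ArrayList<>();

    // Package-private: only framework code registers statuses.
    void register(SketchJobStatus status) {
        statuses.add(status);
    }

    // Read-only view, so callers cannot modify the cluster's internal list.
    public List<SketchJobStatus> getAllJobStatuses() {
        return Collections.unmodifiableList(statuses);
    }
}

class ApiSketchDemo {
    public static void main(String[] args) {
        SketchJob job = SketchJob.getInstance("wordcount");
        SketchCluster cluster = new SketchCluster();
        SketchJobStatus status = new SketchJobStatus("job_0001");
        status.setMapProgress(0.5f); // allowed here only because we share a package
        cluster.register(status);
        System.out.println(job.getName() + " " + cluster.getAllJobStatuses().size());
    }
}
```

In this sketch, code outside the package could call getInstance() and the getters but could not call setMapProgress() or add to the list returned by getAllJobStatuses().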

> A method for finding and tracking jobs from the new API
> -------------------------------------------------------
>
>                 Key: MAPREDUCE-777
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-777
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>          Components: client
>            Reporter: Owen O'Malley
>            Assignee: Amareshwari Sriramadasu
>             Fix For: 0.21.0
>
>         Attachments: m-777.patch, patch-777-1.txt, patch-777-2.txt, patch-777-3.txt,
>                      patch-777-4.txt, patch-777-5.txt, patch-777-6.txt, patch-777-7.txt,
>                      patch-777-8.txt, patch-777-9.txt, patch-777.txt
>
>
> We need to create a replacement interface for the JobClient API in the new interface.
> In particular, the user needs to be able to query and track jobs that were launched by other
> processes.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

