hadoop-common-dev mailing list archives

From "Doug Cutting (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-1313) JobInProgress should be public (or implement a public interface)
Date Wed, 02 May 2007 16:58:33 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-1313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12493146 ]

Doug Cutting commented on HADOOP-1313:
--------------------------------------

> So the only thing I care about is the actual number of running/completed tasks, which
> isn't in JobClient.

Isn't it?  The ClusterStatus includes the number of running map and reduce tasks.  The number
of completed tasks can be determined from getMapTaskReports() and getReduceTaskReports(),
or incrementally with getJob().getTaskCompletionEvents().  What's missing?
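As a rough illustration of the report-based approach, the sketch below counts completed tasks from a list of per-task reports. It does not call Hadoop: TaskReportStub is a hypothetical stand-in for whatever getMapTaskReports() and getReduceTaskReports() return, and treating a task as complete once its progress reaches 1.0 is an assumption made for this example.

```java
import java.util.Arrays;
import java.util.List;

public class CompletedTaskCount {

    /** Hypothetical stand-in for a per-task report; not the Hadoop class. */
    static final class TaskReportStub {
        final String taskId;
        final float progress; // fraction of work done, 0.0 to 1.0

        TaskReportStub(String taskId, float progress) {
            this.taskId = taskId;
            this.progress = progress;
        }
    }

    /** Counts reports whose progress has reached 1.0 (assumed to mean "completed"). */
    static int countCompleted(List<TaskReportStub> reports) {
        int completed = 0;
        for (TaskReportStub report : reports) {
            if (report.progress >= 1.0f) {
                completed++;
            }
        }
        return completed;
    }

    public static void main(String[] args) {
        // Sample data standing in for the reports of one job's map tasks.
        List<TaskReportStub> maps = Arrays.asList(
                new TaskReportStub("map_0", 1.0f),
                new TaskReportStub("map_1", 0.4f),
                new TaskReportStub("map_2", 1.0f));
        System.out.println("completed maps: " + countCompleted(maps));
    }
}
```

A real client would fetch the reports through JobClient and poll periodically, or consume completion events incrementally as mentioned above.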

> I'm also vaguely uncomfortable with the privileged access of the webapp/ to the JobTracker.

Me too.  I think the webapp should ideally use only JobClient to access the JobTracker.  Not
because I don't trust the webapp, but rather because I think that anything that's visible
in the webapp should be accessible to user programs.  Probably we should put the webapp in
a separate package.  We should also split JobTracker into a public launcher class and a package-private
class that implements the various protocols, so that the protocol implementation methods are not
publicly visible and do not tempt folks to call them directly.  But such cleanups are separate issues.
If you feel strongly enough, please file issues and submit patches for them.

> JobInProgress should be public (or implement a public interface)
> ----------------------------------------------------------------
>
>                 Key: HADOOP-1313
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1313
>             Project: Hadoop
>          Issue Type: Bug
>          Components: mapred
>            Reporter: Michael Bieniosek
>
> I'm trying to get programmatic access to hadoop job/task status through the JobTracker
> api.
> I notice that JobTracker returns a JobInProgress object in several public methods (runningJobs,
> getJob).  However, JobInProgress is a package-access class.  So, oddly, I can call JobTracker.getJob(),
> but I can't store the result as a JobInProgress (I suppose I could store it as an Object,
> but then I couldn't downcast it back).
> The JobInProgress object gives me useful information about jobs, so I don't think making
> runningJobs/getJob not public is a good idea.  I get the idea from HADOOP-28 that JobInProgress
> is not public because nobody wants to maintain compatibility in this class across hadoop versions.
> So it would probably be best if we created public interfaces that JobInProgress and TaskInProgress
> implement.  I only care about the accessors, so maybe from JobInProgress we could expose (getProfile,
> getStatus, get*Time, {finished,desired,running}{Maps,Reduces}, getMapTasks, getCounters) and
> from TaskInProgress (isRunning, isComplete, isFailed, isMapTask, numTaskFailures, numKilledTasks,
> getProgress, getCounters).
> Any thoughts?
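
The public-interface idea proposed in the issue could look roughly like the sketch below. All names here (JobInfo, JobInProgressStub, the getJob() helper) are hypothetical and only loosely modeled on the accessors listed above; this is not Hadoop code. The point is that callers hold a reference typed as the public interface while the implementing class stays non-public.

```java
public class JobStatusView {

    /** Hypothetical public, read-only view of a job; only accessors are exposed. */
    public interface JobInfo {
        int desiredMaps();
        int runningMaps();
        int finishedMaps();
    }

    /** Stand-in for the non-public implementation class; callers never name this type. */
    static final class JobInProgressStub implements JobInfo {
        private final int desired;
        private final int running;
        private final int finished;

        JobInProgressStub(int desired, int running, int finished) {
            this.desired = desired;
            this.running = running;
            this.finished = finished;
        }

        @Override public int desiredMaps() { return desired; }
        @Override public int runningMaps() { return running; }
        @Override public int finishedMaps() { return finished; }
    }

    /** Returns the public interface type, so the result can be stored and used by any caller. */
    public static JobInfo getJob() {
        return new JobInProgressStub(10, 3, 5);
    }

    public static void main(String[] args) {
        JobInfo job = getJob();
        System.out.println(job.finishedMaps() + " of " + job.desiredMaps()
                + " maps finished, " + job.runningMaps() + " running");
    }
}
```

Because only the interface is public, the implementing class can change between versions without breaking callers, which addresses the compatibility concern raised in the issue.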

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

