hadoop-common-dev mailing list archives

From "Steve Loughran (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-3628) Add a lifecycle interface for Hadoop components: namenodes, job clients, etc.
Date Tue, 02 Sep 2008 21:39:44 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-3628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12627818#action_12627818 ]

Steve Loughran commented on HADOOP-3628:
----------------------------------------

One possibility is that you can ping() a node or a cluster to get its state, and that the response
is a list of one or more nodes and their health, where the health can include embedded exceptions.
A single-node health check would then return the machine-readable exception for that node;
ping an aggregate cluster and you get the list for all of its nodes.

Assuming that there are some standard IOExceptions for Hadoop-specific states (not enough
disk space, bad versions, etc.), and that everyone has synchronized class versions, this
would work. The HTML page front end would take this and generate something for people; the
raw operation would be something for machines to handle.
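
A minimal sketch of the ping() idea, assuming nothing about the attached patch: the names Pingable, NodeHealth, SingleNode and Cluster are hypothetical and are not Hadoop classes. A single node returns a one-element list that may carry an embedded IOException; a cluster concatenates the lists of its members.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical health record for one node; not a Hadoop class. */
final class NodeHealth {
    final String nodeName;
    final IOException failure; // null when the node is healthy

    NodeHealth(String nodeName, IOException failure) {
        this.nodeName = nodeName;
        this.failure = failure;
    }

    boolean isHealthy() {
        return failure == null;
    }
}

/** Hypothetical interface: anything that can report health for one or more nodes. */
interface Pingable {
    List<NodeHealth> ping();
}

/** A single node reports only on itself; the disk check is an illustrative stand-in. */
final class SingleNode implements Pingable {
    private final String name;
    private final long freeDiskBytes;
    private final long requiredDiskBytes;

    SingleNode(String name, long freeDiskBytes, long requiredDiskBytes) {
        this.name = name;
        this.freeDiskBytes = freeDiskBytes;
        this.requiredDiskBytes = requiredDiskBytes;
    }

    @Override
    public List<NodeHealth> ping() {
        IOException failure = freeDiskBytes < requiredDiskBytes
                ? new IOException("not enough disk space on " + name)
                : null;
        return Collections.singletonList(new NodeHealth(name, failure));
    }
}

/** An aggregate returns the health of every member in one list. */
final class Cluster implements Pingable {
    private final List<? extends Pingable> members;

    Cluster(List<? extends Pingable> members) {
        this.members = members;
    }

    @Override
    public List<NodeHealth> ping() {
        List<NodeHealth> all = new ArrayList<>();
        for (Pingable member : members) {
            all.addAll(member.ping());
        }
        return all;
    }
}

final class PingDemo {
    public static void main(String[] args) {
        Pingable cluster = new Cluster(Arrays.asList(
                new SingleNode("datanode-1", 500L, 100L),
                new SingleNode("datanode-2", 10L, 100L)));
        // A front end for people would render this list; a machine would inspect the exceptions.
        for (NodeHealth h : cluster.ping()) {
            System.out.println(h.nodeName + ": " + (h.isHealthy() ? "ok" : h.failure.getMessage()));
        }
    }
}
```

Because the failure travels as an exception object, this only works across processes if both sides have the same exception classes, which is the "synchronized class versions" caveat above.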


> Add a lifecycle interface for Hadoop components: namenodes, job clients, etc.
> -----------------------------------------------------------------------------
>
>                 Key: HADOOP-3628
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3628
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: dfs, mapred
>    Affects Versions: 0.19.0
>            Reporter: Steve Loughran
>            Assignee: Steve Loughran
>         Attachments: AbstractHadoopComponent.java, hadoop-3628.patch, hadoop-3628.patch,
hadoop-3628.patch, hadoop-3628.patch, hadoop-3628.patch, hadoop-3628.patch, hadoop-3628.patch,
hadoop-3628.patch, hadoop-3628.patch, hadoop-3628.patch, hadoop-3628.patch, hadoop-lifecycle.pdf,
hadoop-lifecycle.sxw
>
>
> I'd like to propose we have a standard interface for Hadoop components, the things that
> get started or stopped when you bring up a namenode. Currently, some of these classes have
> a stop() or shutdown() method, with no standard name or interface, and there is no way of seeing if they
> are live, checking their health, or shutting them down reliably. Indeed, there is a tendency
> for the spawned threads to not want to die, so that the entire process has to be killed to
> stop the workers.
> Having a standard interface would make the following easier:
>  * managing the different components from management tools
>  * monitoring the state of components
>  * subclassing
> The last of these is interesting, as right now TaskTracker and JobTracker start up threads in
> their constructor; that's very dangerous, as subclasses may have their methods called before
> they are fully initialised. Adding this interface would be the right time to clean up the startup
> process so that subclassing is less risky.
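
A minimal sketch of what such a lifecycle interface could look like, under stated assumptions: Lifecycle, AbstractComponent and SleepyTracker are hypothetical names and do not reproduce the attached AbstractHadoopComponent.java or the patch. The point it illustrates is that the constructor starts nothing; threads are created only in start(), after any subclass constructor has finished, and stop() interrupts and joins the worker.

```java
/** Hypothetical lifecycle contract; not the interface from the attached patch. */
interface Lifecycle {
    void start();
    void stop();
    boolean isLive();
}

abstract class AbstractComponent implements Lifecycle {
    enum State { CREATED, LIVE, STOPPED }

    private State state = State.CREATED;
    private Thread worker;

    /** The constructor deliberately starts no threads, so subclasses can finish initialising. */
    protected AbstractComponent() {
    }

    /** Subclasses put their work here; it should exit when interrupted. */
    protected abstract void runWorker() throws InterruptedException;

    @Override
    public synchronized void start() {
        if (state != State.CREATED) {
            throw new IllegalStateException("cannot start from state " + state);
        }
        worker = new Thread(() -> {
            try {
                runWorker();
            } catch (InterruptedException e) {
                // Interrupted by stop(): let the thread end.
            }
        }, getClass().getSimpleName() + "-worker");
        worker.setDaemon(true);
        worker.start();
        state = State.LIVE;
    }

    @Override
    public void stop() {
        Thread t;
        synchronized (this) {
            if (state == State.STOPPED) {
                return; // stopping twice is harmless
            }
            state = State.STOPPED;
            t = worker;
        }
        if (t != null) {
            t.interrupt();
            try {
                t.join(5000); // bounded wait so stop() cannot hang forever
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }

    @Override
    public synchronized boolean isLive() {
        return state == State.LIVE && worker != null && worker.isAlive();
    }

    synchronized State getState() {
        return state;
    }
}

/** Illustrative subclass whose worker idles until it is interrupted. */
final class SleepyTracker extends AbstractComponent {
    @Override
    protected void runWorker() throws InterruptedException {
        while (true) {
            Thread.sleep(50);
        }
    }
}

final class LifecycleDemo {
    public static void main(String[] args) {
        SleepyTracker tracker = new SleepyTracker();
        System.out.println("after construction, live: " + tracker.isLive());
        tracker.start();
        System.out.println("after start(), live: " + tracker.isLive());
        tracker.stop();
        System.out.println("after stop(), state: " + tracker.getState());
    }
}
```

Using an unchecked exception for illegal transitions is a choice made for brevity here; a real design might prefer IOException to match the rest of the Hadoop code base.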

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

