hadoop-yarn-issues mailing list archives

From "Karthik Kambatla (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-321) Generic application history service
Date Mon, 15 Jul 2013 23:46:49 GMT

    [ https://issues.apache.org/jira/browse/YARN-321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13709192#comment-13709192 ]

Karthik Kambatla commented on YARN-321:

A few other considerations:

bq. Running as service: By default, ApplicationHistoryService will be embedded inside ResourceManager
but will be independent enough to run as a separate service for scaling purposes.
Is there a reason to embed this inside the RM? I don't know whether there were reasons for the
JHS to be separate other than it being MR-specific. If there were, embedding the AHS would go
against them, no?
That said, I agree it will be easier for the user if the AHS starts along with the RM. Maybe
that should be configurable and turned on by default?
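
As a rough sketch of the "configurable and turned on by default" option: the property name below is hypothetical (it is not an actual YARN configuration key), and plain java.util.Properties stands in for the Hadoop configuration object so that the snippet is self-contained.

```java
import java.util.Properties;

/** Hypothetical sketch: decide whether the AHS starts inside the RM from a config flag. */
final class AhsLaunchMode {
    // Assumed property name for illustration only; not a real YARN key.
    static final String EMBED_KEY = "yarn.ahs.embedded-in-rm.enabled";

    /** Returns true when the AHS should be embedded in the RM; defaults to true when unset. */
    static boolean embeddedInRm(Properties conf) {
        return Boolean.parseBoolean(conf.getProperty(EMBED_KEY, "true"));
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        System.out.println("embedded by default: " + embeddedInRm(conf));

        // An operator who wants a standalone AHS (e.g. for scaling) flips the flag.
        conf.setProperty(EMBED_KEY, "false");
        System.out.println("embedded after override: " + embeddedInRm(conf));
    }
}
```

With a default of true, users get the AHS without extra setup, while larger deployments can still opt out and run it as a separate service.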

bq. Hosting/serving per-framework data is out of scope for this JIRA. 
Understood, and I agree it makes sense not to complicate it. However, during the design, it would
be nice to outline (at least at a high level) how the "plugins" can work. For the plugins
to serve application-specific information (e.g. MapReduce counters), I suspect the RM should
write this information in addition to the generic YARN information about that application.
On completion, can we leave a provision for the AM to write a JSON blob (maybe via the RM) to
{{HistoryStorage}}? In the AHS, can we leave a provision for app-"plugins" to access and use this
information to render application specifics?
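
To make the two provisions concrete, here is a minimal sketch. Everything in it is an assumption for illustration: the in-memory class merely stands in for {{HistoryStorage}}, the method names are invented, and a plugin is modelled as a plain function from the stored JSON string to rendered text. None of these are existing YARN APIs.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.function.Function;

/** Hypothetical sketch of the AM-writes-blob / plugin-reads-blob provision. */
final class HistoryPluginSketch {

    /** Stand-in for HistoryStorage: keeps one opaque JSON blob per application id. */
    static final class InMemoryHistoryStorage {
        private final Map<String, String> blobs = new HashMap<>();

        /** Called on application completion (by the AM, possibly via the RM). */
        void putAppBlob(String appId, String json) {
            blobs.put(appId, json);
        }

        Optional<String> getAppBlob(String appId) {
            return Optional.ofNullable(blobs.get(appId));
        }
    }

    /**
     * AHS side: look up the blob and hand it to the plugin registered for the
     * application type. Without a plugin, fall back to showing the raw JSON.
     */
    static String renderSpecifics(InMemoryHistoryStorage storage,
                                  Map<String, Function<String, String>> plugins,
                                  String appId, String appType) {
        Optional<String> blob = storage.getAppBlob(appId);
        if (!blob.isPresent()) {
            return "(no application-specific data)";
        }
        Function<String, String> plugin = plugins.get(appType);
        return plugin == null ? blob.get() : plugin.apply(blob.get());
    }

    public static void main(String[] args) {
        InMemoryHistoryStorage storage = new InMemoryHistoryStorage();
        // Made-up application id and counter value, for illustration only.
        storage.putAppBlob("app_1", "{\"counters\":{\"MAP_INPUT_RECORDS\":42}}");

        Map<String, Function<String, String>> plugins = new HashMap<>();
        plugins.put("MAPREDUCE", json -> "MapReduce view of " + json);

        System.out.println(renderSpecifics(storage, plugins, "app_1", "MAPREDUCE"));
        System.out.println(renderSpecifics(storage, plugins, "app_1", "UNKNOWN_TYPE"));
    }
}
```

Keeping the blob opaque to the AHS means the generic service never has to understand per-framework data; only the plugin for that application type interprets it.
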
> Generic application history service
> -----------------------------------
>                 Key: YARN-321
>                 URL: https://issues.apache.org/jira/browse/YARN-321
>             Project: Hadoop YARN
>          Issue Type: Improvement
>            Reporter: Luke Lu
>            Assignee: Vinod Kumar Vavilapalli
> The mapreduce job history server currently needs to be deployed as a trusted server in
> sync with the mapreduce runtime. Every new application would need a similar application history
> server. Having to deploy O(T*V) (where T is number of type of application, V is number of
> version of application) trusted servers is clearly not scalable.
> Job history storage handling itself is pretty generic: move the logs and history data
> into a particular directory for later serving. Job history data is already stored as json
> (or binary avro). I propose that we create only one trusted application history server, which
> can have a generic UI (display json as a tree of strings) as well. Specific application/version
> can deploy untrusted webapps (a la AMs) to query the application history server and interpret
> the json for its specific UI and/or analytics.
