hadoop-yarn-issues mailing list archives

From "Siddharth Seth (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-5193) For long running services, aggregate logs when a container completes instead of when the app completes
Date Thu, 02 Jun 2016 21:37:59 GMT

    [ https://issues.apache.org/jira/browse/YARN-5193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15313121#comment-15313121 ]

Siddharth Seth commented on YARN-5193:

bq. I don't think long-running necessarily means low container churn, although I'm sure it
does for the use-case you have in mind. For example, an app-as-service that farms out work
as containers on YARN and runs forever. High load with short work duration for such a service
= high container churn but it never exits.
Fair point. I'm guessing this would end up being implemented as a parameter in the API,
rather than a blanket 'long-running=aggregate after container complete'.

bq. Periodic aggregation would be more palatable for such a use-case. Also log-aggregation
duration is not guaranteed. Even if we aggregate as the container completes there's no guarantee
how long it will take, so any client that wants to see the logs in HDFS just as containers
complete has to handle fetching it from the nodes in the worst-case scenario or retrying until
it's available.
There would definitely still be a time window where the container has completed and the
log hasn't yet been aggregated. It'll likely be a little shorter than with aggregation on
a fixed time window - if that's worth anything.
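The client-side handling described above (retry until the aggregated log shows up, otherwise fetch from the node) could look roughly like the sketch below. It is only an illustration: `fetch_aggregated` and `fetch_from_node` are hypothetical callables supplied by the caller, not YARN APIs, and the attempt count and delay are arbitrary.

```python
import time
from typing import Callable, Optional


def fetch_container_log(
    fetch_aggregated: Callable[[], Optional[str]],
    fetch_from_node: Callable[[], Optional[str]],
    attempts: int = 5,
    delay_seconds: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[str]:
    """Poll the aggregated location a few times, then fall back to the node.

    Both fetchers are hypothetical: each returns the log text, or None if
    the log is not available from that source.
    """
    for attempt in range(attempts):
        log = fetch_aggregated()
        if log is not None:
            return log
        # Wait between polls, but not after the final failed attempt.
        if attempt < attempts - 1:
            sleep(delay_seconds)
    # Worst case: aggregation has not finished, so ask the node directly.
    return fetch_from_node()
```

Aggregating at container completion would not remove the need for this loop; it would only tend to shorten how long the loop runs.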

The main problem seems to be discovering these dead containers, and where they ran. ATS/AHS
would have been ideal, but can't really be enabled on a reasonably sized cluster to log container information.
Maybe log-aggregation can write out indexing information up front - so that the CLI can at
least find all containers / the node where containers ran.
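As a rough illustration of the indexing idea, the sketch below writes one record per container as soon as its placement is known, so a tool can later map containers to nodes without talking to YARN. The JSON-lines layout, the field names, and the example IDs are all assumptions made for this sketch; nothing here reflects an actual YARN log-aggregation format.

```python
import json
from typing import Dict, Iterable, List, Tuple


def build_index_lines(placements: Iterable[Tuple[str, str]]) -> List[str]:
    """Serialize (container_id, node) pairs as one JSON record per line.

    The record layout is hypothetical, chosen only for illustration.
    """
    return [
        json.dumps({"container": container_id, "node": node})
        for container_id, node in placements
    ]


def load_index(lines: Iterable[str]) -> Dict[str, str]:
    """Rebuild the container -> node mapping from the index records."""
    index: Dict[str, str] = {}
    for line in lines:
        entry = json.loads(line)
        index[entry["container"]] = entry["node"]
    return index


# Example with made-up container IDs and node addresses.
lines = build_index_lines([
    ("container_0001", "node-a:8041"),
    ("container_0002", "node-b:8041"),
])
index = load_index(lines)
```

With such an index available up front, a CLI could at least list every container of an app and the node it ran on, even before that container's logs have been aggregated.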

> For long running services, aggregate logs when a container completes instead of when
the app completes
> ------------------------------------------------------------------------------------------------------
>                 Key: YARN-5193
>                 URL: https://issues.apache.org/jira/browse/YARN-5193
>             Project: Hadoop YARN
>          Issue Type: Improvement
>            Reporter: Siddharth Seth
> For a long running service, containers will typically not complete very often. However,
when a container completes - it would be useful to aggregate the logs right then, instead
of waiting for the app to complete.
> This will allow the command line log tool to look up containers for an app from the log
file index itself, instead of having to go and talk to YARN. Talking to YARN really only works
if ATS is enabled, and YARN is configured to publish container information to ATS (That may
not always be the case - since this can overload ATS quite fast).
> There are some added benefits, like cleaning out local disk space early, instead of waiting
till the app completes. (There's probably a separate jira somewhere about cleanup of container
for long running services anyway)
> cc [~vinodkv], [~xgong]
