hadoop-yarn-dev mailing list archives

From "Varun Saxena (JIRA)" <j...@apache.org>
Subject [jira] [Created] (YARN-3850) Container logs can be lost if disk is full
Date Thu, 25 Jun 2015 21:29:04 GMT
Varun Saxena created YARN-3850:

             Summary: Container logs can be lost if disk is full
                 Key: YARN-3850
                 URL: https://issues.apache.org/jira/browse/YARN-3850
             Project: Hadoop YARN
          Issue Type: Bug
          Components: log-aggregation
    Affects Versions: 2.7.0
            Reporter: Varun Saxena
            Assignee: Varun Saxena
            Priority: Blocker

*Container logs* can be lost if a disk has been marked bad because it has become 90% full.
When an application finishes, we upload its logs after aggregation by calling {{AppLogAggregatorImpl#uploadLogsForContainers}}.
But this call in turn looks up the eligible directories via {{LocalDirsHandlerService#getLogDirs}},
which returns nothing when the disk is full. So none of the container logs are aggregated
and uploaded.
But on application finish, we also call {{AppLogAggregatorImpl#doAppLogAggregationPostCleanUp()}}.
This deletes the application directory that contains the container logs, because it calls
{{LocalDirsHandlerService#getLogDirsForCleanup}}, which returns the full disks as well.
So we are left with neither the aggregated logs nor the individual container logs
for the app.
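
The sequence above can be illustrated with a small standalone sketch. This is not the NodeManager code: {{LogLossSketch}}, {{LogDir}} and the methods below are hypothetical stand-ins that only model the two directory lookups as described in this report (one that skips full disks, one that includes them), assuming a single log directory that is full.

```java
import java.util.ArrayList;
import java.util.List;

public class LogLossSketch {
    // Hypothetical stand-in for one NodeManager log directory.
    static final class LogDir {
        final String path;
        final boolean full;                       // true once the disk is considered full
        final List<String> containerLogs = new ArrayList<>();

        LogDir(String path, boolean full) {
            this.path = path;
            this.full = full;
        }
    }

    // Models the role of getLogDirs as described above: full disks are skipped.
    static List<LogDir> logDirs(List<LogDir> all) {
        List<LogDir> good = new ArrayList<>();
        for (LogDir d : all) {
            if (!d.full) {
                good.add(d);
            }
        }
        return good;
    }

    // Models the role of getLogDirsForCleanup as described above: full disks are included.
    static List<LogDir> logDirsForCleanup(List<LogDir> all) {
        return new ArrayList<>(all);
    }

    // Upload step: only sees logs in directories returned by logDirs().
    static List<String> upload(List<LogDir> all) {
        List<String> uploaded = new ArrayList<>();
        for (LogDir d : logDirs(all)) {
            uploaded.addAll(d.containerLogs);
        }
        return uploaded;
    }

    // Post-aggregation cleanup: deletes logs in every directory, full or not.
    static void postCleanUp(List<LogDir> all) {
        for (LogDir d : logDirsForCleanup(all)) {
            d.containerLogs.clear();
        }
    }

    // Counts the container log files still present on local disks.
    static int remaining(List<LogDir> all) {
        int count = 0;
        for (LogDir d : all) {
            count += d.containerLogs.size();
        }
        return count;
    }

    public static void main(String[] args) {
        List<LogDir> dirs = new ArrayList<>();
        LogDir fullDisk = new LogDir("/disk1/logs", true);   // hypothetical path
        fullDisk.containerLogs.add("container_01/stderr");
        dirs.add(fullDisk);

        List<String> uploaded = upload(dirs);
        postCleanUp(dirs);

        System.out.println("uploaded=" + uploaded.size() + " remaining=" + remaining(dirs));
    }
}
```

In this model the upload step and the cleanup step disagree about which directories exist, which is the mismatch the report describes. One possible direction, offered only as a suggestion, is to have both steps consult the same directory list.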

This message was sent by Atlassian JIRA
