hadoop-yarn-issues mailing list archives

From "Robert Kanter (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (YARN-2942) Aggregated Log Files should be compacted
Date Wed, 11 Feb 2015 01:23:13 GMT

     [ https://issues.apache.org/jira/browse/YARN-2942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Kanter updated YARN-2942:
--------------------------------
    Attachment: YARN-2942.003.patch

YARN-2942.003.patch fixes some minor problems I found when dealing with logs for
long-running applications:
- The JHS would correctly display the logs, but would also show a message saying that they
couldn't be found.
- The NM wasn't trying to compact the logs of long-running applications (which is expected),
but it was dumping an ugly error message about it to its own log.  It now checks that the
"normal" aggregated log file exists before trying to read it, which prevents that.  I also
made it so that it won't even try to get the lock if its aggregated file is not there, which
avoids taking the lock needlessly.
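
To illustrate the ordering described in the second item (existence check first, lock only afterwards), here is a minimal sketch.  It is not the code from the patch: the class name, the Outcome values, and the use of java.nio.file and java.util.concurrent.locks as stand-ins for HDFS and for whatever lock the NM really takes are all assumptions made for illustration.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.concurrent.locks.Lock;

/** Hypothetical sketch; not taken from YARN-2942.003.patch. */
final class CompactionGuard {
    enum Outcome { SKIPPED_NO_AGGREGATED_FILE, SKIPPED_LOCK_BUSY, COMPACTED }

    // Stand-in for the real lock; the actual locking mechanism is not shown in this message.
    private final Lock lock;

    CompactionGuard(Lock lock) {
        this.lock = lock;
    }

    Outcome tryCompact(Path aggregatedLog) throws IOException {
        // Check for the "normal" aggregated file before doing anything else.
        // Long-running applications may not have one, and that is not an error.
        if (!Files.isRegularFile(aggregatedLog)) {
            return Outcome.SKIPPED_NO_AGGREGATED_FILE; // the lock is never touched
        }
        if (!lock.tryLock()) {
            return Outcome.SKIPPED_LOCK_BUSY;
        }
        try {
            // Placeholder for the real compaction work: here we only read the file.
            Files.readAllBytes(aggregatedLog);
            return Outcome.COMPACTED;
        } finally {
            lock.unlock();
        }
    }
}
```

Against HDFS the existence check would presumably go through org.apache.hadoop.fs.FileSystem#exists rather than java.nio.file.Files, but the order of the two steps is the point here.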

> Aggregated Log Files should be compacted
> ----------------------------------------
>
>                 Key: YARN-2942
>                 URL: https://issues.apache.org/jira/browse/YARN-2942
>             Project: Hadoop YARN
>          Issue Type: New Feature
>    Affects Versions: 2.6.0
>            Reporter: Robert Kanter
>            Assignee: Robert Kanter
>         Attachments: CompactedAggregatedLogsProposal_v1.pdf, CompactedAggregatedLogsProposal_v2.pdf,
> YARN-2942-preliminary.001.patch, YARN-2942-preliminary.002.patch, YARN-2942.001.patch, YARN-2942.002.patch,
> YARN-2942.003.patch
>
>
> Turning on log aggregation allows users to easily store container logs in HDFS and subsequently
> view them in the YARN web UIs from a central place.  Currently, there is a separate log file
> for each Node Manager.  This can be a problem for HDFS if you have a cluster with many nodes
> as you’ll slowly start accumulating many (possibly small) files per YARN application.  The
> current “solution” for this problem is to configure YARN (actually the JHS) to automatically
> delete these files after some amount of time.
> We should improve this by compacting the per-node aggregated log files into one log file
> per application.
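
As a rough illustration of what "one log file per application" could mean, the sketch below appends each per-node file to a single file and records where each node's bytes landed.  The on-disk layout, the in-memory index, and every name in it are assumptions made for illustration; the actual format is whatever the attached proposal and patches define, and local files stand in for HDFS.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of compacting per-node log files into one file. */
final class LogCompactor {
    /**
     * Appends every per-node file to {@code compacted} and returns an index
     * mapping each node file name to {offset, length} within the compacted file.
     */
    static Map<String, long[]> compact(List<Path> perNodeLogs, Path compacted)
            throws IOException {
        Map<String, long[]> index = new LinkedHashMap<>();
        // Start after any bytes already present so that repeated calls append.
        long offset = Files.exists(compacted) ? Files.size(compacted) : 0L;
        try (OutputStream out = Files.newOutputStream(
                compacted, StandardOpenOption.CREATE, StandardOpenOption.APPEND)) {
            for (Path nodeLog : perNodeLogs) {
                long length = Files.copy(nodeLog, out); // number of bytes copied
                index.put(nodeLog.getFileName().toString(), new long[] {offset, length});
                offset += length;
            }
        }
        return index;
    }
}
```

With an index like this, a reader can seek straight to one node's section instead of opening a separate small file per Node Manager.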



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
