hadoop-yarn-issues mailing list archives

From "Jason Lowe (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-2942) Aggregated Log Files should be combined
Date Tue, 12 May 2015 18:26:02 GMT

    [ https://issues.apache.org/jira/browse/YARN-2942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14540407#comment-14540407
] 

Jason Lowe commented on YARN-2942:
----------------------------------

bq. Can you give some more details on this? Is it something you can share?

It's a hack to help mitigate the log aggregation namespace scaling issues on our large clusters.
Essentially it's a periodic process that runs an Oozie workflow, which does the following:

# determines which applications are good candidates for log archiving (i.e., many files
but a modest total size)
# runs a streaming job with a shell script that takes the list of applications to archive
as input
# for each application, runs a local-mode archive job to archive the log contents
# once the archive has been created, swaps out the application directory for a symlink
into the har archive
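
As an illustration of the first step only, candidate selection could look roughly like the sketch below. This is not the actual script; the `AppLogStats` record, the function name, and both thresholds are hypothetical and would need tuning per cluster.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class AppLogStats:
    """Per-application summary of an aggregated log directory (hypothetical record)."""
    app_id: str
    file_count: int
    total_bytes: int


# Hypothetical thresholds, not taken from the real workflow.
MIN_FILES = 50                   # "lots of files"
MAX_TOTAL_BYTES = 1 * 1024 ** 3  # "total size is not that big" (1 GiB)


def select_archive_candidates(
    stats: Iterable[AppLogStats],
    min_files: int = MIN_FILES,
    max_total_bytes: int = MAX_TOTAL_BYTES,
) -> List[str]:
    """Return IDs of applications with many log files but a small total size."""
    return [
        s.app_id
        for s in stats
        if s.file_count >= min_files and s.total_bytes <= max_total_bytes
    ]
```

Applications with few files gain little from archiving, and very large ones make the archive job expensive, which is why both bounds are applied together.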

The symlink makes the archive transparent to the readers.  Both the JHS and the "yarn logs"
command use FileContext and "just worked" with the symlink into the har without modifications.

So yes, we are running a MapReduce job to archive the logs, which itself will create more logs.
However, each archiving job processes many application logs.  If there is sufficient
interest we can pursue how to share it, but the script is specific to how we configure our
nodes and clusters and relies on unsupported symlinks.  I'm hoping the outcome of this JIRA
allows us to move away from the need for it.

bq. We'd have to implement your last bullet point to have the NMs serve the logs in the meantime,
as I don't think that's there today. 

That feature is indeed there today.  Links to the app logs on the NM will try to serve the
local app logs first, then redirect to the log server if the local logs are unavailable. 
See NMController and ContainerLogsPage.  It only becomes an issue when things link to the
aggregated log server directly before the NM has finished aggregating them.
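
The local-first-then-redirect behavior can be pictured with the minimal sketch below. It is not the NM code; the function name, the directory layout, and the log server URL format are all assumptions made for illustration.

```python
import os
from typing import Tuple


def resolve_container_logs(
    local_log_dir: str, log_server_url: str, container_id: str
) -> Tuple[str, str]:
    """Prefer local logs; fall back to a redirect to the log server.

    Returns ("local", path) when the container's log directory still exists
    on this node, otherwise ("redirect", url). The path and URL layouts here
    are hypothetical.
    """
    local_path = os.path.join(local_log_dir, container_id)
    if os.path.isdir(local_path):
        return ("local", local_path)
    return ("redirect", log_server_url.rstrip("/") + "/" + container_id)
```

A link that goes through this kind of check keeps working while aggregation is still in progress; a link that points straight at the log server does not have that fallback.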

> Aggregated Log Files should be combined
> ---------------------------------------
>
>                 Key: YARN-2942
>                 URL: https://issues.apache.org/jira/browse/YARN-2942
>             Project: Hadoop YARN
>          Issue Type: New Feature
>    Affects Versions: 2.6.0
>            Reporter: Robert Kanter
>            Assignee: Robert Kanter
>         Attachments: CombinedAggregatedLogsProposal_v3.pdf, CombinedAggregatedLogsProposal_v6.pdf,
CombinedAggregatedLogsProposal_v7.pdf, CompactedAggregatedLogsProposal_v1.pdf, CompactedAggregatedLogsProposal_v2.pdf,
ConcatableAggregatedLogsProposal_v4.pdf, ConcatableAggregatedLogsProposal_v5.pdf, YARN-2942-preliminary.001.patch,
YARN-2942-preliminary.002.patch, YARN-2942.001.patch, YARN-2942.002.patch, YARN-2942.003.patch
>
>
> Turning on log aggregation allows users to easily store container logs in HDFS and subsequently
view them in the YARN web UIs from a central place.  Currently, there is a separate log file
for each Node Manager.  This can be a problem for HDFS if you have a cluster with many nodes
as you’ll slowly start accumulating many (possibly small) files per YARN application.  The
current “solution” for this problem is to configure YARN (actually the JHS) to automatically
delete these files after some amount of time.  
> We should improve this by compacting the per-node aggregated log files into one log file
per application.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
