hadoop-yarn-issues mailing list archives

From "Jason Lowe (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-6875) New aggregated log file format for YARN log aggregation.
Date Mon, 31 Jul 2017 14:41:02 GMT

    [ https://issues.apache.org/jira/browse/YARN-6875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16107375#comment-16107375 ]

Jason Lowe commented on YARN-6875:
----------------------------------

To be clear, I'm not a fan of the current approach for partial aggregations that generates
a separate file per pass.  I think we're all in agreement that partial aggregations should
not result in multiple files after the operation completes.  I'm just proposing a way to avoid
any additional files, even transient, during partial aggregation.  We already need some kind
of marker for the metainfo block so the reader can know with certainty it has found a proper
metainfo block, otherwise the race condition I pointed out above will result in undefined
behavior for the reader.  I'm proposing we leverage this marker so we can avoid the need for
a transient index file.

bq. However, if we don't write the (temp) index file, and the approach listed in Jason's comment
will make read become very slow since it need to repeatedly find where's the last successful
write. And the worst part is, we only need to read logs when app fails or slow, it will be
likely that we will read such app logs for a couple of times. I don't think it will be a good
user experience to do this every-time.

Quite a few important points to note here:
# The read scan won't be as slow as it is today.  Today it has to decompress each block in
order to locate the next block.  The scan for the metainfo marker would not require any decompression,
just a straight read.
# The read scan today must start from the beginning of the file, so it has to read (and decompress!)
the worst-case amount of data to find logs at the end.  For the metainfo scan we only need
to scan from the end of the file to the first metainfo block we find.  That means, worst-case,
we're only going to read (without decompressing) the amount of data for the last append operation
currently in progress to locate any log in the file.
# The read scan only needs to occur when we are trying to read during an append operation.
This will only be a repeating process if the append operation is still ongoing when we try
to do subsequent reads.

I would argue this scan is going to be much faster than you are assuming, and we only need
to perform it when there is an ongoing append.  What is the anticipated duty cycle of append
operations?  How likely will the repeated read scan scenario occur in practice, and to a point
where the scan is not fast enough?
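
To make the cost of that scan concrete, here is a minimal sketch of a backward search for a marker, written in Python for brevity rather than as proposed patch code. The marker bytes, the function name, and the chunk size are all hypothetical; the actual marker and metainfo layout are not defined in this thread. The sketch only locates a candidate offset with plain reads and no decompression; a real reader would still need to validate that the bytes at that offset form a proper metainfo block.

```python
import io

# Hypothetical marker bytes; the real format's marker is not specified here.
METAINFO_MARKER = b"\x00METAINFO\x00"


def find_last_marker(stream, marker=METAINFO_MARKER, chunk_size=64 * 1024):
    """Return the offset of the last occurrence of marker in stream, or -1.

    Reads fixed-size chunks from the end of the stream toward the start and
    stops at the first chunk that contains the marker, so the amount read is
    bounded by the data written after the last marker.
    """
    stream.seek(0, io.SEEK_END)
    pos = stream.tell()
    # Leading bytes of the region already scanned, kept so that a marker
    # straddling two chunks is still found.
    tail = b""
    while pos > 0:
        start = max(0, pos - chunk_size)
        stream.seek(start)
        chunk = stream.read(pos - start)
        window = chunk + tail
        idx = window.rfind(marker)
        if idx != -1:
            return start + idx
        # Keep at most len(marker) - 1 bytes: enough to complete a marker
        # that begins in the next (earlier) chunk, too few to hold a whole one.
        tail = window[: len(marker) - 1]
        pos = start
    return -1
```

With a file opened in binary mode, `find_last_marker(f)` would give the reader a starting point for the most recent completed write; everything after that offset belongs to an append that may still be in progress.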

bq. what's the percentage of apps running in your cluster which enabled partial log aggregation?

We currently do not have any partial aggregations enabled in our clusters.  The number of
additional files it creates today is one of the obstacles to enabling it, but as we see longer
and longer running apps on our clusters we will eventually need a partial aggregation solution.
Hopefully we're in agreement that no transient index file should be created during a normal
log aggregation, and we're only debating what to do for partial aggregations.

> New aggregated log file format for YARN log aggregation.
> --------------------------------------------------------
>
>                 Key: YARN-6875
>                 URL: https://issues.apache.org/jira/browse/YARN-6875
>             Project: Hadoop YARN
>          Issue Type: New Feature
>            Reporter: Xuan Gong
>            Assignee: Xuan Gong
>         Attachments: YARN-6875-NewLogAggregationFormat-design-doc.pdf
>
>
> T-file is the underlying log format for the aggregated logs in YARN. We have seen several
> performance issues, especially for very large log files.
> We will introduce a new log format which has better performance for large log files.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


