hadoop-yarn-issues mailing list archives

From "Mit Desai (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-2240) yarn logs can get corrupted if the aggregator does not have permissions to the log file it tries to read
Date Wed, 02 Jul 2014 19:14:24 GMT

    [ https://issues.apache.org/jira/browse/YARN-2240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14050580#comment-14050580 ]

Mit Desai commented on YARN-2240:
---------------------------------

Aggregated Logs Comment

[~vinodkv], here is the error on which it fails.

{noformat}
2014-06-10 22:06:34,940 [LogAggregationService #1922] ERROR logaggregation.AggregatedLogFormat: Error aggregating log file. Log file : /grid/0/tmp/yarn-logs/application_1401475649625_135179/container_1401475649625_135179_01_000001/history.txt.appattempt_1401475649625_135179_000001/grid/0/tmp/yarn-logs/application_1401475649625_135179/container_1401475649625_135179_01_000001/history.txt.appattempt_1401475649625_135179_000001 (Permission denied)
java.io.FileNotFoundException: /grid/0/tmp/yarn-logs/application_1401475649625_135179/container_1401475649625_135179_01_000001/history.txt.appattempt_1401475649625_135179_000001 (Permission denied)
         at java.io.FileInputStream.open(Native Method)
         at java.io.FileInputStream.<init>(FileInputStream.java:138)
         at org.apache.hadoop.io.SecureIOUtils.forceSecureOpenForRead(SecureIOUtils.java:215)
         at org.apache.hadoop.io.SecureIOUtils.openForRead(SecureIOUtils.java:204)
         at org.apache.hadoop.yarn.logaggregation.AggregatedLogFormat$LogValue.write(AggregatedLogFormat.java:196)
         at org.apache.hadoop.yarn.logaggregation.AggregatedLogFormat$LogWriter.append(AggregatedLogFormat.java:311)
         at org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl.uploadLogsForContainer(AppLogAggregatorImpl.java:130)
         at org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl.doAppLogAggregation(AppLogAggregatorImpl.java:166)
         at org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl.run(AppLogAggregatorImpl.java:140)
         at org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService$2.run(LogAggregationService.java:354)
         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
         at java.lang.Thread.run(Thread.java:722)
{noformat}


I managed to get into the logs and found that the length reported for this log was about 111K.
The portion of the aggregated logs where the problem occurs is shown here.
{noformat}
[...]
LogType: history.txt.appattempt_1401475649625_135179_000001
LogLength: 111686
Log Contents:
Error aggregating log file. Log file :
/grid/0/tmp/yarn-logs/application_1401475649625_135179/container_1401475649625_135179_01_000001/history.txt.appattempt_1401475649625_135179_000001/grid/0/tmp/yarn-logs/application_1401475649625_135179/container_1401475649625_135179_01_000001/history.txt.appattempt_1401475649625_135179_000001
(Permission
denied)stderr0!stderr_dag_1401475649625_135179_10&stderr_dag_1401475649625_135179_1_post0stdout0!stdout_dag_1401475649625_135179_10&stdout_dag_1401475649625_135179_1_post0syslog102042014-06-10
22:05:58,519 INFO [main] org.apache.tez.dag.app.DAGAppMaster: Created
DAGAppMaster for application appattempt_1401475649625_135179_000001
[...]
{noformat}
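
To illustrate why the entries after the error message run together in the excerpt, here is a minimal sketch in Python. It assumes a simplified length-prefixed record layout (name, declared length, payload); this is not the real AggregatedLogFormat on-disk format, and `write_entry`, `read_entry`, `build_stream` and `read_all` are hypothetical names used only for this illustration. The writer records the on-disk length first and then writes a short error message in place of the contents, so a reader that trusts the declared length consumes the entries that follow.

```python
import io
import struct


def write_entry(out, name, declared_length, payload):
    """Write one record: name, declared length, then the payload bytes."""
    name_bytes = name.encode("utf-8")
    out.write(struct.pack(">H", len(name_bytes)))
    out.write(name_bytes)
    out.write(struct.pack(">Q", declared_length))
    out.write(payload)


def read_entry(inp):
    """Read one record, trusting the declared length. Returns None at EOF."""
    header = inp.read(2)
    if len(header) < 2:
        return None
    (name_len,) = struct.unpack(">H", header)
    name = inp.read(name_len).decode("utf-8", errors="replace")
    (length,) = struct.unpack(">Q", inp.read(8))
    return name, inp.read(length)


def build_stream():
    """Build a stream in which the first record's length does not match."""
    buf = io.BytesIO()
    # The length of the file on disk is written first, but opening the file
    # fails, so only a short error message is written as the contents.
    error = b"Error aggregating log file. (Permission denied)"
    write_entry(buf, "history.txt", 111686, error)
    # Later records are written correctly.
    write_entry(buf, "stderr", 0, b"")
    syslog = b"INFO Created DAGAppMaster"
    write_entry(buf, "syslog", len(syslog), syslog)
    buf.seek(0)
    return buf


def read_all(inp):
    """Read records until the stream is exhausted."""
    entries = []
    while True:
        entry = read_entry(inp)
        if entry is None:
            break
        entries.append(entry)
    return entries


for name, body in read_all(build_stream()):
    print(name, len(body), body[:40])
```

In this sketch the reader returns a single record: the "history.txt" payload holds the error message followed by the raw bytes of the stderr and syslog records, which resembles the run-together text in the excerpt.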

> yarn logs can get corrupted if the aggregator does not have permissions to the log file it tries to read
> --------------------------------------------------------------------------------------------------------
>
>                 Key: YARN-2240
>                 URL: https://issues.apache.org/jira/browse/YARN-2240
>             Project: Hadoop YARN
>          Issue Type: Bug
>    Affects Versions: 2.5.0
>            Reporter: Mit Desai
>
> When the log aggregator aggregates the logs, it writes the file length first. It then tries to open the log file, and if it does not have permission to do so, it ends up writing only an error message to the aggregated logs.
> The mismatch between the recorded file length and the length actually written corrupts the aggregated logs.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
