hadoop-yarn-issues mailing list archives

From "Xuan Gong (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-2724) If an unreadable file is encountered during log aggregation then aggregated file in HDFS badly formed
Date Wed, 22 Oct 2014 21:17:34 GMT

    [ https://issues.apache.org/jira/browse/YARN-2724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14180554#comment-14180554
] 

Xuan Gong commented on YARN-2724:
---------------------------------

As [~mitdesai] mentioned, "the problem here is due to calculation of file length before even
trying to open the file. Log aggregator reads the file length of the log file that is to be
aggregated and records it. Then it tries to go and read the file contents."

For the issue reported by [~sumitmohanty], the cause is a file-permission error: the aggregator cannot read, and therefore cannot aggregate, the log file.
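The root of the formatting problem is that {{File.length()}} only stats the file, so it succeeds and returns the real size even when the file cannot be opened for reading. A small illustrative sketch (not YARN code; if run as root the open may still succeed, so the failure branch is best-effort):

```java
import java.io.*;

public class LengthBeforeOpen {
    public static void main(String[] args) throws Exception {
        File f = File.createTempFile("container-log", ".txt");
        try (FileWriter w = new FileWriter(f)) {
            w.write("some container log contents");
        }
        f.setReadable(false, false);   // simulate the unreadable log file

        // length() only stats the file, so it succeeds even though
        // the file cannot be opened for reading.
        long fileLength = f.length();
        System.out.println("LogLength: " + fileLength);

        try (FileInputStream in = new FileInputStream(f)) {
            System.out.println("opened OK");   // e.g. when running as root
        } catch (IOException e) {
            // This is the path the aggregator hits: the length header has
            // already been recorded, but the contents never follow.
            System.out.println("open failed: " + e.getMessage());
        }
        f.delete();
    }
}
```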

Looking at the code:
{code}
        final long fileLength = logFile.length();
        // Write the logFile Type
        out.writeUTF(logFile.getName());

        // Write the log length as UTF so that it is printable
        out.writeUTF(String.valueOf(fileLength));

        // Write the log itself
        FileInputStream in = null;
        try {
          in = SecureIOUtils.openForRead(logFile, getUser(), null);
          byte[] buf = new byte[65535];
          int len = 0;
          long bytesLeft = fileLength;
          while ((len = in.read(buf)) != -1) {
            //If buffer contents within fileLength, write
            if (len < bytesLeft) {
              out.write(buf, 0, len);
              bytesLeft-=len;
            }
            //else only write contents within fileLength, then exit early
            else {
              out.write(buf, 0, (int)bytesLeft);
              break;
            }
          }
          long newLength = logFile.length();
          if(fileLength < newLength) {
            LOG.warn("Aggregated logs truncated by approximately "+
                (newLength-fileLength) +" bytes.");
          }
          this.uploadedFiles.add(logFile);
        } catch (IOException e) {
          String message = "Error aggregating log file. Log file : "
              + logFile.getAbsolutePath() + e.getMessage();
          LOG.error(message, e);
          out.write(message.getBytes());
        } finally {
          if (in != null) {
            in.close();
          }
        }
{code}
Beyond the permission issue, any other IOException on this path can cause the same problem: the file length is written to the aggregated file before the file is even opened, so when the open or read fails, the error message that gets written does not match the previously recorded length, and the aggregated file ends up badly formed.
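One possible reordering is to open the file before committing the length header, so that a failed open produces a well-formed (name, length, contents) entry whose length describes the error message actually written. This is only a sketch of the idea, not the eventual YARN-2724 patch; {{FileInputStream}} stands in for {{SecureIOUtils.openForRead}}:

```java
import java.io.*;

public class SafeLogEntryWriter {

    // Writes one (name, length, contents) entry; on open failure the
    // error text itself becomes the entry body, with a matching length.
    static void writeEntry(DataOutputStream out, File logFile) throws IOException {
        FileInputStream in;
        try {
            in = new FileInputStream(logFile);  // stand-in for SecureIOUtils.openForRead
        } catch (IOException e) {
            String message = "Error aggregating log file. Log file : "
                + logFile.getAbsolutePath() + ". " + e.getMessage();
            byte[] bytes = message.getBytes();
            out.writeUTF(logFile.getName());
            // The length now describes exactly what is written next.
            out.writeUTF(String.valueOf(bytes.length));
            out.write(bytes);
            return;
        }
        try {
            long fileLength = logFile.length();
            out.writeUTF(logFile.getName());
            out.writeUTF(String.valueOf(fileLength));
            byte[] buf = new byte[65535];
            long bytesLeft = fileLength;
            int len;
            // Never write more than the declared length, even if the
            // file grows while being aggregated.
            while (bytesLeft > 0 && (len = in.read(buf)) != -1) {
                int toWrite = (int) Math.min(len, bytesLeft);
                out.write(buf, 0, toWrite);
                bytesLeft -= toWrite;
            }
        } finally {
            in.close();
        }
    }

    public static void main(String[] args) throws IOException {
        File f = File.createTempFile("demo", ".log");
        try (FileWriter w = new FileWriter(f)) { w.write("hello"); }
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        writeEntry(new DataOutputStream(bos), f);
        System.out.println("entry bytes written: " + bos.size());
        f.delete();
    }
}
```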


> If an unreadable file is encountered during log aggregation then aggregated file in HDFS badly formed
> -----------------------------------------------------------------------------------------------------
>
>                 Key: YARN-2724
>                 URL: https://issues.apache.org/jira/browse/YARN-2724
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: log-aggregation
>    Affects Versions: 2.5.1
>            Reporter: Sumit Mohanty
>            Assignee: Xuan Gong
>
> Look into the log output snippet. It looks like there is an issue during aggregation when an unreadable file is encountered. Likely, this results in bad encoding.
> {noformat}
> LogType: command-13.json
> LogLength: 13934
> Log Contents:
> Error aggregating log file. Log file : /grid/0/yarn/log/application_1413865041660_0002/container_1413865041660_0002_01_000004/command-13.json/grid/0/yarn/log/application_1413865041660_0002/container_1413865041660_0002_01_000004/command-13.json
(Permission denied)command-3.json13983Error aggregating log file. Log file : /grid/0/yarn/log/application_1413865041660_0002/container_1413865041660_0002_01_000004/command-3.json/grid/0/yarn/log/application_1413865041660_0002/contaierrors-13.txt0660_0002_01_000004/command-3.json
(Permission denied)
>               errors-3.txt0gc.log-20141021044514484052014-10-21T04:45:12.046+0000: 5.134:
[GC2014-10-21T04:45:12.046+0000: 5.134: [ParNew: 163840K->15575K(184320K), 0.0488700 secs]
163840K->15575K(1028096K), 0.0492510 secs] [Times: user=0.06 sys=0.01, real=0.05 secs]
> 2014-10-21T04:45:14.939+0000: 8.027: [GC2014-10-21T04:45:14.939+0000: 8.027: [ParNew:
179415K->11865K(184320K), 0.0941310 secs] 179415K->17228K(1028096K), 0.0943140 secs]
[Times: user=0.13 sys=0.04, real=0.09 secs]
> 2014-10-21T04:46:42.099+0000: 95.187: [GC2014-10-21T04:46:42.099+0000: 95.187: [ParNew:
175705K->12802K(184320K), 0.0466420 secs] 181068K->18164K(1028096K), 0.0468490 secs]
[Times: user=0.06 sys=0.00, real=0.04 secs]
> {noformat}
> Specifically, look at the text after the exception text. There should be two more entries for log files, but none exist. This is likely because command-13.json is expected to be of length 13934, but it is not, as the file was never read.
> I think it should have been:
> {noformat}
> LogType: command-13.json
> LogLength: <Length of the exception text>
> Log Contents:
> Error aggregating log file. Log file : /grid/0/yarn/log/application_1413865041660_0002/container_1413865041660_0002_01_000004/command-13.json/grid/0/yarn/log/application_1413865041660_0002/container_1413865041660_0002_01_000004/command-13.json
(Permission denied)command-3.json13983Error aggregating log file. Log file : /grid/0/yarn/log/application_1413865041660_0002/container_1413865041660_0002_01_000004/command-3.json/grid/0/yarn/log/application_1413865041660_0002/contaierrors-13.txt0660_0002_01_000004/command-3.json
(Permission denied)
> {noformat}
> {noformat}
> LogType: errors-3.txt
> LogLength:0
> Log Contents:
> {noformat}
> {noformat}
> LogType:gc.log
> LogLength:???
> Log Contents:
> ......-20141021044514484052014-10-21T04:45:12.046+0000: 5.134: [GC2014-10-21T04:45:12.046+0000:
5.134: [ParNew: 163840K- .......
> {noformat}
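The expected output above assumes the (LogType, LogLength, Contents) framing: a reader trusts the declared length and skips exactly that many bytes before the next entry. A minimal illustration of that framing (names are illustrative, not the actual LogReader API) shows why a length that overstates the written bytes desynchronizes every later entry:

```java
import java.io.*;

public class EntryScanSketch {
    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bos);
        // A well-formed entry followed by an empty one.
        out.writeUTF("command-13.json");
        out.writeUTF("5");               // declared length
        out.write("hello".getBytes());   // exactly 5 bytes of contents
        out.writeUTF("errors-3.txt");
        out.writeUTF("0");

        DataInputStream in = new DataInputStream(
            new ByteArrayInputStream(bos.toByteArray()));
        while (in.available() > 0) {
            String name = in.readUTF();
            long length = Long.parseLong(in.readUTF());
            // The reader skips exactly `length` bytes; if fewer bytes
            // were actually written (as in YARN-2724), this lands in
            // the middle of the next entry and everything after is
            // misparsed.
            in.skipBytes((int) length);
            System.out.println("LogType: " + name + ", LogLength: " + length);
        }
    }
}
```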



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
