hadoop-yarn-issues mailing list archives

From "Ting Dai (JIRA)" <j...@apache.org>
Subject [jira] [Created] (YARN-7179) The log data is corrupted
Date Fri, 08 Sep 2017 22:28:02 GMT
Ting Dai created YARN-7179:
------------------------------

             Summary: The log data is corrupted 
                 Key: YARN-7179
                 URL: https://issues.apache.org/jira/browse/YARN-7179
             Project: Hadoop YARN
          Issue Type: Bug
          Components: log-aggregation
            Reporter: Ting Dai


In hadoop-0.23.0, inside the readAcontainerLogs function, a corrupted valueStream causes the
output written by writer to be corrupted as well.
      while (true) {
        try {
          fileType = valueStream.readUTF();
        } catch (EOFException e) {
          return;
        }
        fileLengthStr = valueStream.readUTF();       // corrupted
        fileLength = Long.parseLong(fileLengthStr);  // 0
        writer.write("\n\nLogType:");
        writer.write(fileType);
        writer.write("\nLogLength:");
        writer.write(fileLengthStr);
        writer.write("\nLog Contents:\n");
        BoundedInputStream bis = new BoundedInputStream(valueStream, fileLength); // empty stream
        InputStreamReader reader = new InputStreamReader(bis); 
        int currentRead = 0;
        int totalRead = 0;
        while ((currentRead = reader.read(cbuf, 0, bufferSize)) != -1) {  // always returns -1
          writer.write(cbuf);
          totalRead += currentRead;
        }
      }
The problem occurs when fileLengthStr is corrupted, in particular when it is "0" while the
valueStream actually still holds content for that file. Because fileLength is 0, bis is an empty
stream and so is reader. The first reader.read call returns -1 immediately, so writer never
writes the log contents, and the content bytes stay unread in valueStream.
On the next iteration, fileType, fileLength, bis and reader are read from the wrong position in
the stream and are therefore corrupted as well.
For example, given a DataInputStream like:
     "text", "0", "This is the content", "16", "Another content"
the writer writes the following log data:
      "LogType:text   LogLength:0  Log Contents:
      LogType:This is the content  LogLength:2  Log Contents:" 
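
The desynchronization described above can be reproduced with a small standalone sketch. It is not the Hadoop code: the class name LogDesyncDemo and the helper render are hypothetical, a plain byte counter stands in for BoundedInputStream, and every field of the example stream is assumed to be written with writeUTF.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.StringWriter;
import java.io.Writer;

public class LogDesyncDemo {

  // Mirrors the loop quoted in the report. Assumption: a manual byte
  // counter replaces BoundedInputStream so the sketch needs only the JDK.
  static void readLogs(DataInputStream valueStream, Writer writer) throws IOException {
    while (true) {
      String fileType;
      try {
        fileType = valueStream.readUTF();
      } catch (EOFException e) {
        return; // end of stream: no more records
      }
      String fileLengthStr = valueStream.readUTF(); // trusted as-is, even if wrong
      long fileLength = Long.parseLong(fileLengthStr);
      writer.write("\n\nLogType:");
      writer.write(fileType);
      writer.write("\nLogLength:");
      writer.write(fileLengthStr);
      writer.write("\nLog Contents:\n");
      // Consume at most fileLength bytes; with fileLength == 0 nothing is
      // consumed, so the real content is left in the stream.
      for (long i = 0; i < fileLength; i++) {
        int b = valueStream.read();
        if (b == -1) {
          break;
        }
        writer.write((char) b);
      }
    }
  }

  // Hypothetical helper: encodes each field with writeUTF and returns what
  // readLogs writes for that stream.
  static String render(String... fields) throws IOException {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    DataOutputStream out = new DataOutputStream(bytes);
    for (String field : fields) {
      out.writeUTF(field);
    }
    StringWriter writer = new StringWriter();
    readLogs(new DataInputStream(new ByteArrayInputStream(bytes.toByteArray())), writer);
    return writer.toString();
  }

  public static void main(String[] args) throws IOException {
    System.out.println(render("text", "0", "This is the content", "16", "Another content"));
  }
}
```

Under these assumptions the second record is labelled with the content string as its LogType, as in the example above. The length field it prints is whatever string comes next in the stream, so it can differ from the value shown in the report depending on how the input is encoded.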





