hadoop-yarn-issues mailing list archives

From "Ting Dai (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (YARN-7179) The log data is corrupted
Date Fri, 08 Sep 2017 22:38:00 GMT

     [ https://issues.apache.org/jira/browse/YARN-7179?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ting Dai updated YARN-7179:
---------------------------
    Description: 
In hadoop-0.23.0, inside the readAcontainerLogs function, a corrupted valueStream causes
the data written to writer to be corrupted as well.

{code:java}
  public static void readAcontainerLogs(DataInputStream valueStream, Writer writer) throws IOException {
      ....
      while (true) {
        try {
          fileType = valueStream.readUTF();
        } catch (EOFException e) {
          return;
        }
        fileLengthStr = valueStream.readUTF();       // corrupted
        fileLength = Long.parseLong(fileLengthStr);  // 0
        writer.write("\n\nLogType:");
        writer.write(fileType);
        writer.write("\nLogLength:");
        writer.write(fileLengthStr);
        writer.write("\nLog Contents:\n");
        BoundedInputStream bis = new BoundedInputStream(valueStream, fileLength); // empty stream
        InputStreamReader reader = new InputStreamReader(bis);
        int currentRead = 0;
        int totalRead = 0;
        while ((currentRead = reader.read(cbuf, 0, bufferSize)) != -1) {  // always returns -1
          writer.write(cbuf);
          totalRead += currentRead;
        }
      }
  }
{code}

     
When fileLengthStr is corrupted, in particular when it is "0" while the valueStream is actually
not empty, bis and reader are empty because fileLength is 0. The empty reader makes
currentRead equal -1 immediately, so writer.write inside the while loop never executes.
During the next iteration, fileType, fileLength, bis, and reader are read from the wrong
position in the stream and are therefore corrupted.
For example, if I have a DataInputStream like:
    {color:#d04437}"text", "0", "This is the content", "2", "Another content". {color}
the writer will write the following log data:
      {color:#d04437}"LogType:text   
      LogLength:0  
      Log Contents:
      LogType:This is the content  
      LogLength:2  
      Log Contents:" {color}
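
The desynchronization described above can be reproduced with a small standalone sketch. It is not the Hadoop code: CorruptedLengthDemo, readLogs, and demo are hypothetical names, the body is read with readFully instead of a BoundedInputStream, and it assumes every element of the example stream was written with writeUTF.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

class CorruptedLengthDemo {

    // Simplified stand-in for readAcontainerLogs: same read order
    // (type, length string, body), but no BoundedInputStream.
    static void readLogs(DataInputStream in, Writer out) throws IOException {
        while (true) {
            String fileType;
            try {
                fileType = in.readUTF();
            } catch (EOFException e) {
                return; // end of stream, or a truncated record read as a type
            }
            String fileLengthStr = in.readUTF();
            long fileLength = Long.parseLong(fileLengthStr);
            out.write("\n\nLogType:" + fileType);
            out.write("\nLogLength:" + fileLengthStr);
            out.write("\nLog Contents:\n");
            // Trust the (possibly corrupted) length, as the original code does.
            byte[] body = new byte[(int) fileLength];
            in.readFully(body);
            out.write(new String(body, StandardCharsets.UTF_8));
        }
    }

    // Builds the example stream from the report and returns what the reader writes.
    static String demo() {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream data = new DataOutputStream(bytes);
            // Assumption: each element is encoded with writeUTF.
            for (String s : new String[] {
                    "text", "0", "This is the content", "2", "Another content"}) {
                data.writeUTF(s);
            }
            StringWriter writer = new StringWriter();
            readLogs(new DataInputStream(
                    new ByteArrayInputStream(bytes.toByteArray())), writer);
            return writer.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Because the first length is "0", no body bytes are consumed, so the second iteration reads "This is the content" as a log type and "2" as its length. The string "Another content" never reaches the writer as log data.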


  was:
In hadoop-0.23.0, inside readAcontainerLogs function, when valueStream is corrupted, then
writer will be corrupted.

{code:java}
  public static void readAcontainerLogs(DataInputStream valueStream, Writer writer) throws IOException {
      ....
      while (true) {
        try {
          fileType = valueStream.readUTF();
        } catch (EOFException e) {
          return;
        }
        fileLengthStr = valueStream.readUTF();       // corrupted
        fileLength = Long.parseLong(fileLengthStr);  // 0
        writer.write("\n\nLogType:");
        writer.write(fileType);
        writer.write("\nLogLength:");
        writer.write(fileLengthStr);
        writer.write("\nLog Contents:\n");
        BoundedInputStream bis = new BoundedInputStream(valueStream, fileLength); // empty stream
        InputStreamReader reader = new InputStreamReader(bis);
        int currentRead = 0;
        int totalRead = 0;
        while ((currentRead = reader.read(cbuf, 0, bufferSize)) != -1) {  // always returns -1
          writer.write(cbuf);
          totalRead += currentRead;
        }
      }
  }
{code}

     
When fileLengthStr is corrupted, in particular when it is "0" while the valueStream is actually
not empty, bis and reader are empty because fileLength is 0. The empty reader makes
currentRead equal -1 immediately, so writer.write inside the while loop never executes.
During the next iteration, fileType, fileLength, bis, and reader are read from the wrong
position in the stream and are therefore corrupted.
For example, if I have a DataInputStream like:
    {color:#d04437}"text", "0", "This is the content", "16", "Another content". {color}
the writer will write the following log data:
      {color:#d04437}"LogType:text   
      LogLength:0  
      Log Contents:
      LogType:This is the content  
      LogLength:2  
      Log Contents:" {color}



> The log data is corrupted 
> --------------------------
>
>                 Key: YARN-7179
>                 URL: https://issues.apache.org/jira/browse/YARN-7179
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: log-aggregation
>            Reporter: Ting Dai
>



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: yarn-issues-help@hadoop.apache.org

