hadoop-mapreduce-issues mailing list archives

From "Hong Tang (JIRA)" <j...@apache.org>
Subject [jira] Commented: (MAPREDUCE-2023) TestDFSIO read test may not read specified bytes.
Date Fri, 20 Aug 2010 09:29:17 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-2023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12900652#action_12900652 ]

Hong Tang commented on MAPREDUCE-2023:
--------------------------------------

The problem is caused by the following code segment:
{code}
  public static class ReadMapper extends IOStatMapper<Long> {

    public ReadMapper() { 
    }

    public Long doIO(Reporter reporter, 
                       String name, 
                       long totalSize // in bytes
                     ) throws IOException {
      // open file
      DataInputStream in = fs.open(new Path(getDataDir(getConf()), name));
      long actualSize = 0;
      try {
        for(int curSize = bufferSize;
                curSize == bufferSize && actualSize < totalSize;) { // <-- HERE
          curSize = in.read(buffer, 0, bufferSize);
          if(curSize < 0) break;
          actualSize += curSize;
          reporter.setStatus("reading " + name + "@" + 
                             actualSize + "/" + totalSize 
                             + " ::host = " + hostName);
        }
      } finally {
        in.close();
      }
      return Long.valueOf(actualSize);
    }
  }
{code}

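The problem is that the for-loop exits as soon as a read returns fewer bytes than the full
buffer, even though the stream has not reached end-of-file. A short read is legal for
InputStream.read() and does not mean the file is exhausted. The fix is simple: drop the
curSize == bufferSize check from the loop condition: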
{code}
        for(int curSize = bufferSize; actualSize < totalSize;) {
{code}
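
To illustrate the difference outside of Hadoop, here is a minimal standalone sketch. It is
not part of TestDFSIO: the class ShortReadDemo, the ChunkedInputStream wrapper, and the
methods readOriginal and readFixed are hypothetical names made up for this example. The
wrapper simulates a stream that never returns more than 100 bytes per read, which is an
assumption chosen to force short reads; the two methods differ only in the loop condition.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

public class ShortReadDemo {

  /** Hypothetical stream that returns at most maxChunk bytes per read call. */
  static class ChunkedInputStream extends FilterInputStream {
    private final int maxChunk;

    ChunkedInputStream(InputStream in, int maxChunk) {
      super(in);
      this.maxChunk = maxChunk;
    }

    @Override
    public int read(byte[] b, int off, int len) throws IOException {
      // Cap the request so the caller always sees a short read.
      return super.read(b, off, Math.min(len, maxChunk));
    }
  }

  /** Loop condition as in the original code: stops after the first short read. */
  static long readOriginal(InputStream in, byte[] buffer, long totalSize)
      throws IOException {
    long actualSize = 0;
    for (int curSize = buffer.length;
         curSize == buffer.length && actualSize < totalSize;) {
      curSize = in.read(buffer, 0, buffer.length);
      if (curSize < 0) break; // end of stream
      actualSize += curSize;
    }
    return actualSize;
  }

  /** Loop condition as in the proposed fix: continues until totalSize or EOF. */
  static long readFixed(InputStream in, byte[] buffer, long totalSize)
      throws IOException {
    long actualSize = 0;
    while (actualSize < totalSize) {
      int curSize = in.read(buffer, 0, buffer.length);
      if (curSize < 0) break; // end of stream
      actualSize += curSize;
    }
    return actualSize;
  }

  public static void main(String[] args) throws IOException {
    byte[] data = new byte[1000];
    byte[] buffer = new byte[256];

    long original = readOriginal(
        new ChunkedInputStream(new ByteArrayInputStream(data), 100),
        buffer, data.length);
    long fixed = readFixed(
        new ChunkedInputStream(new ByteArrayInputStream(data), 100),
        buffer, data.length);

    System.out.println("original loop read " + original + " bytes"); // 100
    System.out.println("fixed loop read " + fixed + " bytes");       // 1000
  }
}
```

With a 256-byte buffer and reads capped at 100 bytes, the original condition gives up after
the first read, while the fixed condition keeps reading until all 1000 bytes are consumed.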

> TestDFSIO read test may not read specified bytes.
> -------------------------------------------------
>
>                 Key: MAPREDUCE-2023
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2023
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: benchmarks
>            Reporter: Hong Tang
>
> TestDFSIO's read test may read fewer bytes than specified when reading large files.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

