hadoop-mapreduce-issues mailing list archives

From "Edwina Lu (JIRA)" <j...@apache.org>
Subject [jira] [Created] (MAPREDUCE-6913) TestDFSIO: error if file size is >= 2G for random read
Date Fri, 14 Jul 2017 21:40:00 GMT
Edwina Lu created MAPREDUCE-6913:

             Summary: TestDFSIO: error if file size is >= 2G for random read
                 Key: MAPREDUCE-6913
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6913
             Project: Hadoop Map/Reduce
          Issue Type: Bug
          Components: benchmarks
    Affects Versions: 2.7.4
            Reporter: Edwina Lu

For the TestDFSIO benchmark, if the test files created are 2 GB or larger:

hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-
TestDFSIO  -Dtest.build.data=/user/edlu/DFSIO-8  -write -nrFiles 1024 -size 2GB

And TestDFSIO is run with options "-read -random":

hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-
TestDFSIO -Dtest.build.data=/user/edlu/DFSIO-8  -read -random -nrFiles 1024 -size 1GB

Then the following error is raised:

17/07/14 21:20:55 INFO mapreduce.Job: Task Id : attempt_1496991431717_9344_m_000226_0, Status
Error: java.lang.IllegalArgumentException: bound must be positive
	at java.util.Random.nextInt(Random.java:388)
	at org.apache.hadoop.fs.TestDFSIO$RandomReadMapper.nextOffset(TestDFSIO.java:615)
	at org.apache.hadoop.fs.TestDFSIO$RandomReadMapper.doIO(TestDFSIO.java:594)
	at org.apache.hadoop.fs.TestDFSIO$RandomReadMapper.doIO(TestDFSIO.java:560)
	at org.apache.hadoop.fs.IOMapperBase.map(IOMapperBase.java:134)
	at org.apache.hadoop.fs.IOMapperBase.map(IOMapperBase.java:37)
	at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
	at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
	at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1746)
	at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)

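The code casts fileSize to int when generating a random offset. A value of 2^31 or more
does not fit in an int, so the cast wraps around (2^31 itself becomes Integer.MIN_VALUE),
and Random.nextInt rejects a non-positive bound. It should generate a random long instead: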

    /**
     * Get next offset for reading.
     * If current < 0 then choose initial offset according to the read type.
     *
     * @param current offset
     * @return
     */
    private long nextOffset(long current) {
      if(skipSize == 0)
        return rnd.nextInt((int)(fileSize));
      if(skipSize > 0)
        return (current < 0) ? 0 : (current + bufferSize + skipSize);
      // skipSize < 0
      return (current < 0) ? Math.max(0, fileSize - bufferSize) :
                             Math.max(0, current + skipSize);
    }
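
One possible shape for the fix is sketched below. It is only an illustration, not the
committed patch: the class NextOffsetSketch and the helper randomOffset are hypothetical
names that do not exist in TestDFSIO. The sketch first shows the overflow caused by the
int cast, then draws an offset in [0, fileSize) from Random.nextLong() instead.

```java
import java.util.Random;

public class NextOffsetSketch {
  // Hypothetical helper (not part of TestDFSIO): random offset in [0, fileSize).
  static long randomOffset(Random rnd, long fileSize) {
    if (fileSize <= 0) {
      throw new IllegalArgumentException("fileSize must be positive");
    }
    // Clear the sign bit so the value is non-negative, then reduce modulo fileSize.
    // The modulo introduces a slight bias, assumed acceptable for a benchmark.
    return (rnd.nextLong() & Long.MAX_VALUE) % fileSize;
  }

  public static void main(String[] args) {
    long fileSize = 2L * 1024 * 1024 * 1024; // 2 GB = 2^31 bytes

    // The narrowing cast wraps around, giving the non-positive bound
    // that Random.nextInt rejects.
    System.out.println((int) fileSize); // -2147483648

    Random rnd = new Random(0);
    long offset = randomOffset(rnd, fileSize);
    System.out.println(offset >= 0 && offset < fileSize); // true
  }
}
```

In nextOffset, the skipSize == 0 branch could then return such a long-based offset rather
than rnd.nextInt((int)(fileSize)).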

