hadoop-mapreduce-dev mailing list archives

From "Ming Ma (JIRA)" <j...@apache.org>
Subject [jira] [Created] (MAPREDUCE-2779) JobSplitWriter.java can't handle large job.split file
Date Fri, 05 Aug 2011 01:34:27 GMT
JobSplitWriter.java can't handle large job.split file

                 Key: MAPREDUCE-2779
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2779
             Project: Hadoop Map/Reduce
          Issue Type: Bug
          Components: job submission
            Reporter: Ming Ma

We use Cascading's MultiInputFormat. MultiInputFormat sometimes generates a large job.split
file, which Hadoop uses internally; the file can sometimes exceed 2 GB.

In JobSplitWriter.java, the code that generates this file uses 32-bit signed integers to
compute offsets into job.split:

        int prevCount = out.size();
        int currCount = out.size();

        long offset = out.size();
        int currLen = out.size();
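
As a rough illustration of why these lines break past 2 GB (a standalone sketch, not the
Hadoop code): java.io.DataOutputStream, which FSDataOutputStream extends, keeps its byte
counter in an int, and size() is documented to stay at Integer.MAX_VALUE once that counter
overflows. The class and method names below (SplitOffsetDemo, DiscardingStream,
writeAndMeasure) are hypothetical and exist only for this example.

```java
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.OutputStream;

public class SplitOffsetDemo {

    /** Hypothetical sink that discards all bytes, so the demo needs no disk or heap. */
    static final class DiscardingStream extends OutputStream {
        @Override public void write(int b) { }
        @Override public void write(byte[] b, int off, int len) { }
    }

    /**
     * Writes totalBytes zero bytes through a DataOutputStream.
     * Returns { value reported by size(), true number of bytes written }.
     */
    static long[] writeAndMeasure(long totalBytes) throws IOException {
        byte[] chunk = new byte[1 << 20]; // 1 MiB per write call
        long written = 0;                 // 64-bit counter kept by the caller
        try (DataOutputStream out = new DataOutputStream(new DiscardingStream())) {
            while (written < totalBytes) {
                int n = (int) Math.min(chunk.length, totalBytes - written);
                out.write(chunk, 0, n);
                written += n;
            }
            // size() returns int; it cannot represent more than Integer.MAX_VALUE
            return new long[] { out.size(), written };
        }
    }

    public static void main(String[] args) throws IOException {
        long[] result = writeAndMeasure(3L << 30); // 3 GiB
        System.out.println("size() reports: " + result[0]);
        System.out.println("bytes written:  " + result[1]);
    }
}
```

Once size() stops advancing, a difference such as currCount - prevCount no longer reflects
the bytes written for a split, so the recorded offsets stop being correct. A 64-bit position,
for example the long returned by FSDataOutputStream.getPos(), does not have this limit.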


