hadoop-mapreduce-issues mailing list archives

From "Hadoop QA (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MAPREDUCE-2779) JobSplitWriter.java can't handle large job.split file
Date Fri, 30 Sep 2011 00:57:45 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-2779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13117797#comment-13117797 ]

Hadoop QA commented on MAPREDUCE-2779:
--------------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12497098/MAPREDUCE-2779-0.22.patch
  against trunk revision .

    +1 @author.  The patch does not contain any @author tags.

    -1 tests included.  The patch doesn't appear to include any new or modified tests.
                        Please justify why no new tests are needed for this patch.
                        Also please list what manual steps were performed to verify this patch.

    -1 patch.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/903//console

This message is automatically generated.
                
> JobSplitWriter.java can't handle large job.split file
> -----------------------------------------------------
>
>                 Key: MAPREDUCE-2779
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2779
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: job submission
>    Affects Versions: 0.20.205.0, 0.22.0, 0.23.0
>            Reporter: Ming Ma
>            Assignee: Ming Ma
>             Fix For: 0.22.0
>
>         Attachments: MAPREDUCE-2779-0.22.patch, MAPREDUCE-2779-trunk.patch
>
>
> We use Cascading's MultiInputFormat. It sometimes generates a large job.split file (used internally by Hadoop), and that file can grow beyond 2GB.
> In JobSplitWriter.java, the functions that generate this file use 32-bit signed integers when computing offsets into job.split:
> writeNewSplits
> ...
>         int prevCount = out.size();
> ...
>         int currCount = out.size();
> writeOldSplits
> ...
>       long offset = out.size();
> ...
>       int currLen = out.size();
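
The int-typed out.size() calls quoted above can be illustrated with plain java.io. This is a minimal sketch, not code from either attached patch: SplitOffsetDemo, CountingSink and writeAndMeasure are hypothetical names, and the sketch assumes that the stream in JobSplitWriter inherits size() from java.io.DataOutputStream. That method returns an int that is pinned at Integer.MAX_VALUE once more than 2GB has been written, so any offset derived from it stops being meaningful. A 64-bit position counter kept alongside it does not have this limit.

```java
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;

public class SplitOffsetDemo {

    /** Hypothetical sink: discards all bytes but keeps a 64-bit count of them. */
    static final class CountingSink extends OutputStream {
        private long position;

        @Override
        public void write(int b) {
            position++;
        }

        @Override
        public void write(byte[] buf, int off, int len) {
            position += len;
        }

        long getPosition() {
            return position;
        }
    }

    /**
     * Writes totalBytes through a DataOutputStream and returns
     * { value of out.size(), true 64-bit position }.
     */
    static long[] writeAndMeasure(long totalBytes) {
        CountingSink sink = new CountingSink();
        DataOutputStream out = new DataOutputStream(sink);
        byte[] chunk = new byte[1 << 20]; // 1 MB per write call
        long remaining = totalBytes;
        try {
            while (remaining > 0) {
                int n = (int) Math.min(chunk.length, remaining);
                out.write(chunk, 0, n);
                remaining -= n;
            }
            out.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new long[] { out.size(), sink.getPosition() };
    }

    public static void main(String[] args) {
        long threeGb = 3L * 1024 * 1024 * 1024;
        long[] result = writeAndMeasure(threeGb);
        // size() is an int counter and cannot report more than Integer.MAX_VALUE
        System.out.println("DataOutputStream.size(): " + result[0]);
        // the long counter reports the real number of bytes written
        System.out.println("64-bit position:         " + result[1]);
    }
}
```

In Hadoop itself, FSDataOutputStream exposes getPos(), which returns a long; whether the attached patches switch to it is not stated in this message, so treat that as one possible direction rather than the actual fix.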

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
