hadoop-mapreduce-issues mailing list archives

From "Chris Nauroth (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MAPREDUCE-2862) Infinite loop in CombineFileInputFormat#getMoreSplits(), with missing blocks
Date Mon, 22 Aug 2011 16:01:30 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-2862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13088783#comment-13088783 ]

Chris Nauroth commented on MAPREDUCE-2862:
------------------------------------------

Sadayuki, thank you for submitting a patch on this.  I've been bitten by this one too.

This patch would log warnings about "corrupted files".  Does this condition really indicate
corruption?  In my experience, it happens when CombineFileInputFormat tries to read newly
written files whose first block has not yet been flushed.  That isn't really corruption, so
I'm concerned that logging warnings about corrupt files would give users the wrong impression
that the cluster is suffering from corruption.

To work around this, I've been running my jobs with a private patch of CombineFileInputFormat
that adds this to the constructor for OneFileInfo:

    // Bail out if the block has no locations.  This guards against an
    // infinite loop in getMoreSplits.  This change is not present in open
    // source Hadoop.
    if (oneblock.length <= 0)
      continue;

That prevents these blocks from ever entering the getMoreSplits logic in the first place.
If you're interested in that approach instead, let me know, and I'll put the patch together.
I'd still need to add a unit test for it too.

Thanks again,
--Chris


> Infinite loop in CombineFileInputFormat#getMoreSplits(), with missing blocks
> ----------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-2862
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2862
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>            Reporter: Kazuki Ohta
>         Attachments: MAPREDUCE-2862-warn-and-ignore-corrupted-blocks.patch
>
>
> Hi, we hit an infinite loop in CombineFileInputFormat#getMoreSplits().
> At first, we lost some blocks through an operational mistake :-(. Then a job tried to read
> these missing blocks, and at that point getMoreSplits() went into an infinite loop.
> Our investigation shows that this List can be empty:
> > https://github.com/apache/hadoop-mapreduce/blob/trunk/src/java/org/apache/hadoop/mapreduce/lib/input/CombineFileInputFormat.java#L363
> The 'for' loop just after that line then does nothing, and the entry is never removed from 'blockToNodes'.
> As a result, this loop never terminates:
> > https://github.com/apache/hadoop-mapreduce/blob/trunk/src/java/org/apache/hadoop/mapreduce/lib/input/CombineFileInputFormat.java#L348
> We're now creating a patch for this problem...
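
The failure mode described above can be illustrated with a small, self-contained sketch.
This is not the real CombineFileInputFormat code: the class SplitLoopSketch, the Block type,
and the methods usableBlocks and drain are hypothetical names made up for illustration. The
sketch assumes a simplified model in which a map named blockToNodes is only ever emptied by
walking the blocks listed under each node, so a block with no hosts can never be removed. The
drain method returns -1 where an unbounded loop would spin forever. The length check mirrors
the guard quoted in the comment above; the extra check for an empty host list is an assumption
added here to cover the missing-block case.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Simplified, hypothetical model of the bookkeeping discussed in this thread.
// It is NOT the actual Hadoop implementation.
class SplitLoopSketch {

    // Hypothetical stand-in for a block and the hosts that store it.
    static final class Block {
        final String name;
        final long length;
        final String[] hosts;

        Block(String name, long length, String[] hosts) {
            this.name = name;
            this.length = length;
            this.hosts = hosts;
        }
    }

    // Guard: keep only blocks that have data and at least one host.
    // The host check is an assumption added for this sketch.
    static List<Block> usableBlocks(List<Block> blocks) {
        List<Block> kept = new ArrayList<>();
        for (Block b : blocks) {
            if (b.length > 0 && b.hosts.length > 0) {
                kept.add(b);
            }
        }
        return kept;
    }

    // Drains blockToNodes by visiting the blocks listed under each node.
    // Returns the number of passes needed, or -1 if a pass removed nothing,
    // which is the point where an unbounded while loop would never exit.
    static int drain(List<Block> blocks) {
        Map<Block, String[]> blockToNodes = new HashMap<>();
        Map<String, List<Block>> nodeToBlocks = new HashMap<>();
        for (Block b : blocks) {
            blockToNodes.put(b, b.hosts);
            for (String host : b.hosts) {
                nodeToBlocks.computeIfAbsent(host, k -> new ArrayList<>()).add(b);
            }
        }

        int passes = 0;
        while (!blockToNodes.isEmpty()) {
            int before = blockToNodes.size();
            for (List<Block> onNode : nodeToBlocks.values()) {
                for (Block b : onNode) {
                    blockToNodes.remove(b);
                }
            }
            passes++;
            if (blockToNodes.size() == before) {
                return -1; // a block with no hosts is stuck in the map
            }
        }
        return passes;
    }
}
```

In this model, calling drain on a list that contains a host-less block reports the stall,
while calling drain(usableBlocks(...)) on the same list completes, because the stuck block
never enters the map.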

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
