hadoop-common-dev mailing list archives

From "Doug Cutting (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-502) Summer buffer overflow exception
Date Tue, 05 Sep 2006 18:51:23 GMT
    [ http://issues.apache.org/jira/browse/HADOOP-502?page=comments#action_12432651 ] 
Doug Cutting commented on HADOOP-502:

To be clear, currently we ignore errors processing checksums (checksum file not found, too
short, timeouts while reading, etc.) so that the checksum system only throws user-visible
exceptions when data is known to be corrupt.  You're proposing we change this so that, if
the checksum file is there, then we may throw user-visible exceptions for errors processing
the checksum data (like unexpected eof).  Is that right, or are you proposing something else?
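The two policies being contrasted above can be sketched in isolation (hypothetical method and variable names; this is not the actual FSDataInputStream code). Under the current behavior every checksum problem is swallowed; under the proposal a missing checksum file still quietly disables checking, but an EOF while reading an existing checksum file propagates:

```java
import java.io.*;

public class ChecksumPolicySketch {
    // Current behavior (sketch): any problem reading the checksum data --
    // missing file, truncation, timeout -- silently disables checksumming.
    static boolean openCurrent(InputStream sums) {
        try {
            new DataInputStream(sums).readInt();  // e.g. read a stored CRC
            return true;                          // checksumming stays on
        } catch (IOException e) {
            return false;                         // all errors swallowed
        }
    }

    // Proposed behavior (sketch): no checksum file still disables checking
    // quietly, but a truncated checksum file becomes a user-visible error.
    static boolean openProposed(InputStream sums) throws EOFException {
        if (sums == null) return false;  // checksum file not found: disable
        try {
            new DataInputStream(sums).readInt();
            return true;
        } catch (EOFException e) {
            throw e;                     // unexpected eof now propagates
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        byte[] truncated = new byte[2];  // too short to hold a 4-byte CRC
        System.out.println(openCurrent(new ByteArrayInputStream(truncated)));
        try {
            openProposed(new ByteArrayInputStream(truncated));
        } catch (EOFException e) {
            System.out.println("EOFException propagated");
        }
    }
}
```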

> Summer buffer overflow exception
> --------------------------------
>                 Key: HADOOP-502
>                 URL: http://issues.apache.org/jira/browse/HADOOP-502
>             Project: Hadoop
>          Issue Type: Bug
>          Components: fs
>    Affects Versions: 0.5.0
>            Reporter: Owen O'Malley
>         Assigned To: Owen O'Malley
>             Fix For: 0.6.0
> The extended error message with the offending values finally paid off and I was able
to get the values that were causing the Summer buffer overflow exception.
> java.lang.RuntimeException: Summer buffer overflow b.len=4096, off=0, summed=512, read=2880,
bytesPerSum=1, inSum=512
>         at org.apache.hadoop.fs.FSDataInputStream$Checker.read(FSDataInputStream.java:100)
>         at org.apache.hadoop.fs.FSDataInputStream$PositionCache.read(FSDataInputStream.java:170)
>         at java.io.BufferedInputStream.read1(BufferedInputStream.java:254)
>         at java.io.BufferedInputStream.read(BufferedInputStream.java:313)
>         at java.io.DataInputStream.read(DataInputStream.java:80)
>         at org.apache.hadoop.util.CopyFiles$DFSCopyFilesMapper.copy(CopyFiles.java:190)
>         at org.apache.hadoop.util.CopyFiles$DFSCopyFilesMapper.map(CopyFiles.java:391)
>         at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:46)
>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:196)
>         at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:1075)
> Caused by: java.lang.ArrayIndexOutOfBoundsException
>         at java.util.zip.CRC32.update(CRC32.java:43)
>         at org.apache.hadoop.fs.FSDataInputStream$Checker.read(FSDataInputStream.java:98)
>         ... 9 more
> Tracking through the code, what happens is that inside FSDataInputStream.Checker.read(),
verifySum gets an EOFException and turns off the summing. Among other things this sets
bytesPerSum to 1. Unfortunately, that leads to the ArrayIndexOutOfBoundsException.
> I think the problem is that the original EOFException was logged and ignored. I propose
that we allow the original EOF to propagate back to the caller. (So that file not found will
still disable the checksum checking, but we will detect truncated checksum files.)
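The arithmetic failure described above can be reproduced in isolation (a minimal sketch using the values from the error message; this is not the actual Checker code). Once bytesPerSum is reset to 1 while inSum is still 512, the bytes-remaining-in-chunk computation goes negative, and CRC32.update rejects the negative length with the ArrayIndexOutOfBoundsException seen in the trace:

```java
import java.util.zip.CRC32;

public class SummerOverflowSketch {
    public static void main(String[] args) {
        byte[] b = new byte[4096];   // b.len=4096 from the error message
        int off = 0;                 // off=0
        int inSum = 512;             // bytes already folded into current sum
        int bytesPerSum = 1;         // reset to 1 after the swallowed EOFException

        // Bytes left in the current checksum chunk: 1 - 512 = -511
        int chunkRemaining = bytesPerSum - inSum;

        CRC32 sum = new CRC32();
        try {
            sum.update(b, off, chunkRemaining);  // negative length
        } catch (ArrayIndexOutOfBoundsException e) {
            // CRC32.update bounds-checks off and len before summing,
            // so a negative length throws, matching the stack trace
            System.out.println("ArrayIndexOutOfBoundsException");
        }
    }
}
```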

This message is automatically generated by JIRA.
If you think it was sent incorrectly contact one of the administrators: http://issues.apache.org/jira/secure/Administrators.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

